kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-02-22 14:54:23 +00:00

Author	SHA1	Message	Date
Fabiano Fidêncio	4f24b4fc9e	versions: nvidia: Bump kernel to the latest LTS Now that we have decoupled the rootfs and kernel builds, doing the bump becomes trivial. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-14 14:55:20 +01:00
Fabiano Fidêncio	6b2a3fab8e	workflows: nvidia: Adjust to kernel / rootfs build decouple We don't need to store the kernel headers anymore. We do need to store the kernel modules, instead. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-14 14:55:20 +01:00
Zvonko Kaiser	451dcb289a	kernel: bump kata_config_version We have kernel build changes, so bump the config version. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-01-14 14:55:20 +01:00
Zvonko Kaiser	34cde2637d	gpu: build_image.sh use versions.yaml We've done some bad file based driver determination, now with versions.yaml there is a single source of truth. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-01-14 14:55:20 +01:00
Zvonko Kaiser	664a3af02b	gpu: nvidia_chroot.sh update decoupling Remove all the driver build instructions, since those are now done in the kernel target. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-01-14 14:55:20 +01:00
Zvonko Kaiser	e9bb43ef01	gpu: deploy modules for kernel build We need to package the built modules so that the rootfs can consume them. We package the whole /lib/modules/$(uname -r) directory strip=2. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-01-14 14:55:20 +01:00
Zvonko Kaiser	d4962bafac	gpu: Add NVIDIA modules to build-kernel.sh Checkout and build the kernel modules along with the kernel to avoid the kernel rootfs dependency. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-01-14 14:30:31 +01:00
Zvonko Kaiser	c42f7501fd	gpu: Remove building of Headers Since we build along the kernel we do not need to carry over the headers to the rootfs build. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-01-14 14:30:31 +01:00
Zvonko Kaiser	a00ebab8ad	gpu: versions.yaml nvidia driver pinning We want to have deterministic behaviour and only one valid driver version acceptable via versions.yaml Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-01-14 14:30:31 +01:00
Zvonko Kaiser	1f6cfb11b0	kernel: bugfix install yq We never actually installed yq in the kernel build. Some paths use yq but were never hit; for the GPU use case we need to read values from versions.yaml. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-01-14 14:30:31 +01:00
stevenhorsman	70e3e2b0c9	genpolicy: Bump openssl-src This is a vulnerability (CVE-2025-9230) in openssl, so move to 3.5.4 which has a fix for this Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-01-14 14:05:48 +01:00
stevenhorsman	aace7a7336	versions: Bump openssl-src This is a vulnerability (CVE-2025-9230) in openssl, so move to 3.5.4 which has a fix for this Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-01-14 14:05:48 +01:00
Fabiano Fidêncio	2acb94ef2d	arm64: Do not use DAX with the rootfs image Kernel 6.18.x has an issue with DAX, which is not yet fixed upstream: ``` [ 0.737679] EXT4-fs (pmem0p1): mounted filesystem 79676804-7c8b-491a-b2a6-9bae3c72af70 ro with ordered data mode. Quota mode: disabled. [ 0.737891] VFS: Mounted root (ext4 filesystem) readonly on device 259:1. [ 0.739119] devtmpfs: mounted [ 0.739476] Freeing unused kernel memory: 1920K [ 0.740156] Run /sbin/init as init process [ 0.740229] with arguments: [ 0.740286] /sbin/init [ 0.740321] with environment: [ 0.740369] HOME=/ [ 0.740400] TERM=linux [ 0.743162] Unable to handle kernel paging request at virtual address fffffdffbf000008 [ 0.743285] Mem abort info: [ 0.743316] ESR = 0x0000000096000006 [ 0.743371] EC = 0x25: DABT (current EL), IL = 32 bits [ 0.743444] SET = 0, FnV = 0 [ 0.743489] EA = 0, S1PTW = 0 [ 0.743545] FSC = 0x06: level 2 translation fault [ 0.743610] Data abort info: [ 0.743656] ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000 [ 0.743720] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 0.743785] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 0.743848] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000b9d17000 [ 0.743931] [fffffdffbf000008] pgd=10000000bfa3d403, p4d=10000000bfa3d403, pud=1000000040bfe403, pmd=0000000000000000 [ 0.744070] Internal error: Oops: 0000000096000006 [#1] SMP [ 0.748888] CPU: 0 UID: 0 PID: 1 Comm: init Not tainted 6.18.4 #1 NONE [ 0.749421] pstate: 004000c5 (nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 0.749969] pc : dax_disassociate_entry.constprop.0+0x20/0x50 [ 0.750444] lr : dax_insert_entry+0xcc/0x408 [ 0.750802] sp : ffff80008000b9e0 [ 0.751083] x29: ffff80008000b9e0 x28: 0000000000000000 x27: 0000000000000000 [ 0.751682] x26: 0000000001963d01 x25: ffff0000004f7d90 x24: 0000000000000000 [ 0.752264] x23: 0000000000000000 x22: ffff80008000bcc8 x21: 0000000000000011 [ 0.752836] x20: ffff80008000ba90 x19: 0000000001963d01 x18: 0000000000000000 [ 0.753407] x17: 0000000000000000 x16: 
0000000000000000 x15: 0000000000000000 [ 0.753970] x14: ffffbf3154b9ae70 x13: 0000000000000000 x12: ffffbf3154b9ae70 [ 0.754548] x11: ffffffffffffffff x10: 0000000000000000 x9 : 0000000000000000 [ 0.755122] x8 : 000000000000000d x7 : 000000000000001f x6 : 0000000000000000 [ 0.755707] x5 : 0000000000000000 x4 : 0000000000000000 x3 : fffffdffc0000000 [ 0.756287] x2 : 0000000000000008 x1 : 0000000040000000 x0 : fffffdffbf000000 [ 0.756871] Call trace: [ 0.757107] dax_disassociate_entry.constprop.0+0x20/0x50 (P) [ 0.757592] dax_iomap_pte_fault+0x4fc/0x808 [ 0.757951] dax_iomap_fault+0x28/0x30 [ 0.758258] ext4_dax_huge_fault+0x80/0x2dc [ 0.758594] ext4_dax_fault+0x10/0x3c [ 0.758892] __do_fault+0x38/0x12c [ 0.759175] __handle_mm_fault+0x530/0xcf0 [ 0.759518] handle_mm_fault+0xe4/0x230 [ 0.759833] do_page_fault+0x17c/0x4dc [ 0.760144] do_translation_fault+0x30/0x38 [ 0.760483] do_mem_abort+0x40/0x8c [ 0.760771] el0_ia+0x4c/0x170 [ 0.761032] el0t_64_sync_handler+0xd8/0xdc [ 0.761371] el0t_64_sync+0x168/0x16c [ 0.761677] Code: f9453021 f2dfbfe3 cb813080 8b001860 (f9400401) [ 0.762168] ---[ end trace 0000000000000000 ]--- [ 0.762550] note: init[1] exited with irqs disabled [ 0.762631] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ``` For now, we limit the rootfs that we ship to ARM64 to not use DAX, in the future we'll re-enable it as soon as the patch lands on mainstream kernel. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-14 11:46:40 +01:00
Fabiano Fidêncio	3ef99f4ee3	versions: Add specific nvidia kernel version This is needed as the 580 driver doesn't build against 6.18.x, and the 590 driver is not yet fully working for our case, thus we stick to the previous version that worked before. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-14 11:46:40 +01:00
Fabiano Fidêncio	cce5d4abf6	kernel: bump to v6.18.x (LTS) Bump both the kernel and kernel-confidential versions from v6.12.x and v6.16.x to v6.18.4, aligning with the new LTS release. Kernel 6.18 introduced several configuration changes that required updates to our kernel config fragments: * CRYPTO_FIPS dependencies changed: - In 6.12: depended on !CRYPTO_MANAGER_DISABLE_TESTS - In 6.18: now depends on CRYPTO_SELFTESTS (which requires EXPERT) Added CONFIG_EXPERT=y and CONFIG_CRYPTO_SELFTESTS=y to crypto.conf to satisfy the new dependency chain. * CONFIG_EXPERT is a naughty one, as it disables / enables a bunch of things behind one's back, probably just to prove a point that it is for experts ;-) ... regardless, a fair number of options had to be re-added in order to make sure nothing ends up broken. * Legacy iptables support: Kernel 6.18 requires explicit legacy xtables/iptables configs for IP_NF_* options. Added CONFIG_NETFILTER_XTABLES_LEGACY, CONFIG_IP_NF_IPTABLES_LEGACY, and CONFIG_IP6_NF_IPTABLES_LEGACY to netfilter.conf. * Module signing dependencies: Added CONFIG_MODULES=y and other required dependencies to module_signing.conf to ensure MODULE_SIG can be properly enabled. * Whitelist updates: - Added CONFIG_NF_CT_PROTO_DCCP (removed in 6.18+) - Added CONFIG_CRYPTO_SELFTESTS, CONFIG_NETFILTER_XTABLES_LEGACY, CONFIG_IP_NF_IPTABLES_LEGACY, CONFIG_IP6_NF_IPTABLES_LEGACY (added in 6.18+, not present in older kernels like 6.12) Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-14 11:46:40 +01:00
LandonTClipp	94fde1356c	docs: Add Zensical Doc Site Generation This commit adds a Github workflow for building a Github Pages site for the markdown files in the docs/ directory. Zensical is a new markdown-based static site generation framework built by the creators of Material for Mkdocs. https://zensical.org/ This commit does not clean the doc structure, so site navigation is initially going to be messy. Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>	2026-01-13 12:42:02 +01:00
dependabot[bot]	3377d729ea	build(deps): bump rsa from 0.9.6 to 0.9.9 in /src/tools/agent-ctl Bumps [rsa](https://github.com/RustCrypto/RSA) from 0.9.6 to 0.9.9. - [Changelog](https://github.com/RustCrypto/RSA/blob/v0.9.9/CHANGELOG.md) - [Commits](https://github.com/RustCrypto/RSA/compare/v0.9.6...v0.9.9) --- updated-dependencies: - dependency-name: rsa dependency-version: 0.9.9 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2026-01-13 04:08:40 +01:00
Fupan Li	1f1a000608	Merge pull request #12291 from Apokleos/bump-qapi runtime-rs: Bump qapi-rs from 0.14 to 0.15	2026-01-13 10:39:41 +08:00
Manuel Huber	9e30283952	runtime: nvidia: change kernel parameters Remove the agent hotplug timeout parameter from the kernel command line. Having shifted to VFIO cold-plug, this parameter is no longer needed. Remove the no longer required parameter for TDX and thus align the SNP and TDX configurations. Add a parameter to prevent the kernel from mounting the /dev tmpfs. NVRC and, later on, kata-agent attempt this mount themselves. While kata-agent does not panic when mounting /dev fails, NVRC makes mounting /dev a hard requirement. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-01-12 16:11:28 -08:00
dependabot[bot]	bcadb9b231	build(deps): bump sequoia-openpgp in /src/tools/agent-ctl Bumps [sequoia-openpgp](https://gitlab.com/sequoia-pgp/sequoia) from 2.0.0 to 2.1.0. - [Commits](https://gitlab.com/sequoia-pgp/sequoia/compare/openpgp/v2.0.0...openpgp/v2.1.0) --- updated-dependencies: - dependency-name: sequoia-openpgp dependency-version: 2.1.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2026-01-12 22:16:51 +01:00
Alex Lyn	fba92880c9	tests: make set_container_command idempotent and add debug output set_container_command() previously appended command arguments one-by-one with '.command += [...]'. This makes the helper non-idempotent and can lead to unexpected command arrays when invoked multiple times. Update the helper to set the full command array in a single yq v4 expression and print the target YAML path plus the command being applied to simplify debugging when tests fail. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-01-12 17:56:28 +01:00
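The difference between appending to a command array and assigning it in one step can be sketched as follows. This is only an illustration in Rust: the actual helper is a shell function built on yq, and `ContainerSpec`, `append_command`, and `set_command` are hypothetical names that do not appear in the repository.

```rust
/// Hypothetical stand-in for a container spec; only the command array matters here.
#[derive(Debug, Default, PartialEq)]
pub struct ContainerSpec {
    pub command: Vec<String>,
}

/// Non-idempotent: every call appends, so calling it twice duplicates the arguments.
pub fn append_command(spec: &mut ContainerSpec, args: &[&str]) {
    spec.command.extend(args.iter().map(|a| a.to_string()));
}

/// Idempotent: every call replaces the whole array, so repeated calls converge
/// on the same result.
pub fn set_command(spec: &mut ContainerSpec, args: &[&str]) {
    spec.command = args.iter().map(|a| a.to_string()).collect();
}
```

Calling `append_command` twice with the same two arguments leaves four entries in `command`, while calling `set_command` twice leaves two.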
Alex Lyn	38296a41b2	tests: Generate pod config with stable .yaml suffix The pod config file created by new_pod_config() was generated via mktemp using the template "pod-config.yaml.in.XXX", which produces filenames that do not end with ".yaml" (e.g. pod-config.yaml.in.ABC). If the random suffix happens to look like a different extension, such as ".Csv" or ".Xml", the subsequent yq operations will fail. Some helpers and tooling assume the config path ends with ".yaml". Switch the mktemp template to place the random suffix before the extension so the returned path always ends with ".yaml". Fixes: #12268, #12319 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-01-12 17:56:28 +01:00
Fabiano Fidêncio	9fec31f400	tools: kubectl: Add kubectl version as a tag This is a suggestion from Choi, so we can easily test with a specific kubectl version and also easily understand which kubectl version is being used in case of failure. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-12 15:48:44 +01:00
Fabiano Fidêncio	26dfcb627b	tools: Build kubectl image This image will be used by our helm charts to verify that a kata-containers deployment is correct. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-12 15:48:44 +01:00
Alex Lyn	d03eccf567	runtime-rs: Improve wait_for_migration to avoid fixed sleep Enhance the wait_for_migration implementation to reliably wait for QEMU migration completion and avoid the previous `sleep(280ms)` delay. (1) Add an initial fast-path query to return immediately if migration is already completed/failed/cancelled. (2) Use a hard deadline to enforce timeouts deterministically. (3) Implement adaptive polling with backoff and a maximum interval to reduce QMP load while keeping responsiveness. (4) Unify migration status handling and return clear errors on failed/cancelled states. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-01-12 20:06:55 +08:00
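The four points above (fast path, hard deadline, adaptive polling with a capped backoff, unified status handling) can be sketched roughly as below. This is not the runtime-rs implementation: it is synchronous, takes the status query as a closure instead of talking to QMP, and the `MigrationStatus` enum, the 1 ms starting interval, and the 50 ms cap are all assumptions made for illustration.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical migration states; the real values come from QMP `query-migrate`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MigrationStatus {
    Active,
    Completed,
    Failed,
    Cancelled,
}

/// Polls `query` until it reports a terminal state or `timeout` elapses.
pub fn wait_for_migration<F>(mut query: F, timeout: Duration) -> Result<(), String>
where
    F: FnMut() -> MigrationStatus,
{
    let deadline = Instant::now() + timeout; // hard deadline, fixed up front
    let max_interval = Duration::from_millis(50); // assumed backoff cap
    let mut interval = Duration::from_millis(1); // assumed starting interval

    loop {
        // The first iteration doubles as the fast path: no sleep before the first query.
        match query() {
            MigrationStatus::Completed => return Ok(()),
            MigrationStatus::Failed => return Err("migration failed".to_string()),
            MigrationStatus::Cancelled => return Err("migration cancelled".to_string()),
            MigrationStatus::Active => {}
        }

        let now = Instant::now();
        if now >= deadline {
            return Err("migration timed out".to_string());
        }

        // Never sleep past the deadline, then double the interval up to the cap.
        sleep(interval.min(deadline - now));
        interval = (interval * 2).min(max_interval);
    }
}
```

A caller would pass a closure that issues the status query, for example `wait_for_migration(|| poll_status(), Duration::from_secs(5))`, where `poll_status` is a placeholder for whatever performs the actual query.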
Alex Lyn	5026b33455	runtime-rs: Introduce a method to detect current migrate info Return information about the current migration process. The input and output are as follows: { 'command': 'query-migrate', 'returns': 'MigrationInfo' } Note that this QEMU API is only available with qapi-rs v0.15+. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-01-12 20:06:55 +08:00
Alex Lyn	c472b5db54	runtime-rs: Bump qapi-rs from 0.14 to 0.15 The updated versions are as follows: ``` qapi = { version = "0.15", features = ["qmp", "async-tokio-all"] } qapi-spec = "0.3.2" qapi-qmp = "0.15.0" ``` The bump also corrects some corresponding structures. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-01-12 20:06:55 +08:00
Manuel Huber	183507beeb	agent: change secure_storage_integrity default Change the secure_storage_integrity option's default value to true. With this, integrity protection for encrypted block device contents will be requested from the confidential data hub by default, see the agent's cdh_handler_trusted_storage function in rpc.rs. This behavior can be disabled by explicitly setting the agent.secure_storage_integrity parameter to 0 or false via kernel command line parameters. This will affect the trusted storage implementation for the guest-pull mechanism, and it will affect future implementations using this code path, such as implementations for ephemeral secure storage. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-01-10 16:54:03 +01:00
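The default-on behaviour with an explicit opt-out via the kernel command line could look roughly like the sketch below. Only the parameter name `agent.secure_storage_integrity` and the disabling values `0` and `false` come from the commit message; the function name, the acceptance of `1` and `true`, and the handling of unrecognised values are assumptions, and the agent's real parser may behave differently.

```rust
/// Kernel command-line key named in the commit message.
const KEY: &str = "agent.secure_storage_integrity";

/// Returns the effective setting for a given kernel command line.
/// Assumption: "0"/"false" disable, "1"/"true" enable, and any other value
/// leaves the default (true) in place.
pub fn secure_storage_integrity(cmdline: &str) -> bool {
    let mut enabled = true; // default after this commit
    for param in cmdline.split_whitespace() {
        if let Some((key, value)) = param.split_once('=') {
            if key == KEY {
                match value {
                    "0" | "false" => enabled = false,
                    "1" | "true" => enabled = true,
                    _ => {} // unrecognised value: keep the current setting
                }
            }
        }
    }
    enabled
}
```

With this sketch, a command line that does not mention the parameter yields `true`, and one containing `agent.secure_storage_integrity=0` yields `false`.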
stevenhorsman	a0d96256f5	packaging: Fix tools permissions issue In some builds we are seeing: ``` error: could not create temp file /opt/rustup/tmp/r2xu46kwuyc7k2kr_file: Permission denied (os error 13) ``` in the agent-ctl build, so try and port a fix from #12313 to the tools build to try and resolve this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-01-09 21:45:26 +01:00
Federico A. Corazza	787768fe9b	kata-deploy: Fix extraction of the containerd major version Fixes deploying kata-containers using k3s. The deploy script fails with /opt/kata-artifacts/scripts/kata-deploy.sh: line 397: [: too many arguments Signed-off-by: Federico A. Corazza <git@facorazza.com>	2026-01-09 19:52:18 +01:00
stevenhorsman	5067ed7d9a	versions.yaml: Fix formatting errors yamllint complains that there is only one space before the comment, so add a second to prevent this annoying message showing up. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-01-09 19:36:31 +01:00
stevenhorsman	a850f66fc4	versions: Bump rust to 1.89 Following the agreed toolchain policy - bump rust to the current (1.91)-2 releases. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-01-09 19:36:31 +01:00
Manuel Huber	df2896c298	docs: Create NVIDIA GPU passthrough QEMU scenario Create a new page for a reference implementation for Kubernetes using QEMU, the go shim and an NVIDIA rootfs. The new page contains information on: - components involved in the NVIDIA (TEE) GPU scenario - orchestration flow for GPU passthrough scenarios - deployment guidance Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-01-09 19:02:56 +01:00
Manuel Huber	43627805f4	docs: Improve structure and flow of NVIDIA guide - Apply a few structural/grouping changes and improve flow - Group build sections together - Move usage examples to last section Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-01-09 19:02:56 +01:00
Steve Horsman	489deaad17	Merge pull request #12297 from manuelh-dev/mahuber/fix-doc docs: Fix trusted-image-storage reference	2026-01-09 15:22:25 +00:00
Hyounggyu Choi	2962e14c10	virtiofsd: fix RUSTUP_HOME and CARGO_HOME permissions for non-root builds The following error was observed during virtiofsd static build: ``` error: could not create temp file /opt/rustup/tmp/p44enysfaxwdbvw4_file: Permission denied (os error 13) ``` This occurs because RUSTUP_HOME and CARGO_HOME were initialized by the root user during `docker build`, but `cargo build` is executed as a non-root user via 'docker run --user'. Ensure these directories are writable by adjusting the permission after the toolchain installation is complete. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2026-01-09 14:01:20 +01:00
Manuel Huber	65aa99f291	docs: Fix trusted-image-storage reference The sample uses a volume device name which does not exist, hence fix. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-01-09 11:41:18 +00:00
Saul Paredes	02979a13e3	Merge pull request #12208 from romoh/patch-1 ci: Update AKS setup post Pod Sandboxing GA	2026-01-08 11:02:05 -08:00
Fabiano Fidêncio	f8318c0542	kata-deploy: Remove unused dependency We're depending solely on toml_edit, thus we can safely remove the toml dependency. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-08 18:58:11 +01:00
Fupan Li	b3546f3a68	Merge pull request #12282 from kata-containers/set-required-ci Set several tests as required ci	2026-01-08 20:34:39 +08:00
Mikko Ylinen	cc6277b735	Revert "tdx: Update GPU config for the latest TDX stack" Prefer the "full feature TDVF" instead of the generic OVMF build. See Option-B in https://github.com/tianocore/edk2/tree/master/OvmfPkg/IntelTdx#configurations-and-features for the extra hardening supported. FIRMWAREPATH_NV also seems to be TDX specific unlike the Makefile suggests. Therefore, it can be dropped completely. This reverts commit `66ccc25724`.	2026-01-08 10:21:47 +01:00
Mikko Ylinen	e02e226431	packaging: build OVMF for Intel TDX again OVMF build for Intel TDX (aka "TDVF") was disabled in favor of Ubuntu/ CentOS pre-upstream releases of Intel TDX. See `4292c4c3b1`. It's time to re-enable the build and move runtime configurations to use it (the latter will be done in a later commit). This is a partial revert of `4292c4c3b` with the following changes: - Stop calling OVMF for Intel TDX "TDVF" and follow the naming distros use for TDX enabled build: OVMF.inteltdx.fd. - Single binary OVMF.inteltdx.fd is supported using -bios QEMU param. - Secure Boot infrastructure is disabled since Kata does not support it. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2026-01-08 10:21:47 +01:00
Alex Lyn	f3d92a8b4a	dragonball: Fix UT failed in test_fs_manipulate_backend_fs Improve the checking logic for source path existing. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-01-08 12:42:00 +08:00
Alex Lyn	7de968b416	dragonball: Fix warning of unused method This method is in fact called, so just add the `#[allow(dead_code)]` attribute to let the UT pass. The warning looks like this: warning: method `send_message_with_payload` is never used \| 224 \| impl<R: Req> Endpoint<R> { \| ------------------------ method in this implementation ... 522 \| pub fn send_message_with_payload<T: Sized, P: Sized>( \| ^^^^^^^^^^^^^^^^^^^^^^^^^ \| = note: `#[warn(dead_code)]` on by default Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-01-08 11:01:34 +08:00
Alex Lyn	36d3d7c3bf	dragonball: Fix warnings of result to be handled warning: unused `std::result::Result` that must be used --> src/dragonball/dbs_virtio_devices/src/vhost/vhost_user/net.rs:679:9 \| 679 \| / VirtioDevice::<Arc<GuestMemoryMmap<()>>, QueueSync, GuestRegionMmap>::write_config( 680 \| \| &mut dev, 0, &config, 681 \| \| ); \| \|_________^ \| = note: this `Result` may be an `Err` variant, which should be handled = note: `#[warn(unused_must_use)]` on by default help: use `let _ = ...` to ignore the resulting value \| 679 \| let _ = VirtioDevice::<Arc<GuestMemoryMmap<()>>, QueueSync, GuestRegionMmap>::write_config( \| +++++++ warning: unused `std::result::Result` that must be used --> src/dragonball/dbs_virtio_devices/src/vhost/vhost_user/net.rs:683:9 \| 683 \| / VirtioDevice::<Arc<GuestMemoryMmap<()>>, QueueSync, GuestRegionMmap>::read_config( 684 \| \| &mut dev, 0, &mut data, 685 \| \| ); \| \|_________^ \| = note: this `Result` may be an `Err` variant, which should be handled help: use `let _ = ...` to ignore the resulting value \| 683 \| let _ = VirtioDevice::<Arc<GuestMemoryMmap<()>>, QueueSync, GuestRegionMmap>::read_config( \| +++++++ Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-01-08 10:52:19 +08:00
Alex Lyn	6a1b25a4b0	dragonball: Fix warning of variable does not need to be mutable The warning looks like this: ... warning: variable does not need to be mutable --> src/dragonball/dbs_virtio_devices/src/vsock/csm/txbuf.rs:217:13 \| 217 \| let mut tmp: Vec<u8> = vec![0; TxBuf::SIZE - 2]; \| ----^^^ \| \| \| help: remove this `mut` \| = note: `#[warn(unused_mut)]` on by default ... Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-01-08 10:44:25 +08:00
Alex Lyn	064271b9cb	dragonball: Fix unexpected `cfg` condition of test-resources Fix the warnings about the unexpected cfg value test-resources. The detailed warning message is shown below: ... warning: unexpected `cfg` condition value: `test-resources` --> src/dragonball/dbs_virtio_devices/src/fs/device.rs:973:11 \| 973 \| #[cfg(feature = "test-resources")] \| ^^^^^^^^^^^^^^^^^^^^^^^^^^ \| = note: expected values for `feature` are: `fuse-backend-rs`, `vhost`, `vhost-net`, `vhost-rs`, `vhost-user`, `vhost-user-blk`, `vhost-user-fs`, `vhost-user-net`, `virtio-balloon`, `virtio-blk`, `virtio-fs`, `virtio-fs-pro`, `virtio-mem`, `virtio-mmio`, `virtio-net`, and `virtio-vsock` = help: consider adding `test-resources` as a feature in `Cargo.toml` ... Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-01-08 10:39:33 +08:00
Alex Lyn	ef36c47ca4	runtime-rs: Fix deprecated method in UT Remove into_path() and replace it with keep(). Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-01-08 10:32:31 +08:00
Alex Lyn	e4451baa84	tests: Set run-nerdctl-tests with qemu-runtime-rs required run-nerdctl-tests (qemu-runtime-rs) Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-01-08 09:56:50 +08:00
Alex Lyn	56a21c33a3	tests: Set stability tests with qemu-runtime-rs required run-containerd-stability (active, qemu-runtime-rs) run-containerd-stability (lts, qemu-runtime-rs) Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-01-08 09:56:50 +08:00
Alex Lyn	679e31d884	tests: Set run-nydus CIs as required run-basic-amd64-tests / run-nydus Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-01-08 09:56:50 +08:00
Fabiano Fidêncio	6b3953dd51	tests: k8s: liveness-probes: Adjust events grep Till k8s 1.34 we could grep by "Started containerd". From k8s 1.35 onwards the event message changed and we should, instead, grep by "Container started". Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-07 23:01:59 +01:00
Fabiano Fidêncio	c4194538e2	versions: Bump QEMU to v10.2.0 QEMU v10.2.0 was released on December 24th, 2025. The experimental GPU SNP / TDX are also pointing to v10.2.0 release with their gpu-{snp,tdx}-20260107 branch. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-07 12:30:55 +01:00
Steve Horsman	93ad6fde75	Merge pull request #12294 from stevenhorsman/remediate-RUSTSEC-2021-0064 versions: Bump sha2 crate version	2026-01-07 09:53:26 +00:00
stevenhorsman	c456b84537	versions: Bump sha2 crate version sha2 0.9.3 includes the use of cpuid-bool, which was renamed to cpufeatures around 5 years ago. Try moving to a workspace dependency of sha2 and bumping to the latest version to remediate RUSTSEC-2021-0064 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-01-06 15:41:34 +00:00
Roaa Sakr	44c79cf14a	ci: Update AKS setup post Pod Sandboxing GA Update workload-runtime value to align with current AKS Pod Sandboxing documentation post GA. Signed-off-by: Roaa Sakr <romoh@microsoft.com>	2026-01-05 13:47:33 -08:00
Steve Horsman	9463dd970e	Merge pull request #12287 from mythi/drop-qat use-cases: drop Intel QuickAssist instructions	2026-01-05 13:28:16 +00:00
Mikko Ylinen	99bc0f49cc	use-cases: drop Intel QuickAssist instructions While the use-case of Intel QuickAssist (QAT) accelerated crypto and/or compression with k8s and Kata Containers is still valid, the setup instructions are outdated: Starting with Intel Xeon Gen4 (Sapphire Rapids), the QAT driver stack moved to in-tree drivers without a separate SR-IOV VF driver. Drop all the setup instructions but keep the use-cases doc for reference. Users wanting to enable the use-case should consult the authors of the Intel QAT Device plugins or the Intel QAT DRA driver. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2026-01-02 12:14:04 +02:00
Fupan Li	b27a80b800	Merge pull request #12156 from Apokleos/required-coco-dev-rs tests: Make the tests coco-dev job with coco-dev-runtime-rs required	2025-12-25 17:30:40 +08:00
Steve Horsman	bdc5f7d4be	Merge pull request #12271 from stevenhorsman/bump-rust-to-1.88 Bump rust to 1.88	2025-12-23 21:38:42 +00:00
Alex Lyn	0b1a5c6e93	tests: Make the tests coco-dev job with coco-dev-runtime-rs required The nontee job (run-k8s-tests-coco-nontee) for qemu-coco-dev-runtime-rs is running well and it's time to make it required when the CI runs. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-23 09:54:52 +08:00
stevenhorsman	b6108a7c4a	dragonball: Fix manual implementation of .is_multiple_of Use this new method to avoid the clippy warning and increase readability Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:19 +00:00
stevenhorsman	55be31ef0f	runtime-rs: Fix manual implementation of .is_multiple_of Use this new method to avoid the clippy warning and increase readability Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:19 +00:00
stevenhorsman	1d139a7c92	versions: Bump rust to 1.88 In prep for the bump to rust 1.90, try bumping to 1.88 first to see if the CI is successful here Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:19 +00:00
stevenhorsman	c6053e976f	dragonball: Improve vector initialisation Directly initialise a zero-filled vector, rather than resizing later Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:19 +00:00
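As a rough illustration of the pattern this commit describes (not the dragonball code itself; both function names are hypothetical), the two ways of obtaining a zero-filled vector compare as follows:

```rust
/// Before: allocate an empty vector, then grow and zero-fill it in a second step.
pub fn zeroed_with_resize(len: usize) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.resize(len, 0);
    buf
}

/// After: allocate and zero-fill in a single expression.
pub fn zeroed_direct(len: usize) -> Vec<u8> {
    vec![0; len]
}
```

Both functions return the same contents; the second states the intent directly and is the form Clippy suggests.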
stevenhorsman	18a51dad98	dragonball: Fix manual slice size calculation Using the built in size_of_val is easier to read and less error-prone than doing this calculation manually Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:19 +00:00
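A small sketch of the two calculations, assuming a slice of `u64` values purely for illustration (the element type and function names are not taken from the dragonball sources):

```rust
use std::mem::{size_of, size_of_val};

/// Manual calculation: element count multiplied by element size.
pub fn bytes_manual(entries: &[u64]) -> usize {
    entries.len() * size_of::<u64>()
}

/// Built-in calculation: the compiler derives the size from the value itself,
/// so the element type cannot drift out of sync with the slice.
pub fn bytes_of_val(entries: &[u64]) -> usize {
    size_of_val(entries)
}
```

The two functions agree for any slice, but the second keeps working unchanged if the element type is later modified.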
stevenhorsman	188c9e6eb7	dragonball: Prefer from over into From gives Into for free, so prefer this method Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:19 +00:00
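The reasoning can be shown with a minimal sketch; `QueueSize` is a hypothetical type invented for this example. Implementing `From` is enough, because the standard library's blanket implementation supplies the matching `Into`:

```rust
/// Hypothetical newtype used only for this illustration.
#[derive(Debug, PartialEq)]
pub struct QueueSize(pub u16);

/// Implementing `From<u16>` also makes `u16: Into<QueueSize>` available
/// through the standard library's blanket impl.
impl From<u16> for QueueSize {
    fn from(value: u16) -> Self {
        QueueSize(value)
    }
}
```

With only this impl in place, both `QueueSize::from(256)` and `let q: QueueSize = 256u16.into();` compile and produce equal values.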
stevenhorsman	c7daa12fe6	dragonball: Remove unnecessary cast Don't cast usize to usize Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:19 +00:00
stevenhorsman	6c19bd01c8	dragonball: Fix redundant pattern matching Convert `matches!(desc, None)` to desc.is_none() which is simpler Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:19 +00:00
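A minimal before/after sketch of this lint fix, with `Option<u32>` standing in for the descriptor type (an assumption, as are the function names):

```rust
/// Before: pattern matching just to test for `None`.
pub fn is_missing_verbose(desc: Option<u32>) -> bool {
    matches!(desc, None)
}

/// After: the dedicated method says the same thing more directly.
pub fn is_missing(desc: Option<u32>) -> bool {
    desc.is_none()
}
```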
stevenhorsman	15c6ef5988	dragonball: Fix `deprecated cargo-clippy` cfg #[cfg(feature = "cargo-clippy")] has been deprecated for years, so should be replaced with `#[cfg(clippy)]` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:19 +00:00
stevenhorsman	e0d09dd787	dragonball: Fix useless use of `vec!` `vec![...]` is the same as `[...]`, so remove it to clean up code Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:19 +00:00
stevenhorsman	4fb90d61aa	dragonball: Temporarily skip kvm bindgen tests There are many, many null pointer dereferences in the bindgen code when moving between rust 1.85.1 and 1.86, and no docs of the source that it was generated from, so try to skip these tests until an SME can look at them @lifupan Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:19 +00:00
stevenhorsman	04306c162b	genpolicy: Fix uninlined_format_args Clippy is recommending that format args are inlined for better clarity, so update our code to remove these warnings Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:11 +00:00
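For reference, the `uninlined_format_args` change amounts to moving plain variables into the format string. The function names and message text below are made up for illustration:

```rust
/// Before: positional arguments listed after the format string.
pub fn describe_old(name: &str, id: u32) -> String {
    format!("sandbox {} has id {}", name, id)
}

/// After: the variables are captured inline, which is what Clippy recommends.
pub fn describe_inlined(name: &str, id: u32) -> String {
    format!("sandbox {name} has id {id}")
}
```

Both forms produce identical output; only plain identifiers can be inlined this way, so expressions such as field accesses still need the positional form.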
stevenhorsman	b9ce0bbdf8	trace-forwarder: Fix uninlined_format_args in examples Clippy is recommending that format args are inlined for better clarity, so update our code to remove these warnings Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:11 +00:00
stevenhorsman	c5f0acef23	kata-ctl: Fix uninlined_format_args Clippy is recommending that format args are inlined for better clarity, so update our code to remove these warnings Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:02 +00:00
stevenhorsman	aff3524420	kata-ctl: Refresh runtime-rs crates runtime-rs crates are pulled into kata-ctl and some of these have bumped recently, so update these in kata-ctl as well Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:50:01 +00:00
stevenhorsman	2caa62f753	agent-ctl: Fix uninlined_format_args Clippy is recommending that format args are inlined for better clarity, so update our code to remove these warnings Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:49:52 +00:00
stevenhorsman	6006b8350d	libs: Fix uninlined_format_args Clippy is recommending that format args are inlined for better clarity, so update our code to remove these warnings Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:49:45 +00:00
stevenhorsman	2fde31547a	runtime-rs: Fix uninlined_format_args Clippy is recommending that format args are inlined for better clarity, so update our code to remove these warnings Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:49:36 +00:00
stevenhorsman	a299338b6c	dragonball: Fix uninlined_format_args Clippy is recommending that format args are inlined for better clarity, so update our code to remove these warnings Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:49:27 +00:00
stevenhorsman	e44c4d901f	doc: Fix uninlined_format_args in examples Clippy is recommending that format args are inlined for better clarity, so ensure our docs include this Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:49:27 +00:00
stevenhorsman	b07899f8dc	agent: Fix uninlined_format_args Clippy is recommending that format args are inlined for better clarity, so update our code to remove these warnings Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:49:17 +00:00
stevenhorsman	2af88dbb48	agent: bump cdi-rs In #12151 the version was bumped in cargo.toml, but the update not done, so run `cargo update -p container-device-interface` to apply it Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-20 10:08:45 +00:00
Steve Horsman	97603608ac	Merge pull request #12259 from RuoqingHe/filter-tests-requires-kvm dragonball: Skip tests require kvm while kvm is absent	2025-12-19 16:05:33 +00:00
Steve Horsman	81d74346f3	Merge pull request #12255 from stevenhorsman/bump-to-rust-1.90-prep Preparations for the rust 1.90 bump	2025-12-19 14:41:32 +00:00
Steve Horsman	b75cc16bad	Merge pull request #12272 from shwetha-s-poojary/revert_cleanup workflows: payload: do not remove AGENT_TOOLSDIRECTORY	2025-12-19 14:22:36 +00:00
shwetha-s-poojary	1929ca8879	workflows: payload: do not remove AGENT_TOOLSDIRECTORY Remove line that deletes $AGENT_TOOLSDIRECTORY Signed-off-by: shwetha-s-poojary <shwetha.s-poojary@ibm.com>	2025-12-19 05:24:36 -08:00
Alex Lyn	b85084f046	Merge pull request #12266 from BbolroC/fix-selective-skip-for-empty-dir-test tests: remove re-delcared local variable in k8s-empty-dirs.bats	2025-12-19 17:30:07 +08:00
Hyounggyu Choi	3fa1d93f85	tests: remove re-delcared local variable in k8s-empty-dirs.bats Since #12204 was merged, the following error has been observed: ``` bats warning: Executed 1 instead of expected 2 tests [run_kubernetes_tests.sh:162] ERROR: Tests FAILED from suites: k8s-empty-dirs.bats ``` The cause is that `pod_logs_file` is re-declared as a local variable in the second test before skipping, which makes it inaccessible in `teardown()` and leads to an error. This commit removes the re-declaration of the variable. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-12-18 18:57:16 +01:00
Fabiano Fidêncio	51e9b7e9d1	nydus-snapshotter: Bump to v0.15.10 As it brings a fix that most likely can workaround the containerd / nydus-snapshotter databases desynchronization. Reference: https://github.com/containerd/nydus-snapshotter/pull/700 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-18 18:41:09 +01:00
Fabiano Fidêncio	03297edd3a	kata-deploy: rust: Add list verb for runtimeclasses RBAC The Rust kata-deploy binary calls list_runtimeclasses() during NFD setup, but the ClusterRole only granted get and patch permissions. Add the list verb to the runtimeclasses resource permissions to fix the RBAC error: runtimeclasses.node.k8s.io is forbidden: User \"system:serviceaccount:kube-system:kata-deploy-sa\" cannot list resource \"runtimeclasses\" in API group \"node.k8s.io\" at the cluster scope Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-18 18:31:52 +01:00
Ruoqing He	5fa663b1e3	dragonball: Skip tests requires KVM when KVM is absent KVM is not available in our ARM runners, let's skip those tests accordingly, while making the rest test cases remain tested on machines with KVM present and access to KVM device. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-12-18 14:17:46 +00:00
Ruoqing He	7cfb97d41b	libs: Introduce skip_if_kvm_unaccessable macro Some test cases require interaction with the KVM device; introduce the skip_if_kvm_unaccessable macro to skip them. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-12-18 12:43:20 +00:00
Manuel Huber	78c41b61f4	tests: nvidia: Update images, probes and timeouts Changes in NIM/RAG samples: - update image references - update memory requirements, timeouts, model name - sanitize some of the probes and print-out Further refinements can be made in the future. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-18 10:57:14 +01:00
Manuel Huber	0373428de4	tests: nvidia: Use secret for NGC API key This is a slight change in the manifest to at least use a secret for the environment variable. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-18 10:57:14 +01:00
Hyounggyu Choi	56ec8d7788	Merge pull request #12204 from kata-containers/runtime-rs-stability-debug CI: Upgrade log details for improved error analysis	2025-12-18 10:54:54 +01:00
Alex Lyn	c7dfdf71f5	Merge pull request #11935 from burgerdev/fsgroup genpolicy: support fsGroup setting in pod security context	2025-12-18 16:47:48 +08:00
stevenhorsman	e5568e65a1	lib: Fix missing copyright and license Add the copyright date from when the file was first submitted to github Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	175c2c70b1	dragonball: Fix pointer equality check Use `ptr::eq` to compare references by address rather than the values that they point to Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	a221eaa81d	dragonball: Fix length comparison to zero Replace .len() == 0 with .is_empty() for more clarity Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	e73a7c3717	dragonball: Replace manual div_ceil Use the more clear built-in method Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	048000654c	runtime-rs: Prevent doc test issue cargo test was trying to evaluate the documentation comment and failing, so try and make the comment explicitly text to avoid this Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	4384b6ad9f	dragonball: Avoid manual implementation of ok Refactor to use `.ok()` rather than implementing it ourselves Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	f4dd69a835	dragonball: Remove unnecessary unwrap Given that we call `is_some` earlier, we don't then need to unwrap, so refactor to avoid this Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	20192f819f	agent-ctl: Remove unnecessary unwrap Given that we call `is_some` earlier, we don't then need to unwrap, so refactor to avoid this Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	9bf5f113f9	genpolicy: Allow dead_code A few structs in genpolicy are never constructed, so add `#[allow(dead_code)]` to prevent this clippy warning Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	ca1c0c853f	libs: Remove doc overindentation The doc comment had one space too many in its list, so the format was wrong Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	501b41cf8f	dragonball: Remove doc overindentation The doc comment had one space too many in its list, so the format was wrong Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	6a45ee0874	runtime-rs: Improve map iteration The key was never used, just the value, so just iterate over `.values()` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	2f49dffcd7	runtime-rs: Remove dead code `VmmPingResponse` and `NetInterworkingModel` are never constructed, so remove them Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	35557745b1	runtime-rs: Fix char_indices_as_byte_indices In unicode you can have multi-byte characters, so it's better to use char_indices than enumerate the bytes Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	69ca6c0de0	runtime-rs: Fix manual_contains Use contains to be more concise and efficient rather than manually implementing this check Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	0027f6cae0	agent: Fix dead_code warning VirtioBlkCcwDeviceHandler and VirtioBlkCcwHandler are only constructed on s390x, so add #[cfg(target_arch = "s390x")] to all the code Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	3b2c83f9d2	trace-forwarder: Fix clippy::io_other_error issue We can use the new `Error::other` rather than `Error::new(ErrorKind::Other, ...)` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:26 +00:00
stevenhorsman	b1cfa98524	runtime-rs: Fix clippy::io_other_error issue We can use the new `Error::other` rather than `Error::new(ErrorKind::Other, ...)` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:26 +00:00
stevenhorsman	dc8f628dd1	libs: Fix clippy::io_other_error issue We can use the new `Error::other` rather than `Error::new(ErrorKind::Other, ...)` and drop our own macro that did this mapping Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:26 +00:00
stevenhorsman	5f1d3481af	dragonball: Fix clippy::io_other_error issue We can use the new `Error::other` rather than `Error::new(ErrorKind::Other, ...)` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:26 +00:00
stevenhorsman	9ec7109712	agent: Fix clippy::io_other_error issue We can use the new `Error::other` rather than `Error::new(ErrorKind::Other, ...)` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:26 +00:00
stevenhorsman	34d299ae44	vsock-exporter: Fix clippy::io_other_error issue We can use the new `Error::other` rather than `Error::new(ErrorKind::Other, ...)` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:26 +00:00
stevenhorsman	b2f9f23504	dragonball: Fix `mismatched_lifetime_syntaxes` issue Fix the `warning: hiding a lifetime that's elided elsewhere is confusing` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:26 +00:00
stevenhorsman	8bbbc3a58b	lib: Fix `mismatched_lifetime_syntaxes` issue Fix the warning throw up: ``` warning: hiding a lifetime that's elided elsewhere is confusing --> /root/go/src/github.com/kata-containers/kata-containers/src/libs/kata-types/src/utils/u32_set.rs:50:17 \| 50 \| pub fn iter(&self) -> Iter<u32> { \| ^^^^^ --------- the same lifetime is hidden here \| \| \| the lifetime is elided here \| = help: the same lifetime is referred to in inconsistent ways, making the signature confusing = note: `#[warn(mismatched_lifetime_syntaxes)]` on by default help: use `'_` for type paths \| 50 \| pub fn iter(&self) -> Iter<'_, u32> { \| +++ ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:26 +00:00
Xuewei Niu	a65c2b06b8	Merge pull request #12169 from zhangls-0524/new-fix-issue-11996 runtime-rs: Block Device Rootfs Mount Options Lost During Storage Object Creation	2025-12-18 10:09:38 +08:00
Fabiano Fidêncio	0e534fa7fe	versions: Update virtiofsd to v1.13.3 Update virtiofsd to its latest release. Here we also need to update the alpine version used by the builder as we need a version of musl-dev new enough to have wrappers for pread2 and pwrite2. As bumping, bump to the latest. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-18 00:51:08 +01:00
Fabiano Fidêncio	1d2e19b07c	versions: Update pause image to 3.10.1 Update pause image to its latest release. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-18 00:51:08 +01:00
Fabiano Fidêncio	6211c10904	versions: Update libseccomp to 2.6.0 Update libseccomp to its latest release. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-18 00:51:08 +01:00
Fabiano Fidêncio	0e0a92533c	versions: update lvm2 to v2_03_38 Update lvm2 to its latest release. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-18 00:51:08 +01:00
Fabiano Fidêncio	142c7d6522	versions: Update gperf to 3.3 Update gperf to its latest release. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-18 00:51:08 +01:00
Fabiano Fidêncio	e757485853	versions: Update cryptsetup to v2.8.1 Update cryptsetup to its latest release Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-18 00:51:08 +01:00
Fabiano Fidêncio	35cd5fb1d4	versions: Update helm to v4.0.4 Update helm to its latest release Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-18 00:51:08 +01:00
Tobin Feldman-Fitzthum	decc09e975	tests: cc: add test with SNP reference values Add two attestation tests. The first one sets a resource policy that requires CPU0 to have an affirming trust level. This is a negative test which can run on any platform. Setting this policy without setting any reference values should result in an attestation failure. Next, a second test will set the same policy, but this time it will use the journal log to find the QEMU command line from the previous test and calculate the expected reference values. Currently this is only supported on SNP using the sev-snp-measure tool, but the same flow should work on other platforms. Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>	2025-12-18 00:12:11 +01:00
Ruoqing He	8b0d650081	dragonball: Use unique name for vhost path The five tests are set to the same vhost socket path, which could lead to racing with one another. Use unique name to avoid this. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-12-17 22:25:55 +01:00
Fabiano Fidêncio	320f1ce2a3	versions: Bump experimental {tdx,snp} QEMU Let's bump experimental {tdx,snp} QEMU to the tags created today in the Confidential Containers repo, which match QEMU 10.2.0-rc3. This bump is mostly for early testing what will become 10.2.0, which will be bumped everywhere then. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-17 17:42:04 +01:00
Alex Lyn	3696d9143a	tests: Correct the teardown_common in cpu-ns.bats It will address the issue: "# bats warning: Executed 0 instead of expected 1 tests" Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com> Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-17 16:14:10 +00:00
Alex Lyn	a28f24ef8c	tests: move the get_pod_config_dir into setup_common As each case need such preparation of get_pod_config_dir, a better method is directly move it into the setup_common method. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-17 16:14:10 +00:00
Alex Lyn	5778b0a001	tests: Introduce measure_node_time to get test case end time To measure the duration for journal, we need clearly print the journal start time and end time for each case which helps to ensure the journal log is for the specified period for the case. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-17 16:14:10 +00:00
Alex Lyn	648f0913ca	tests: Load lib.rs in bats to ensure related function available The lib.rs should be first loaded before execute some functions call. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-17 16:14:10 +00:00
Alex Lyn	0929c84480	runtime-rs: Reduce output log and increase log level For failure cases within CI, we need dump the kata log to help address issues, but currently large log messages cause partial log we can see. We remove initdata log output and increase log level to reduce log output. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-17 16:14:10 +00:00
Alex Lyn	bbec15d695	tests: delete policy_settings_dir only for first test case Currently policy_settings_dir is created only when BATS_TEST_NUMBER == "1", but delete_tmp_policy_settings_dir "${policy_settings_dir}" is called in teardown() for every test. This means that for tests after the first one teardown() may attempt to delete a directory that was already removed by a previous test, or rely on a value that does not belong to the current test execution. Adjust teardown logic so that policy_settings_dir is only deleted for the first test case (BATS_TEST_NUMBER == "1") and ignored for subsequent tests. This keeps the original optimization of running genpolicy only once, while avoiding unnecessary or confusing cleanup attempts in later test cases. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-17 16:14:10 +00:00
Alex Lyn	24e68b246f	tests: Add missing bin env at the head of bats Add the missing part of `#!/bin/bash/env` in bats. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-17 16:14:10 +00:00
Alex Lyn	93ba6a8e76	tests: Make pod_name a global variable The previous pod_name was declared as local, so it could not be accessed within the teardown() function, causing failure. This commit just removes the `local pod_name` declaration to make it a global variable. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-17 16:14:10 +00:00
Alex Lyn	89dce4eff6	tests: Enhance debug log output Introduce setup_common in setup() and teardown_common() in teardown() to get enough log to help debug Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-17 16:14:10 +00:00
Fabiano Fidêncio	88cdfab604	runtime: nvidia: Align static_sandbox_resource_mgmt Let's ensure we have those aligned for both CC and non-CC use-case. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-17 17:04:51 +01:00
Fabiano Fidêncio	995770dbeb	runtime: nvidia: Use cold-plug by default Now that we have the way to do cold-plug, let's ensure we also use it for the non-CC use case. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-17 17:04:51 +01:00
Hyounggyu Choi	7f72acc266	Merge pull request #12180 from BbolroC/enable-vfio-ap-passthrough-runtime-rs runtime-rs: Enable VFIO-AP passthrough (hotplug only) on s390x	2025-12-17 15:50:10 +01:00
Hyounggyu Choi	f1b4327dba	Merge pull request #12247 from fidencio/topic/ci-store-the-tarballs-we-rely-on-on-gchr-follow-up build: Fix GPG key for gperf & Pass PUSH_TO_REGISTRY and GH_TOKEN to Docker builds	2025-12-17 13:53:58 +01:00
Fabiano Fidêncio	5415cf4e0f	workflows: payload: Remove unneeded stuff from the runner Otherwise we may hit a `no space left on device` when building the rust kata-deploy binary. This happens mostly because of the multi-stage build used to generate a distroless final container. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-17 09:57:02 +01:00
Fabiano Fidêncio	98c5276546	helm: runtimeclasses: Match the kata-deploy rust deployment There we ensure labels are added to better deal with ownership of the runtimeclasses. It's not strictly needed here as helm does take care of the ownership, but also doesn't hurt to follow what seems to be a common practice. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-17 09:57:02 +01:00
Fabiano Fidêncio	6130d7330f	ci: Run a nightly job using the kata-deploy rust Let's shamelessly duplicate the nightly job to have at least nightly runs using the rust implementation of kata-deploy. The reason for doing that is to be pragmatic, as pragmatic as possible, and avoid switching away of the scripts before 3.24.0 release, while still testing both ways till the switch happens. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-17 09:57:02 +01:00
Fabiano Fidêncio	fbc29f3f5e	kata-deploy: helm: Adapt to the rust binary Differently than the scripts, which are called as `bash -c ...`, the kata-deploy rust binary must be invoked directly, as we do not even have a shell in its container. For now, the image that uses the rust version has the "-rust" suffix, which will help us to have both ways being used / tested for a little while. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-17 09:57:02 +01:00
Fabiano Fidêncio	9d88c6b1d7	kata-deploy: Oxidize the script kata-deploy shell script is not THAT bad and, to be honest, it's quite handy for quick hacks and quick changes. However, it's been increasingly becoming harder to maintain as it's grown its scope from a testing tool to the proper project's front door, lacking unit tests, and with an abundance of complex regular expressions and bashisms to be able to properly parse the environment variables it consumes. Moreover, the fact it is a Frankenstein's monster glued together using python packages, golang binaries, and a distro dependent container makes the situation VERY HARD to use it from a distroless container (thus, avoiding security issues), preventing further integration with components that require a higher standard of security than we've been requiring. With everything said, with the help of Cursor (mostly on generating the test cases), here comes the oxidized version of the script, which runs from a distroless container image. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-17 09:57:02 +01:00
Fabiano Fidêncio	c9cd79655d	build: Pass PUSH_TO_REGISTRY and GH_TOKEN to Docker builds The ORAS cache helper needs PUSH_TO_REGISTRY to be set to 'yes' to push new artifacts to the cache. However, this environment variable was not being passed to the Docker container during agent, tools, and busybox builds. Moreover, for ghcr.io authentication, add support for using GH_TOKEN and GITHUB_ACTOR as fallbacks when explicit credentials (ARTEFACT_REGISTRY_USERNAME/PASSWORD) are not provided. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-16 21:58:16 +01:00
Fabiano Fidêncio	b11cea3113	build: Fix GPG key for gperf The GPG key used for gperf was incorrectly set to the busybox maintainer's key (Denis Vlasenko) instead of the gperf maintainer's key (Marcel Schaible). Wrong key (busybox): C9E9416F76E610DBD09D040F47B70C55ACC9965B Denis Vlasenko <vda.linux@googlemail.com> Correct key (gperf): EDEB87A500CC0A211677FBFD93C08C88471097CD Marcel Schaible <marcel.schaible@studium.fernuni-hagen.de> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-16 21:58:16 +01:00
Fabiano Fidêncio	6e01ee6d47	helm: Provide kata-remote runtime class kata-remote is a runtime class that cloud-api-adaptor relies on to work. kata-remote by itself does nothing, and that's the reason it's disabled by default. We're only adding it here so cloud-api-adaptor charts can simply do something like `--set shims.remote.enabled=true`. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-16 21:57:49 +01:00
Fabiano Fidêncio	0a0fcbae4a	gatekeeper: Adjust to kata-tools A few jobs have been renamed as part of the kata-tools split. Let's add them all here. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-16 18:22:40 +01:00
Fabiano Fidêncio	fb326b53df	agent: Ensure MS_REMOUNT is respected When updating ephemeral storages, MS_REMOUNT is explicitly passed as, for instance, `/dev/shm` should be remounted after memory is hotplugged. Till now Kata Containers has been explicitly ignoring such updates, leading to the containers' `/dev/shm` having the size of "half of the memory allocated, during the startup time", which goes against the expected behaviour. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-12-16 15:11:34 +01:00
Fabiano Fidêncio	830d15d4c8	tests: Adapt to using kata-tools Instead of relying on the fully bloated kata tarball. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-16 12:55:07 +01:00
Fabiano Fidêncio	a2534e7bc8	kata-tools: Release as its own tarball We're only releasing those for amd64 as that's the only architecture we've been building the packages for. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-16 12:55:07 +01:00
Fabiano Fidêncio	6d2f393be4	build: Split tools build from the other artefacts build Let's ensure we can create a specific "tools" tarball, which will help those who only need to pull those either for testing or production usage. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-16 12:55:07 +01:00
Ruoqing He	6d2c66c7eb	runtime-rs: Refactor feature propagation After runtime-rs workspace merged into root workspace, features passed when building runtime-rs need to be refactored to be correctly propagated. Taking dragonball for example, runtime-rs requires runtimes to depend on the virt_containers feature, and virt_containers needs to handle hypervisor features specifically. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-12-16 11:26:07 +01:00
Ruoqing He	1872af7c5a	ci: Install cmake before building runtime-rs cmake is required for libz-sys to compile (which is required by nydus). Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-12-16 11:26:07 +01:00
Ruoqing He	9551f97e87	runtime-rs: Change TARGET_PATH to root workspace After the workspace integration of runtime-rs, now the output of runtime-rs is under the repo root, instead of src/runtime-rs. Change the TARGET_PATH accordingly to tell Makefile where to lookup output. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-12-16 11:26:07 +01:00
Ruoqing He	c7c02ac513	dragonball: Skip tests needs kvm under non-root Some test cases in the dragonball crates require interaction with the KVM module, which requires root privileges. Skip those tests when running as a non-root user. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-12-16 11:26:07 +01:00
Ruoqing He	889c3b6012	dragonball: Fix false use statement on aarch64 gic::create_gic is actually gated behind dbs_arch crate, instead of arch::aarch64. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-12-16 11:26:07 +01:00
Ruoqing He	1c1f3a2416	dragonball: Allow missing_docs for dummy MMIODeviceInfo MMIODeviceInfo inside the test module of dbs_boot on aarch64 is used for testing purpose, but `pub` attribute requires it to have documentation. Since this is used only for testing purpose, let's allow missing_docs for it. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-12-16 11:26:07 +01:00
Ruoqing He	6d0cb18c07	dragonball: Add missing test module attribute Test set of dbs_utils's tap module is missing test attribute, which makes dev-dependencies unusable. Marking tests of tap as test module. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-12-16 11:26:07 +01:00
Ruoqing He	15fe7ecda1	runtime-rs: Remove lockfile Remove Cargo.lock since it now shares lockfile workspace-wise. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-12-16 11:26:07 +01:00
Ruoqing He	beb0cac0d1	build: Move runtime-rs to root workspace This is a follow-up of `3fbe693`. Remove runtime-rs from exclude list, and make it as a member of root workspace. Specify shim and shim-ctl as the binary of runtime-rs package, make runtime-rs and all its members into root workspace. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-12-16 11:26:07 +01:00
Ruoqing He	ae4b3e9ac0	runtime-rs: Make runtime-rs a package Make runtime-rs a package produces shim and shim-ctl as its binary product, which enables Makefile to work after it's incorporated into root workspace. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-12-16 11:26:07 +01:00
shezhang.lau	9744e9f26d	runtime-rs: Block Rootfs Mount Options During Storage Object Creation Init the storage options with original rootfs options. Addition: XFS, append nouuid to the mount options if not exist. Signed-off-by: shezhang.lau <shezhang.lau@antgroup.com>	2025-12-16 13:57:02 +08:00
Xuewei Niu	c8b5f8efad	Merge pull request #12167 from M-Phansa/main runtime-rs: handle container missing during kill_process gracefully	2025-12-16 10:31:50 +08:00
Fabiano Fidêncio	1388a3acda	packaging: Add ORAS cache for gperf and busybox tarballs To protect against upstream download failures for gperf and busybox, implement ORAS-based caching to GHCR. This adds: - download-with-oras-cache.sh: Core helper for downloading with cache - populate-oras-tarball-cache.sh: Script to manually populate cache - warn() function to lib.sh for consistency Modified build scripts to: - Try ORAS cache first (from ghcr.io/kata-containers/kata-containers) - Fall back to upstream download on cache miss - Automatically push to cache when PUSH_TO_REGISTRY=yes The cache is automatically populated during CI builds, and parallel architecture builds check for existing versions before pushing to avoid race conditions. Forks benefit from upstream cache but can override with their own: ARTEFACT_REPOSITORY=myorg/kata make agent-tarball Generated-By: Cursor IDE with Claude Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-15 22:04:21 +01:00
Markus Rudy	661e851445	genpolicy: support fsGroup setting in pod security context The runtime handles the fsGroup field of the pod security context by adding a mount option to the generated storage object [1]. This commit changes genpolicy to expect this option. Instead of passing another side input to yaml::get_container_mounts_and_storages, we pass the entire PodSpec. This reduces the necessary changes in the pod-generating resources and allows for possible future use of other PodSpec fields. [1]: https://github.com/kata-containers/kata-containers/blob/0c6fcde1/src/runtime/virtcontainers/kata_agent.go#L1620-L1625 Fixes: #11934 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-12-15 15:22:33 +01:00
Fabiano Fidêncio	a25a53c860	kata-deploy: sa: Fix permissions for patching nodefeaturerules I've seen this happening with the GPU SNP CI every now and then, but I don't really understand how this was not caught by the TDX / SNP CI themselves before. In any case, the error seen is: ``` Error from server (Forbidden): error when applying patch: {"metadata":{"annotations":{"kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"nfd.k8s-sigs.io/v1alpha1\",\"kind\":\"NodeFeatureRule\",\"metadata\":{\"annotations\":{},\"name\":\"amd64-tee-keys\"},\"spec\":{\"rules\":[{\"extendedResources\":{\"sev-snp.amd.com/esids\":\"@cpu.security.sev.encrypted_state_ids\"},\"labels\":{\"amd.feature.node.kubernetes.io/snp\":\"true\"},\"matchFeatures\":[{\"feature\":\"cpu.security\",\"matchExpressions\":{\"sev.snp.enabled\":{\"op\":\"Exists\"}}}],\"name\":\"amd.sev-snp\"},{\"extendedResources\":{\"tdx.intel.com/keys\":\"@cpu.security.tdx.total_keys\"},\"labels\":{\"intel.feature.node.kubernetes.io/tdx\":\"true\"},\"matchFeatures\":[{\"feature\":\"cpu.security\",\"matchExpressions\":{\"tdx.enabled\":{\"op\":\"Exists\"}}}],\"name\":\"intel.tdx\"}]}}\n"}}} to: Resource: "nfd.k8s-sigs.io/v1alpha1, Resource=nodefeaturerules", GroupVersionKind: "nfd.k8s-sigs.io/v1alpha1, Kind=NodeFeatureRule" Name: "amd64-tee-keys", Namespace: "" for: "/opt/kata-artifacts/node-feature-rules/x86_64-tee-keys.yaml": error when patching "/opt/kata-artifacts/node-feature-rules/x86_64-tee-keys.yaml": nodefeaturerules.nfd.k8s-sigs.io "amd64-tee-keys" is forbidden: User "system:serviceaccount:kube-system:kata-deploy-sa" cannot patch resource "nodefeaturerules" in API group "nfd.k8s-sigs.io" at the cluster scope ``` And the fix is as simple as allowing patching and updating a nodefeaturerule in our service account RBAC. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-15 12:01:20 +01:00
Alex Lyn	f4f61d5666	Merge pull request #12229 from fidencio/topic/kata-deploy-do-deprecations kata-deploy: Remove deprecated features from 3.23.0	2025-12-15 19:00:07 +08:00
Hyounggyu Choi	b69da5f3ba	gatekeeper: Make s390x e2e tests required again Since the CI issue for s390x was resolved on Dec 5th, the nightly test result has gone green for 10 consecutive days. This commit puts the e2e tests for s390x again into the required job list. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-12-15 11:12:25 +01:00
Fabiano Fidêncio	ded6d1636f	kata-deploy: Remove deprecated features from 3.23.0 Let's remove the deprecated features that were marked for removal after Kata Containers 3.23.0: kata-deploy.sh: - Remove non-arch-specific variable fallbacks (SHIMS, DEFAULT_SHIM, SNAPSHOTTER_HANDLER_MAPPING, ALLOWED_HYPERVISOR_ANNOTATIONS, PULL_TYPE_MAPPING, EXPERIMENTAL_FORCE_GUEST_PULL). Each arch now has its own default value. - Remove CREATE_RUNTIMECLASSES and CREATE_DEFAULT_RUNTIMECLASS variables and associated functions (create_runtimeclasses, delete_runtimeclasses, adjust_shim_for_nfd). RuntimeClasses are now managed by Helm chart, not the daemonset script. - Unsupported architectures now fail with an error instead of falling back to non-arch-specific defaults. Helm chart: - Remove all deprecated env values (createRuntimeClasses, createDefaultRuntimeClass, debug, shims, shims_, defaultShim, defaultShim_, allowedHypervisorAnnotations, snapshotterHandlerMapping, snapshotterHandlerMapping_, agentHttpsProxy, agentNoProxy, pullTypeMapping, pullTypeMapping_, _experimentalSetupSnapshotter, _experimentalForceGuestPull, _experimentalForceGuestPull_*). - Remove backward compatibility code from _helpers.tpl that checked for legacy env values. - Remove legacy env.shims check from runtimeclasses.yaml. - Remove CREATE_RUNTIMECLASSES and CREATE_DEFAULT_RUNTIMECLASS env vars from kata-deploy.yaml and post-delete-job.yaml. - Update RBAC to only include runtimeclasses get/patch permissions (needed for NFD patching), removing create/delete/list/update/watch. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-13 16:32:00 +01:00
Adeet Phanse	db09912808	agent: add SandboxError enum for typed error handling - Replace generic errors in sandbox operations with typed SandboxError variants (InvalidContainerId, InitProcessNotFound, InvalidExecId). - This enables the kata shim to handle specific failure cases differently. Fixes #12120 Signed-off-by: Adeet Phanse <adeet.phanse@mongodb.com>	2025-12-12 12:33:18 -05:00
Adeet Phanse	5b7e1cdaad	runtime-rs: handle container missing during kill_process gracefully Add better error handling to runtime rs to handle when the sandbox itself is killed and recreated. - Update the kill_process function to skip sending a signal when the process is stopped. - Always set ProcessStatus::Stopped even when wait_process fails - In state_process return synthetic state for sandbox container when using Sandbox API Fixes #12120 Signed-off-by: Adeet Phanse <adeet.phanse@mongodb.com>	2025-12-12 12:33:17 -05:00
Fabiano Fidêncio	c7d0c270ee	release: Bump version to 3.24.0 Bump VERSION and helm-chart versions Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-12 18:15:41 +01:00
Fabiano Fidêncio	50b853eb93	tests: nvidia: Always rely on the "kata" default runtime class This is a pattern already followed by all the other tests. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-12 16:31:42 +01:00
Manuel Huber	ff2396aeec	tests: nvidia: Declare KATA_HYPERVISOR variable Align with other test logic - declare the KATA_HYPERVISOR in the run bash script, then declare the RUNTIME_CLASS_NAME variable in the bats files. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-12 16:31:42 +01:00
Manuel Huber	6e31cf2156	tests: nvidia: cc: USE is_confidential_gpu_hw This function has recently been introduced, so we align patterns. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-12 16:31:42 +01:00
Manuel Huber	cd1f55b41c	tests: nvidia: cc: Set GPU0 policy for NIM tests Now that we have a more restrictive resource policy for KBS, let us start adopting it across all NVIDIA test cases. This policy was previously introduced by the NVIDIA attestation test. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-12 16:31:42 +01:00
Manuel Huber	edbac264cb	tests: nvidia: cc: Remove KBS variable The variable is now set in the CI YAML file, thus removing the assignment. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-12 16:31:42 +01:00
Manuel Huber	9665b74653	tests: nvidia: cc: address shellcheck warnings Address shellcheck warnings for run_kubernetes_nv_tests.sh Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-12 16:31:42 +01:00
Manuel Huber	5f9e7a03a8	tests: nvidia: do not use teardown_common Clean up in each NVIDIA bats file according to our needs. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-12 16:31:42 +01:00
Alex Lyn	c3fd4c1621	version: Bump rtnetlink and netlink-packet-route It aims to upgrade rtnetlink to mitigate netlink log noise. This commit upgrades the `rtnetlink` dependency (and corresponding libraries like `netlink-packet-route`) to address excessive and unnecessary netlink-related logging during sandbox startup. Problem: The previously used `rtnetlink v0.16` (depending on `netlink-proto v0.11.3`) generates a high volume of DEBUG/INFO level netlink messages during sandbox initialization. This noise: 1. Overloads the logging system, often leading to warnings like "slog-async: logger dropped messages due to channel overflow." 2. Interferes with effective troubleshooting by distracting developers from legitimate Kata errors. Solution: We upgrade to `rtnetlink v0.19` (and `netlink-proto v0.12`), as testing confirms that the latest versions have correctly elevated the verbosity of these netlink internal events to the TRACE level. This change significantly enhances the log analysis experience by suppressing unnecessary network-related logs during startup. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-12 14:27:33 +01:00
Manuel Huber	1781fb8b06	tests: nvidia: cc: Use CUDA image from NVCR Pull from nvcr.io to avoid hitting unauthenticated pull rate limits. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-12 12:52:33 +01:00
Manuel Huber	f63f95f315	tests: nvidia: cc: generate pod security policies With these changes, we create pod security policies when running against NVIDIA TEE GPU handlers where AUTO_GENERATE_POLICY is set. For the non-TEE GPU tests, the added functions bail out by design. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-12 12:52:33 +01:00
Manuel Huber	bf26ad9532	nvidia: tests: remove outer CDI annotations With the new device plugin being used by CI runners, these annotations are no longer necessary. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-12 12:52:33 +01:00
Manuel Huber	37b4f6ae8b	tests: Adapt NVIDIA common policy settings Following existing patterns, we adapt the common policy settings for NVIDIA GPU CI platforms. For instance, for our CI runners, we use containerd 2.x. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-12 12:52:33 +01:00
Manuel Huber	f4c0c8546e	tests: Enable AUTO_GENERATE_POLICY for NVIDIA TEEs Enable auto-generate policy for qemu-nvidia-gpu-* if the user didn't specify an AUTO_GENERATE_POLICY value. Setting this in run_kubernetes_nv_tests.sh is too late as gha-run.sh calls into run_tests, setup.sh, and then into create_common_genpolicy_settings() where the rules.rego and genpolicy-settings file are being copied to the right locations. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-12 12:52:33 +01:00
Manuel Huber	b9774e44b6	genpolicy: tests: Add VFIO passthrough test cases Add one valid test case with 2 GPUs with proper VFIO device entries and CDI annotations. Add seven test cases with invalid combinations of VFIO device entries and CDI annotations. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-12 12:52:33 +01:00
Manuel Huber	d3e6936820	genpolicy: validation of vfio passthrough GPUs Add rules for vfio passthrough GPUs. When creating the security policy document, parse GPU resource limits and derive CDI annotation patterns and VFIO device entries. With various values for CDI annotations and device paths being runtime-dependent, use regular expressions. For now, this enables passthrough of NVIDIA GPUs, but the changes are designed to allow for other VFIO device types. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-12 12:52:33 +01:00
Alex Lyn	82e8e9fbe0	doc: add block device's settings to the doc page Add the block-device-specific annotations for num_queues and queue_size, which are specific to runtime-rs, to the document to help users set the two parameters. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-11 21:10:22 +01:00
Alex Lyn	a8a458664d	kata-types: Allow dynamic queue config via Pod annotations This commit introduces the capability to dynamically configure `queue_size` and `num_queues` parameters via Pod annotations. Currently, `kata-runtime` allows for static configuration of `queue_size` and `num_queues` for block devices through its config file. However, a critical issue arises when a Pod is allocated fewer CPU cores than the statically configured `num_queues` value. In such scenarios, the Pod fails to start, leading to operational instability and limiting flexibility in resource allocation. To address this, this feature enables users to override the default `queue_size` and `num_queues` parameters by specifying them in Pod annotations. This allows for fine-grained control and dynamic adjustment of these parameters based on the specific resource allocation of a Pod. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-11 21:10:22 +01:00
Steve Horsman	51459b9b15	Merge pull request #12220 from fidencio/topic/ci-arm64-temporarily-disable-arm64-non-k8s-tests ci: arm64-non-k8s: temporarily skip the tests	2025-12-11 11:35:39 +00:00
Fabiano Fidêncio	46c7d6c9f8	ci: arm64-non-k8s: temporarily skip the tests The runner is down for a few weeks. I may end up bringing in my personal runner, but I'm not confident I can easily do this before the holidays, thus I'm skipping the tests for now. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-11 12:14:32 +01:00
Manuel Huber	560f6f6c74	tests: nvidia: cc: Affirming attestation policy Set the attestation policy for GPU0 to affirming. This requires the GPU, for instance, to have production properties, such as properly signed VBIOS firmware. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-11 10:16:58 +01:00
Alex Lyn	751b6875f9	tests: Temporarily skip the cpu-ns test for the s390x platform Because this CI test fails continuously, we temporarily skip it on the s390x platform. It will be re-enabled once the related issues are addressed. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	d495b77135	runtime-rs: Align the default annotations with runtime-go As the default enable_annotations in runtime-rs differs from runtime-go, we should align it with the configuration in runtime-go. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	c8dd5fbacf	runtime-rs: Migrate vCPU tracking to fractional float This commit refactors the vCPU resource management within runtime's `CpuResource` structure and related calculation logic to use floating-point numbers (`f32`) instead of integers (`u32`). This migration is necessary to fully support the fractional vCPU allocation introduced in the `kata-types` library, ensuring better precision in: 1. Allocation Tracking: `current_vcpu` now tracks the precise fractional value (e.g., 1.5 vCPUs). 2. Resource Calculation: `calc_cpu_resources` now returns a precise `f32` sum of container vCPU requests, including normalization logic based on the maximum period, removing the previous integer rounding steps in the calculation. 3. Hypervisor Interaction: The integer vCPU requirement for the hypervisor remains, so `ceil()` is now explicitly applied only when interacting with the hypervisor or agent APIs (`do_update_cpu_resources`, `current_vcpu`, `online_cpu_mem`). Key changes: 1. `CpuResource::current_vcpu` updated from `u32` to `f32`. 2. `calc_cpu_resources` return type changed from `u32` to `f32`. 3. CPU hotplug logic now uses `f32` for the target vCPU count and applies `ceil()` before calling `hypervisor.resize_vcpu()`. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	84fd33c3bc	kata-types: Use fractional float for vCPU resource tracking Refactors `LinuxContainerCpuResources` and `LinuxSandboxCpuResources` to track calculated vCPU allocation using `f64` (fractional float) instead of `u64` (milliseconds). This ensures more precise resource calculation (`quota / period`) and aggregation by avoiding rounding errors inherent in millisecond-based integer tracking. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	0f04363ea8	tests: Disable CPU elasticity tests for nontee scenarios This commit updates the non-TEE tests to disable two specific test cases: `k8s-number-cpus.bats` and `k8s-sandbox-vcpus-allocation.bats`. These tests are designed to cover CPU elasticity/dynamic scaling capabilities. In the non-TEE scenario, we are enforcing the disabling of this capability by setting the default configuration to `static_sandbox_resource_mgmt=true`. Although the tests currently pass, allowing them to run is logically inconsistent with the intended non-TEE configuration. Therefore, we are disabling them for all non-TEE runtimes, specifically targeting: - `qemu-coco-dev` - `qemu-coco-dev-runtime-rs` This change ensures that our non-TEE CI accurately reflects the static resource management policy and prevents misleading test results. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	beaf44dd2e	tests: disable block volume test for s390 arch As runtime-rs doesn't support block device hotplug in s390 arch, with this fact, we just disable or skip the test when it is the s390. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	535ba589f4	runtime-rs: Enable elastic resource feature To support this feature, the corresponding Makefile item must be set when building, like this: `DEFSTATICRESOURCEMGMT_QEMU := false` Users who don't want this feature can set it to true via configuration.toml. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	28371dbec5	tests: Enable cloud-hypervisor and qemu-runtime-rs within the CI Enable the cpu hotplug tests within the k8s-number-cpus.bats for both cloud-hypervisor and qemu-runtime-rs. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	82a72b4564	tests: Enable cpu hotplug for dragonball and clh in vcpus allocation CPU hotplug is now supported in dragonball and clh; this commit enables the test in the CI. Fixes: #8660 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	6196d3d646	tests: Enable cpu hotplug tests in k8s-cpu-ns.bats Because of a previous failure in this test case, we chose to skip it. Now that CPU hotplug has been fixed, it's time to re-enable it. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	96bd13e85d	tests: Add support for qemu-runtime-rs The virtio-scsi driver is now supported, so the CI should be enabled. Fixes: #10373 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
dependabot[bot]	2137b1fa3a	build(deps): bump github.com/containernetworking/plugins in /src/runtime Bumps [github.com/containernetworking/plugins](https://github.com/containernetworking/plugins) from 1.7.1 to 1.9.0. - [Release notes](https://github.com/containernetworking/plugins/releases) - [Commits](https://github.com/containernetworking/plugins/compare/v1.7.1...v1.9.0) --- updated-dependencies: - dependency-name: github.com/containernetworking/plugins dependency-version: 1.9.0 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-12-10 16:10:24 +01:00
LandonTClipp	b50a73912d	runtime: Config test extension for IOMMUFDID Add additional cases for the IOMMUFDID method to check what happens when non-IOMMUFD paths are passed. The method should do the right thing. Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>	2025-12-10 15:46:28 +01:00
LandonTClipp	d5e4cf6b4d	runtime: Add test for ExecuteVFIODeviceAdd Copilot made a good point that we should have a test for this. Thus, this commit. Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>	2025-12-10 15:46:28 +01:00
LandonTClipp	137866f793	runtime: Allow QMP commands to be logged in debug level Logging the QMP commands gives us a lot of flexibility to troubleshoot issues with what is being sent to QEMU. Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>	2025-12-10 15:46:28 +01:00
LandonTClipp	a3b5764f67	runtime: Fix import cycle and add unit test for IOMMUFDID() An import cycle was introduced because of a mutual need for the constant that describes the prefix of IOMMUFD files. We need to extract this out into a higher-level package. Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>	2025-12-10 15:46:28 +01:00
LandonTClipp	09438fd54f	runtime: Add IOMMUFD Object Creation for QEMU QMP Commands The QMP commands sent to QEMU did not properly set up IOMMUFD objects in the codepath that handles VFIO device hot-plugging. This is mainly relevant in the Kubernetes use-case where the VFIO devices are not available when QEMU is first launched. Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>	2025-12-10 15:46:28 +01:00
Manuel Huber	cb8fd2e3b1	runtime: gpu: Skip CDI annos for pause container The pause container does not need CDI annotations, these are only intended for workload containers. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-10 13:26:04 +01:00
Fabiano Fidêncio	69a0ac979c	tests: Adjust install_bats() The function assumes that the runner is a Ubuntu machine, which so far has been true as part of our CI. However, the new ARM runner is running on Debian, and those mirror additions would simply break. With this in mind, for any distro that's not ubuntu, let's just make sure to inform the owner of the system to have bats already installed as part of the environment provided. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-10 12:05:04 +01:00
Fabiano Fidêncio	406f6b1d15	Revert "tests: Add workaround to override CDI files" This reverts commit `5a81b010f2`, as we now have all the infrastructure properly set up as part of our CI node. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-09 23:18:11 +01:00
Fabiano Fidêncio	3db7b88eff	tests: remove containerd guest pull stability tests Remove the existing containerd guest pull stability tests workflow as we're going to rebuild all the VMs used for testing and introduce new, more focused stability tests for nydus-snapshotter. The new tests will be added soon, as part of another PR. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-08 16:29:11 +01:00
Fabiano Fidêncio	5b6a2d25bc	podOverhead: Reduce memory overhead for GPU runtime classes Now that we've bumped to QEMU 10.2.0-rc1, we can take advantage of a fix that's present there, which fixes the double memory allocation for the cases where GPUs are being cold-plugged. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-06 00:16:43 +01:00
Fabiano Fidêncio	71f78cc87e	tests: cc: gpu: Lower the amount of memory required by the pods We've made the pods require a ridiculous amount of memory, just for the sake of getting them running. Now that those are running, tests are passing, CI is required, let's work to lower the amount of memory needed as everything else is working as expected. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-06 00:16:43 +01:00
Dan Mihai	965ad10cf2	tests: k8s: tests_common.sh local modification Clean-up shellcheck warnings: SC2030 (info): Modification of cmd_out is local (to subshell caused by (..) group). SC2031 (info): cmd_out was modified in a subshell. That change might be lost. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-12-06 00:16:23 +01:00
Dan Mihai	8199171cc4	tests: k8s: tests_common.sh braces around variables Clean-up shellcheck warnings: SC2250 (style): Prefer putting braces around variable references even when not strictly required. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-12-06 00:16:23 +01:00
Fabiano Fidêncio	5a81b010f2	tests: Add workaround to override CDI files Let's add a simple backup and restore logic for the CDI configuration file nvidia.com-pgpu.yaml in the k8s-nvidia-*.bats and k8s-confidential-attestation.bats test files. Although not optimal, this is a temporary workaround needed until NVIDIA releases what's needed for the GPU Operator to properly deal with cold plugged devices for the Confidential Containers cases, which is work in progress right now. After that's released, we can revert/drop this patch. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-05 18:58:35 +01:00
Fabiano Fidêncio	aaa67df4dd	versions: Bump experimental {tdx,snp} QEMU Let's bump experimental {tdx,snp} QEMU to the tags created today in the Confidential Containers repo, which match QEMU 10.2.0-rc1. This bump is especially beneficial for us, as we can get rid of QEMU's double memory allocation when cold plugging a GPU. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-05 18:58:35 +01:00
Zvonko Kaiser	f8ad17499d	gpu: VFIO handling container vs sandbox If the sandbox has cold-plugged an IOMMUFD device but the device plugin sends us a /dev/vfio/<NUM> device, we need to check whether the IOMMUFD device and the VFIO device are the same. We have the sibling.BDF; we now need to extract the BDF of the devPath, which is either /dev/vfio/<NUM> or /dev/vfio/devices/vfio<NUM>. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-12-05 16:53:31 +01:00
Zvonko Kaiser	147e9f188e	Merge pull request #12080 from manuelh-dev/mahuber/cc-gpu-ci-attestation tests: nvidia: cc: Add attestation test	2025-12-05 09:31:57 -05:00
Steve Horsman	2f1b98c232	Merge pull request #12197 from stevenhorsman/logrus-1.9.3-bump version: Bump sirupsen/logrus	2025-12-05 14:18:50 +00:00
Manuel Huber	e5861cde20	tests: use Authorization when GH_TOKEN is set Same as for other uses of GH_TOKEN, use it when set in order to avoid rate limiting issues. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 14:08:43 +01:00
stevenhorsman	9eba559bd6	version: Bump sirupsen/logrus Bump the github.com/sirupsen/logrus version to 1.9.3 across our components where it is back-level to bring us up-to-date and resolve high severity CVE-2025-65637 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-05 11:12:04 +00:00
Manuel Huber	34efa83afc	tests: nvidia: cc: Add attestation test Add the attestation bats test case to the NVIDIA CI and provide a second pod manifest for the attestation test with a GPU. This will enable composite attestation in a subsequent step. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 11:48:55 +01:00
Manuel Huber	e31d592a0c	versions: Bump coco-trustee Bump to pull in a fix for composite attestation with GPUs. The new commit ID corresponds to the fix (change for default GPU policy), currently being the top commit of the main branch. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 11:48:55 +01:00
Manuel Huber	73dfa9b9d5	versions: Bump coco-guest-components Bump to pull in a fix for NVIDIA CC GPU attestation. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 11:48:55 +01:00
Manuel Huber	116a72ad0d	tests: cc: Fix command evaluation This brings two fixes: - use the test_key variable to check against the aatest value. - properly check the run command invocation (run w/o bash does not seem to like the pipe, which leads to ALWAYS evaluating the status result to 1). With this, the deny-all test would ALWAYS succeed regardless of whether aatest was actually returned or not. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 11:48:55 +01:00
Manuel Huber	23675c784b	tests: cc: Reset default policy When running these tests repeatedly locally, the default policy is not being reset after the test completes, then subsequent runs fail. Similar to k8s-sealed-secrets.bats, we set the default policy in an if condition. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 11:48:55 +01:00
Manuel Huber	f70c3adaf1	tests: cc: Add kbs_set_gpu0_resource_policy This allows setting a GPU0 resource policy, enabling GPU attestation tests to not use the default resource policy. For now, the policy requires attestation's ear status to not be contraindicated. In a future change we will require this to be affirming once our CI runners' vBIOS version is properly configured. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 11:48:55 +01:00
Manuel Huber	c2d1e2dcc9	tests: cc: Add is_confidential_gpu_hardware This enables attestation tests to figure out whether composite attestation with a GPU can be executed. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 11:48:55 +01:00
Manuel Huber	53e94df203	tests: nvidia: cc: add SUPPORTED_TEE_HYPERVISORS Add the NVIDIA TEE hypervisors. With this, attestation tests can be run against the NVIDIA handlers, for instance. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 11:48:55 +01:00
Fabiano Fidêncio	923f97bc66	rootfs: Temporarily revert "gpu: Handle root_hash.txt correctly" This reverts commit `e4a13b9a4a`, as it caused some issues with the GPU workflows. Reverting it is better, as it unblocks other PRs. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-05 11:47:37 +01:00
Steve Horsman	d27af53902	Merge pull request #12185 from stevenhorsman/runtime-rs-required-checks ci: Add qemu-runtime-rs AKS tests to required	2025-12-05 10:43:25 +00:00
stevenhorsman	403de2161f	version: Update golang to 1.24.11 Needed to fix: ``` Vulnerability #1: GO-2025-4155 Excessive resource consumption when printing error string for host certificate validation in crypto/x509 More info: https://pkg.go.dev/vuln/GO-2025-4155 Standard library Found in: crypto/x509@go1.24.9 Fixed in: crypto/x509@go1.24.11 Vulnerable symbols found: #1: x509.HostnameError.Error ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-04 22:50:07 +01:00
Steve Horsman	425f4ffc8d	Merge pull request #12124 from zvonkok/nvidia-measured-rootfs gpu: Measured rootfs	2025-12-04 14:54:11 +00:00
Hyounggyu Choi	1dd3426adc	tests: Extend vfio-ap test for runtime-rs vfio-ap passthrough has been introduced for runtime-rs, requiring that the existing test verify this new functionality. This commit adds: - containerd config specific to runtime-rs - extensions to the existing test functions to cover vfio-ap Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-12-04 15:05:23 +01:00
Hyounggyu Choi	aa326fb9b8	tests: Remove usage of crictl for vfio-ap `crictl` is not used any more after #10767. Let's clean up all places where the tool is used. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-12-04 15:05:23 +01:00
Hyounggyu Choi	41d61f4b16	runtime-rs: Enable VFIO-AP passthrough The following have been made for the enablement: 1. Make `MediatedPci` and `MediatedAp` in `VfioDeviceType` 2. Make HostDevice without BDF for `MediatedAp` 3. Add `CCW` to VFioBusMode and set it to VfioConfig as `bus_type` 4. Return `vfio-ap` driver type for `CCW` bus type 5. Set `bus_mode` for `VfioDevice` based on `bus_type` 6. Set `vfio-ap` to the agent device's `field_type` 7. Prepare a different argument for `vfio-ap` for QMP command 8. Set None to all PCI relevant fields Please keep in mind that `vfio-ap` does not belong to any types of port topologies like PCI (e.g., root or switch) because devices on s390x are controlled by CCW. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-12-04 15:05:23 +01:00
Hyounggyu Choi	cb5b1384ca	runtime-rs: Introduce `uses_native_ccw_bus()` Until now, we relied on `VMROOTFSDRIVER` to determine whether a system uses a native CCW bus. However, this method is not canonical and can be error-prone depending on the configuration. This commit introduces a new function that checks for the presence of CCW bus infrastructure in sysfs and verifies that native mainframe drivers are available. It replaces all previous uses of the old detection method. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-12-04 15:05:23 +01:00
Steve Horsman	f673f33e72	Merge pull request #12172 from fidencio/topic/gatekeeper-mark-nvidia-jobs-as-required gatekeeper: Mark NVIDIA CC GPU test as required	2025-12-04 12:48:57 +00:00
stevenhorsman	112810c796	ci: Add qemu-runtime-rs AKS tests to required Add the small and normal variants of the qemu-runtime-rs tests to the required-tests list now that they are stable. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-04 11:15:43 +00:00
Fabiano Fidêncio	c505afb67c	gatekeeper: Mark NVIDIA CC GPU test as required It's been stable for the past 10 nightlies, no retries. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-04 11:14:25 +00:00
Steve Horsman	635f7892d5	Merge pull request #12190 from BbolroC/mark-s390x-jobs-as-nonrequired gatekeeper: Drop all s390x e2e tests temporarily	2025-12-04 11:10:46 +00:00
Steve Horsman	2a6ebc556f	Merge pull request #12175 from kata-containers/mahuber/gpu-ci-genpolicy ci: nvidia: Install kata-artifacts	2025-12-04 09:23:32 +00:00
Hyounggyu Choi	b6ef7eb9c3	gatekeeper: Drop all s390x e2e tests temporarily This commit marks three s390x CI jobs as non-required. Please check out the details at #12189. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-12-04 08:05:14 +01:00
Steve Horsman	10b0717cae	Merge pull request #12179 from stevenhorsman/nginx-test-image-by-digest tests: Switch nginx test image ref to digest	2025-12-03 13:39:07 +00:00
Hyounggyu Choi	22778547b2	runtime-rs: Fix panic when OCI spec annotations are missing An oci-spec can be passed to the runtime without annotations (e.g., `ctr run`). In this case, runtime panics with: ``` src/runtime-rs/crates/runtimes/src/manager.rs:391: called `Option::unwrap()` on a `None` value ``` This commit checks if the annotation is None, and instantiates the hashmap as an empty map if it is missing. It also adds a None check for `netns`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-12-03 13:07:39 +01:00
Hyounggyu Choi	ba78fb46fb	runtime-rs: Configure protection devices when confidential_guest is set Currently, the protection device configuration is constructed automatically even if `confidential_guest` is not set. This commit puts a condition to check the flag and allows the construction accordingly. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-12-03 13:07:39 +01:00
Zvonko Kaiser	e4a13b9a4a	gpu: Handle root_hash.txt correctly Updates to the shim-v2 build and the binaries.sh script, making sure that both variants "confidential" AND "nvidia-gpu-confidential" are handled. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-12-02 19:56:19 +01:00
Steve Horsman	d8405cb7fb	Merge pull request #11983 from stevenhorsman/toolchain-guidance doc: Document our Toolchain policy	2025-12-02 15:47:54 +00:00
stevenhorsman	b9cb667687	doc: Document our Toolchain policy Create an initial version of our toolchain policy as agreed in Architecture Committee meetings and the PTG Fixes: #9841 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-02 14:28:29 +00:00
stevenhorsman	79a75b63bf	tests: Switch nginx test image ref to digest As tags are mutable and digests are not, let's pin our image by digest to give our CI a better chance of stability Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-02 13:02:50 +00:00
stevenhorsman	5c618dc8e2	tests: Switch nginx images to use version.yaml details - Swap out the hard-coded nginx registry and versions for reading the test image details from version.yaml, which can also ensure that the quay.io mirror is used rather than the docker hub versions, which can hit pull limits - Try setting imagePullPolicy Always to fix issues with the arm CI Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-02 10:04:09 +01:00
Manuel Huber	3427b5c00e	ci: nvidia: Install kata-artifacts In preparation for Kata agent security policy testing, installing Kata tools to provide genpolicy. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-01 17:59:19 +00:00
Manuel Huber	4355af7972	kata-deploy: Fix binary find install_tools_helper Using make tarball targets for tools locally, binaries may exist for both debug and release builds. In this case, cryptic errors are shown as we try to install multiple binaries. This change requires exactly one binary to be found and errors out in other cases. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-01 09:29:24 -08:00
Manuel Huber	5a5c43429e	ci: nvidia: remove kubectl_retry calls When tests regress, the CI wait time can increase significantly with the current kubectl_retry attempt logic. Thus, align with other tests and remove kubectl_retry invocations. Instead, rely on proper timeouts. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-28 19:00:57 +01:00
Fabiano Fidêncio	e3646adedf	gatekeeper: Drop SEV-SNP from required The SEV-SNP machine is failing due to nydus not being deployed on the machine. We cannot easily contact the maintainers due to the US holidays, and I think this should become a criterion for a machine not to be added as required again (different regions coverage). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-28 12:46:07 +01:00
Steve Horsman	8534afb9e8	Merge pull request #12150 from stevenhorsman/add-gatekeeper-triggers ci: Add two extra gatekeeper triggers	2025-11-28 09:34:41 +00:00
Zvonko Kaiser	9dfa6df2cb	agent: Bump CDI-rs to latest Latest version of container-device-interface is v0.1.1 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-11-27 22:57:50 +01:00
Fabiano Fidêncio	776e08dbba	build: Add nvidia image rootfs builds So far we've only been building the initrd for the nvidia rootfs. However, we're also interested in having the image being used for a few use-cases. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-27 22:46:07 +01:00
stevenhorsman	531311090c	ci: Add two extra gatekeeper triggers We hit a case that gatekeeper was failing due to thinking the WIP check had failed, but since it ran the PR had been edited to remove that from the title. We should listen to edits and unlabels of the PR to ensure that gatekeeper doesn't get outdated in situations like this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-27 16:45:04 +00:00
Zvonko Kaiser	bfc9e446e1	kernel: Add NUMA config Add per arch specific NUMA enablement kernel settings Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-11-27 12:45:27 +01:00
Steve Horsman	c5ae8c4ba0	Merge pull request #12144 from BbolroC/use-runs-on-to-choose-runners GHA: Use `runs-on` only for choosing proper runners	2025-11-27 09:54:39 +00:00
Fabiano Fidêncio	2e1ca580a6	runtime-rs: Only QEMU supports templating We can remove the checks and default values attribution from all other shims. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-27 10:31:28 +01:00
Alex Lyn	df8315c865	Merge pull request #12130 from Apokleos/stability-rs tests: Enable stability tests for runtime-rs	2025-11-27 14:27:58 +08:00
Fupan Li	50dce0cc89	Merge pull request #12141 from Apokleos/fix-nydus-sn tests: Properly handle containerd config based on version	2025-11-27 11:59:59 +08:00
Fabiano Fidêncio	fa42641692	kata-deploy: Cover all flavours of QEMU shims with multiInstallSuffix We were missing all the runtime-rs variants. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-26 17:44:16 +01:00
Fabiano Fidêncio	96d1e0fe97	kata-deploy: Fix multiInstallSuffix for NV shims When using the multiInstallSuffix we must be cautious about using the shim name, as qemu-nvidia-gpu* doesn't actually have a matching QEMU itself, but should rather be mapped to: qemu-nvidia-gpu -> qemu, qemu-nvidia-gpu-snp -> qemu-snp-experimental, qemu-nvidia-gpu-tdx -> qemu-tdx-experimental Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-26 17:44:16 +01:00
Markus Rudy	d8f347d397	Merge pull request #12112 from shwetha-s-poojary/fix_list_routes agent: fix the list_routes failure	2025-11-26 17:32:10 +01:00
Steve Horsman	3573408f6b	Merge pull request #11586 from zvonkok/numa-qemu qemu: Enable NUMA	2025-11-26 16:28:16 +00:00
Steve Horsman	aae483bf1d	Merge pull request #12096 from Amulyam24/enable-ibm-runners ci: re-enable IBM runners for ppc64le and s390x	2025-11-26 13:51:21 +00:00
Steve Horsman	5c09849fe6	Merge pull request #12143 from kata-containers/topic/add-report-tests-to-workflows workflows: Add Report tests to all workflows	2025-11-26 13:18:21 +00:00
Steve Horsman	ed7108e61a	Merge pull request #12138 from arvindskumar99/SNPrequired CI: readding SNP as required	2025-11-26 11:33:07 +00:00
Amulyam24	43a004444a	ci: re-enable IBM runners for ppc64le and s390x This PR re-enables the IBM runners for ppc64le/s390x build jobs and s390x static checks. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2025-11-26 16:20:01 +05:30
Hyounggyu Choi	6f761149a7	GHA: Use `runs-on` only for choosing proper runners Fixes: #12123 `include` in #12069, introduced to choose a different runner based on component, leads to another set of redundant jobs where `matrix.command` is empty. This commit gets back to the `runs-on` solution, but makes the condition human-readable. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-11-26 11:35:30 +01:00
Alex Lyn	4e450691f4	tests: Unify nydus configuration to containerd v3 schema Containerd configuration syntax (`config.toml`) varies across versions, requiring per-version logic for fields like `runtime`. However, testing confirms that containerd LTS (1.7.x) and newer versions fully support the v3 schema for the nydus remote snapshotter. This commit changes the previous containerd v1 settings in `config.toml`. Instead, it introduces a unified v3-style configuration for nydus, which is valid for LTS and active containerd versions. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-26 17:58:16 +08:00
stevenhorsman	4c59cf1a5d	workflows: Add Report tests to all workflows In the CoCo tests jobs @wainersm created a report tests step that summarises the jobs, so they are easier to understand and get results for. This is very useful, so let's roll it out to all the bats tests. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-26 09:28:36 +00:00
shwetha-s-poojary	4510e6b49e	agent: fix the list_routes failure relax list_routes tests so not every route requires a device Signed-off-by: shwetha-s-poojary <shwetha.s-poojary@ibm.com>	2025-11-25 20:25:46 -08:00
Xuewei Niu	04e1cf06ed	Merge pull request #12137 from Apokleos/fix-netdev-mq runtime-rs: fix QMP 'mq' parameter type in netdev_add to boolean	2025-11-26 11:49:33 +08:00
Alex Lyn	ebe084e093	Merge pull request #12122 from fidencio/topic/configs-do-no-have-commented-out-options runtimes: config: Do NOT have commented fields	2025-11-26 10:33:32 +08:00
Alex Lyn	e9f50f6e71	Merge pull request #12116 from manuelh-dev/mahuber/ci-openvpn-policy-v2 policy: ci: enable security policy for openvpn test case	2025-11-26 09:35:43 +08:00
Fabiano Fidêncio	e859537c74	runtimes: config: Do NOT have commented fields In order to have a better way to set things up using a toml editor, we should take the containerd approach and actually have everything uncommented. This will help us to unify how we deal with such values in the future from the kata-deploy POV. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-25 19:26:56 +01:00
Arvind Kumar	c085011a0a	CI: readding SNP as required Reenabling the SNP CI node as a required test. Signed-off-by: Arvind Kumar <arvinkum@amd.com>	2025-11-25 17:05:01 +00:00
Fabiano Fidêncio	5ca4f2b9ff	runtimes: annotations: Fix kernel param handling We need to ensure that we do not blindly append nor blindly override the kernel parameters set by default, but rather modify the values in case they exist, and append in case they do not. Now we're actually making golang and rust runtime behave the same, as so far they were behaving differently, each version wrong in its own way. :-p. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-25 16:04:52 +01:00
Zvonko Kaiser	45cce49b72	shellcheck: Fix [] [[]] SC2166 This file is a beast so doing one shellcheck fix after the other. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-11-25 15:46:16 +01:00
Zvonko Kaiser	b2c9439314	qemu: Update tools/packaging/static-build/qemu/build-qemu.sh This nit was introduced by `227e717` during the v3.1.0 era. The + sign from the bash substitution ${CI:+...} was copied by mistake. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-11-25 15:46:09 +01:00
Zvonko Kaiser	2f3d42c0e4	shellcheck: build-qemu.sh is clean Make shellcheck happy Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-11-25 15:46:07 +01:00
Zvonko Kaiser	f55de74ac5	shellcheck: build-base-qemu.sh is clean Make shellcheck happy Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-11-25 15:45:49 +01:00
Zvonko Kaiser	040f920de1	qemu: Enable NUMA support Enable NUMA support with QEMU. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-11-25 15:45:00 +01:00
Alex Lyn	de9308419b	Merge pull request #12135 from microsoft/danmihai1/init-data agent: allow disabling detect_initdata_device	2025-11-25 21:07:57 +08:00
Alex Lyn	34d3bd18bc	Merge pull request #12132 from fidencio/topic/runtime-classes-fix-nvidia-gpu-podOverhead runtimeclasses: Fix nvidia-gpu podOverhead	2025-11-25 20:23:07 +08:00
Alex Lyn	7f4d856e38	tests: Enable nydus tests for qemu-runtime-rs We need to enable nydus tests for qemu-runtime-rs, and this commit aims to do it. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-25 17:45:57 +08:00
Alex Lyn	98df3e760c	runtime-rs: fix QMP 'mq' parameter type in netdev_add to boolean QEMU netdev_add QMP command requires the 'mq' (multi-queue) argument to be of boolean type (`true` / `false`). In runtime-rs the virtio-net device hotplug logic currently passes a string value (e.g. "on"/"off"), which causes QEMU to reject the command: ``` Invalid parameter type for 'mq', expected: boolean ``` This patch modifies `hotplug_network_device` to insert 'mq' as a proper boolean value of `true`. This fixes sandbox startup failures when multi-queue is enabled. Fixes #12136 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-25 17:34:36 +08:00
Alex Lyn	23393d47f6	tests: Enable stability tests for qemu-runtime-rs on nontee Enable the stability tests for qemu-runtime-rs CoCo on non-TEE environments Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-25 16:18:37 +08:00
Alex Lyn	f1d971040d	tests: Enable run-nerdctl-tests for qemu-runtime-rs Enable nerdctl tests for qemu-runtime-rs Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-25 16:14:50 +08:00
Alex Lyn	c7842aed16	tests: Enable stability tests for runtime-rs The previous setup did not include qemu-runtime-rs, so we enable it in this commit. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-25 16:12:12 +08:00
Alex Lyn	aadf1d6f71	Merge pull request #11932 from Apokleos/enhance-blk-params runtime-rs: Allow configuration of virtio block queue parameters	2025-11-25 15:24:12 +08:00
Dan Mihai	22d60a36c0	agent: allow disabling detect_initdata_device Allow users to build the Kata Agent using INIT_DATA=no to disable the detect_initdata_device() code loop and associated debug log output. Future additional improvements related to Init Data are tracked by #11532. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-25 02:44:28 +00:00
Fabiano Fidêncio	bb56a2e4d9	runtimeclasses: Fix nvidia-gpu podOverhead On `69c4fc4e76`, I've mistakenly changed the nvidia-gpu podOverhead while I should only have changed the TEE nvidia-gpu ones. Let's move it back to its original value. Reported-by: Joji Mekkattuparamban <jojim@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-24 21:43:29 +01:00
Zvonko Kaiser	55489818d6	gpu: TDX kernel param cleanup This setting is not needed anymore with Ubuntu 25.10 and the newest QEMU releases for TDX by Ubuntu. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-11-24 15:49:16 +01:00
Steve Horsman	e1e370091c	Merge pull request #12128 from fidencio/topic/kata-deploy-nfd-adjust-runtime-classe kata-deploy: nfd: Patch TEE runtimeclasses when needed	2025-11-24 14:05:43 +00:00
Steve Horsman	d437f875aa	Merge pull request #12126 from zvonkok/cold-plug-cleanup gpu: Cleanup Makefile	2025-11-24 14:01:49 +00:00
Zvonko Kaiser	77089fe5b3	Merge pull request #12115 from nheinemans-asml/main Kata-deploy: Add tolerations to daemonset and cleanup job	2025-11-24 09:00:42 -05:00
Manuel Huber	331515e1b8	ci: enable security policy for openvpn test With issue 11777 being resolved, this commit enables openvpn policy testing. The remaining work on the security policy required to successfully run this test case was to enable UDP ports for Service kinds and to use the mount path's last component instead of the volume name to construct the expected storage source path. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-23 17:23:43 +00:00
Manuel Huber	4f32816ea3	policy: Use mount path instead of volume name Use the mount path's last component instead of the volume name to construct the expected storage source path. Example: Name of a volumeMount is 'openvpn-config' and its mountPath is '/etc/openvpn/'. Without this change, we use 'openvpn-config' to calculate the expected storage source path. However, we need to use 'openvpn', because the shim uses the basename of the destination path as the source suffix and not the volume name. For reference, see the 'ShareFile' function in 'fs_share_linux.go', where the filename variable uses 'filepath.Base(m.Destination)'. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-23 17:23:43 +00:00
Manuel Huber	e4123a9848	policy: support UDP based Service types Add support for Service kinds that use the UDP protocol for their ports. An example is the openvpn-server-service.yaml file that is part of the openvpn CI test. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-23 17:23:43 +00:00
Fabiano Fidêncio	d0f3eb935e	kata-deploy: nfd: Patch TEE runtimeclasses when needed We've added logic to properly do the book keeping of the TEE keys when using NFD AND creating the runtime classes. However, we need to also take into consideration the case where the runtimeclasses are being created by the helm template, and in that case we just update what helm has deployed. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-23 10:27:52 +01:00
Zvonko Kaiser	dce207397c	gpu: Cleanup Makefile Some VARS were introduced but not cleaned up with the recent cold-plug PR, doing this now Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-11-21 22:03:34 +00:00
Zvonko Kaiser	8afcdae31f	Merge pull request #12092 from manuelh-dev/mahuber/cc-gpu-ci-smi-srs tests: nvidia: cc: Remove nvrc.smi.srs=1 parameter	2025-11-21 08:26:13 -05:00
Steve Horsman	37dd055283	Merge pull request #12090 from stevenhorsman/required-tests-update-14-nov-2025 Required tests update 14 nov 2025	2025-11-21 12:05:05 +00:00
nheinemans-asml	ef9d4e8b0d	kata-deploy: Add tolerations value to kata-deploy This allows the daemonset and cleanup job to run on tainted nodes. fixes #12114 Signed-off-by: nheinemans-asml <nick.heinemans@asml.com> Signed-off-by: nheinemans-asml <97238218+nheinemans-asml@users.noreply.github.com>	2025-11-21 09:49:47 +01:00
Manuel Huber	dfc229f51e	tests: nvidia: cc: Remove nvrc.smi.srs=1 parameter Remove the nvrc.smi.srs=1 parameter from the kernel command line. In CC use cases, the attestation agent is expected to set the GPU ready state. For the CUDA vectorAdd case where attestation agent is not being used, we set the ready state by adding the kernel command line parameter through an annotation. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-21 09:35:05 +01:00
Manuel Huber	6c6fc50aa5	tests: nvidia: cc: allow-all policy and init-data Add an allow-all policy for the CC GPU tests and ensure the init-data device is being created (hypervisor annotations). Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-21 09:24:15 +01:00
Manuel Huber	7e20118c8e	tests: nvidia: move secret definitions to bottom The add_allow_all_policy_to_yaml in tests_common.sh needs some improvements so that this function can support pod manifests with different resource kinds. For now, moving the Secret definition to the bottom so that we can create a default policy for the Pod. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-21 09:24:15 +01:00
Manuel Huber	ffd5443637	tests: nvidia: adapt is_aks_cluster The qemu-nvidia-gpu handlers should not cause is_aks_cluster to return 1. Otherwise, CI logic will assume these hypervisors run on AKS hosts, see the following message in CI w/o this change: INFO: Adapting common policy settings for AKS Hosts Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-21 09:24:15 +01:00
Manuel Huber	f2bdd12e5e	tests: nvidia: Check KATA_HYPERVISOR var Fail explicitly when a wrong KATA_HYPERVISOR variable is provided. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-21 09:24:15 +01:00
Xuewei Niu	bf967b81cc	runtime-rs: Bump cgroups-rs to v0.5.0 The new version fixes some issues with systemd version, path verification. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-11-21 09:06:26 +01:00
Fabiano Fidêncio	6b40b59861	tests: Reduce KBS deployment check flakiness We currently start a pod that does a `wget` to the KBS address, and fails after 5 seconds. By the time it fails and reports back, we can see that KBS is actually running, but the workflow failed as the checker failed. :-/ Let's give it more time for the KBS to show up, and the flakiness should go away. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-20 19:29:26 +01:00
Fabiano Fidêncio	35672ec5ee	tests: cc: Test authenticated images with force guest pull As this should simply work. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-20 19:02:15 +01:00
Fupan Li	b86e7ff42b	Merge pull request #12087 from jojimt/device_cold_plug shim: Support device cold plug with Kubernetes	2025-11-20 19:17:13 +08:00
Joji Mekkattuparamban	7dc292094c	shim: go vendor changes for cold plug support Vendor in the kubelet pod resources API. Signed-off-by: Joji Mekkattuparamban <jojim@nvidia.com>	2025-11-20 10:58:55 +01:00
Joji Mekkattuparamban	5aa184925a	shim: Support device cold plug with Kubernetes Utilize Kubelet's Pod Resource API to determine device allocations for the Pod during sandbox creation. Use CDI files to translate the device IDs to corresponding device paths and perform device injection. Fixes #12009 Signed-off-by: Joji Mekkattuparamban <jojim@nvidia.com>	2025-11-20 10:58:55 +01:00
Manuel Huber	477ca3980b	tests: nvidia: cc: Re-enable multi GPU test case Use the pod name variable so that kubectl wait finds the pod. Currently, kubectl waits for nvidia-nim-llama-3-2-nv-embedqa-1b-v2, not for nvidia-nim-llama-3-2-nv-embedqa-1b-v2-tee Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-20 10:05:46 +01:00
Zvonko Kaiser	89cd561340	Merge pull request #12059 from manuelh-dev/mahuber/bb-debug-v2 gpu: introduce a new devkit build flag to produce a rootfs for developers	2025-11-19 13:03:46 -05:00
Steve Horsman	8c6c31555a	Merge pull request #12111 from fidencio/topic/ci-fix-erofs-ci tests: k8s: Fix typo in authenticated tests	2025-11-19 16:08:48 +00:00
Manuel Huber	3966864376	gpu: introduce devkit build flag Introduce a new devkit parameter which will produce a rootfs without chiselling. This results in a larger rootfs with various packages and binaries being included, for instance, enabling the use of the debug console. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-19 15:50:03 +01:00
Manuel Huber	2c9e0f9f4f	gpu: add signed-by to package sources Pin to specific key. CUDA package sources in /etc/apt/sources.list.d already use a specific key. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-19 15:50:03 +01:00
Ruoqing He	54bfbf5687	build: Exclude tools from root workspace There are rust packages being cloned and built inside tools/packaging/kata-deploy/local-build/build folder, which may mislead those packages to think they are part of the kata root workspace. Exclude the directory to avoid that. Reported-by: Fabiano Fidêncio <ffidencio@nvidia.com> Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-11-19 15:49:25 +01:00
Fabiano Fidêncio	ae463642ed	tests: k8s: Fix typo in authenticated tests The person who introduced the check, someone named Fabiano Fidêncio, forgot a `$` in a variable assignment. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-19 11:59:59 +01:00
Steve Horsman	87b180383e	Merge pull request #11802 from kata-containers/dependabot/github_actions/oras-project/setup-oras-1.2.4 build(deps): bump oras-project/setup-oras from 1.2.2 to 1.2.4	2025-11-19 09:58:37 +00:00
dependabot[bot]	ede5ac9c2d	build(deps): bump the bit-vec group across 2 directories with 1 update Bumps the bit-vec group with 1 update in the /src/agent directory: [bit-vec](https://github.com/contain-rs/bit-vec). Bumps the bit-vec group with 1 update in the /src/tools/agent-ctl directory: [bit-vec](https://github.com/contain-rs/bit-vec). Updates `bit-vec` from 0.6.3 to 0.8.0 - [Changelog](https://github.com/contain-rs/bit-vec/blob/master/RELEASES.md) - [Commits](https://github.com/contain-rs/bit-vec/commits) Updates `bit-vec` from 0.6.3 to 0.8.0 - [Changelog](https://github.com/contain-rs/bit-vec/blob/master/RELEASES.md) - [Commits](https://github.com/contain-rs/bit-vec/commits) --- updated-dependencies: - dependency-name: bit-vec dependency-version: 0.8.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: bit-vec - dependency-name: bit-vec dependency-version: 0.8.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: bit-vec ... Signed-off-by: dependabot[bot] <support@github.com>	2025-11-19 10:43:25 +01:00
stevenhorsman	b75d90b483	ci: Comment out snp ci from required-tests The snp CI has not been required for a while and has recently been broken, so comment it out from the list of required jobs. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-19 09:39:36 +00:00
stevenhorsman	ae71921be2	ci: Update build-checks name in required-tests to update the required-tests to match. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-19 09:39:36 +00:00
stevenhorsman	112ed9bb46	ci: Comment out run-nydus from required-tests The run-nydus tests are not stable and blocking PRs, so make them non-required temporarily until they can be looked at Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-19 09:38:38 +00:00
Fupan Li	478a5ff693	Merge pull request #12109 from Apokleos/enable-cocodev-rs tests: Enable AUTO_GENERATE_POLICY for qemu-coco-dev-runtime-rs	2025-11-19 12:05:22 +08:00
Alex Lyn	1da225efc5	tests: Enable AUTO_GENERATE_POLICY for qemu-coco-dev-runtime-rs Enable auto-generate policy on cbl-mariner Hosts for qemu-coco-dev-runtime-rs if the user didn't specify an AUTO_GENERATE_POLICY value. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-19 10:44:03 +08:00
Alex Lyn	8d85548711	Merge pull request #12102 from Apokleos/rs-copyfile-devcgrp runtime-rs: Clear Linux.Resources.Devices completely and correct the guest path for container mount binding	2025-11-19 09:05:59 +08:00
Fabiano Fidêncio	8c02b5b913	tests: nvidia: cc: Temporarily skip multi GPU for nim tests We will re-enable this one later on once the changes to properly cold plug multi GPUs are merged. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 22:29:42 +01:00
Fabiano Fidêncio	69c4fc4e76	kata-deploy: Adjust podOverhead for GPU TEEs Let's just move the podOverhead to a gigantic value, as we do need pod sandboxes as big as that, and we've noticed QEMU being OOM killed with smaller overheads. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 22:29:42 +01:00
Fabiano Fidêncio	94ed4051b0	tests: nvidia: cc: Increase RAM for NIM pods Those need to pull the models inside the guest, and the guest has 50% of its memory "allowed" to be used as tmpfs, so, we gotta use the RAM that we have. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 22:29:42 +01:00
Fabiano Fidêncio	e5062a056e	tests: nvidia: cc: Adjust timeouts on NIM pods Timeout increases for confidential computing slowness: * livenessProbe: * initialDelaySeconds: 15 → 120 seconds * timeoutSeconds: 1 → 10 seconds * failureThreshold: 3 → 10 * readinessProbe: * initialDelaySeconds: 15 → 120 seconds * timeoutSeconds: 1 → 10 seconds * failureThreshold: 3 → 10 * startupProbe: * initialDelaySeconds: 40 → 180 seconds * timeoutSeconds: 1 → 10 seconds * failureThreshold: 180 → 300 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 22:29:42 +01:00
Fabiano Fidêncio	dee6f2666b	runtime: nvidia: Increase the guest pull timeout to 20 minutes Yes, we're dealing with a combination of large images and image-rs concurrent image layers being not optimal. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 22:29:42 +01:00
Fabiano Fidêncio	6be43b2308	tests: nvidia: Retry kubectl commands As with CoCo some of the commands may take longer, way longer than expected. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 22:29:42 +01:00
Fabiano Fidêncio	bb5bf6b864	tests: nvidia: nims: Use the current auths format for KBS We cannot use the same format used for docker, as it includes username and password, while what's expected when using Trustee does not. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 22:29:42 +01:00
Fabiano Fidêncio	92da54c088	tests: nvidia: cc: Enable NIM tests Now that we've bumped Trustee to a version that supports the NVIDIA remote verifier, let's re-enable the tests. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 22:29:42 +01:00
Steve Horsman	74254cba8f	Merge pull request #12106 from stevenhorsman/gatekeeper-paging-reduction ci: Adjust gatekeeper's job fetch	2025-11-18 14:08:26 +00:00
Fabiano Fidêncio	8eca0814bd	tests: Run authenticated tests with experimental_force_guest_pull As it should be supported. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 14:46:48 +01:00
Fabiano Fidêncio	5beb1af202	tests: Pass EXPERIMENTAL_FORCE_GUEST_PULL to the test Right now we have only been passing the env var to the deployment script, but we really need to pass it to the tests script as well. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 14:46:48 +01:00
Markus Rudy	638cad18ef	Merge pull request #11978 from burgerdev/genpolicy-test-refactor genpolicy: prepare integration tests for programmatic modification	2025-11-18 09:54:40 +01:00
stevenhorsman	9f0fea1e34	ci: Adjust gatekeeper's job fetch Try and reduce the page limit of each job request to avoid the chances of us tripping over github's 10s api limit. All credit to @burgerdev for the investigation and suggestion! Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-18 08:22:36 +00:00
Alex Lyn	6ceacee0b9	runtime-rs: Add queue_size and num_queues for block volumes Add the related block queue_size and num_queues in volumes based on block devices. This is very important for I/O performance. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-18 14:53:43 +08:00
Alex Lyn	30a9a8b4ec	runtime-rs: Add queue_size and num_queues for block device Add the queue_size and num_queues in block device config when the block device is handled. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-18 14:53:43 +08:00
Alex Lyn	9b0204a2de	runtime-rs: Set Clh's disk queue_size and num_queues Clh's previous disk queue_size and num_queues settings were hardcoded; they should be configurable with user-defined values. This commit addresses the issue by passing these settings. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-18 14:53:43 +08:00
Alex Lyn	f19c48505c	runtime-rs: Introduce queue_size and num_queues in BlockConfig Usually, we pass the related block config via BlockConfig, and to reach the goal of user-friendly setting queue_size and num_queues for users, the queue_size and num_queues are introduced in BlockConfig. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-18 14:53:43 +08:00
Alex Lyn	e958993348	kata-types: Introduce queue_size and num_queues within BlockDeviceInfo Add two fields of queue_size and num_queues in BlockDeviceInfo to allow users to set the related items via configurations Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-18 14:53:43 +08:00
Alex Lyn	780c45de23	runtime-rs: Add support for queue_size and num_queues within configurations Add related items for block device queue size and num queues in configurations, so that users can set them via configuration. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-18 14:53:43 +08:00
Steve Horsman	ac021e2ab9	Merge pull request #11563 from RuoqingHe/single-workspace build: Introduce root workspace for rust components	2025-11-18 06:36:18 +00:00
Alex Lyn	d071384bba	runtime-rs: Clear Linux.Resources.Devices completely The current implementation causes issues with the Agent Policy nontee CI tests, as Kata-Agent does not allow any configuration for `count(Linux.Resources.Devices) == 0`. This commit ensures that Linux.Resources.Devices, including all its values, is completely cleared from the OCI Runtime Specification before being passed to the Kata-Agent. This addresses the CI failure by enforcing the required empty state for the Devices cgroup configuration. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-18 13:40:09 +08:00
Xuewei Niu	ca8b3300d3	Merge pull request #11620 from zhangckid/indep_iothreads_upstream Runtime/QEMU: Introduce virtio-blk with iothreads and enable Indep iothreads framework	2025-11-18 11:08:51 +08:00
Alex Lyn	5982e66503	runtime-rs: Ensure unique guest path for container mount binding Previously, the CopyFile implementation attempted to reuse existing guest paths for subsequent containers within the same Pod. This prevented correct bind mounting of shared configurations (e.g., ConfigMaps, Service Accounts) into the later containers within a multi-container pod, as they lacked their own allocated guest path. This commit modifies the logic to create a unique guest path for every container that requires file propagation. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-18 11:03:26 +08:00
Fupan Li	f791be1abb	Merge pull request #12064 from Apokleos/policy-optional-path genpolicy: Make cpath compatible with both runtime-rs and runtime-go	2025-11-18 10:19:26 +08:00
Ruoqing He	e6b24cd789	build: Exclude crates with no workspace setup Crates with no workspace setup would think they are in the root workspace, which is not ready for them. Excluding them for now. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-11-18 01:39:48 +00:00
Ruoqing He	6068242bf1	build: Move dragonball to root workspace Move dragonball and all members of its workspace into the root workspace. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-11-18 01:39:48 +00:00
Ruoqing He	3fbe693658	build: Introduce root workspace for rust components Add Cargo.toml at repo root, and use this root workspace for as many Rust components of Kata Containers as possible. This would enable us to share a common Cargo.lock file, and reduce the noise from dependabot. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-11-18 01:39:48 +00:00
Steve Horsman	650ada7bcc	Merge pull request #12101 from stevenhorsman/release/3.23.0 release: Bump version to 3.23.0	2025-11-17 21:09:45 +00:00
stevenhorsman	70f1f4a3ac	release: Bump version to 3.23.0 Bump VERSION and helm-chart versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-17 19:27:25 +00:00
stevenhorsman	c47e8d0ab8	kata-ctl: update backtrace and local references Similar to #12075, bump-backtrace to 0.3.76 to remove the dependency on adler, which is unmaintained - contributing to mitigating RUSTSEC-2025-0056 As a side effect this brought in loads of other crate changes, which I think are due to it bumping the local dependencies that this package builds on. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-17 20:13:04 +01:00
stevenhorsman	d16620bae1	runk: update backtrace to 0.3.76 Similar to #12075, bump-backtrace to remove the dependency on adler, which is unmaintained - contributing to mitigating RUSTSEC-2025-0056 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-17 20:13:04 +01:00
stevenhorsman	0b259e4fcf	agent-ctl: update backtrace to 0.3.76 Similar to #12075, bump-backtrace to remove the dependency on adler, which is unmaintained - contributing to mitigating RUSTSEC-2025-0056 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-17 20:13:04 +01:00
stevenhorsman	4abf79f16f	genpolicy: update backtrace to 0.3.76 Similar to #12075, bump-backtrace to remove the dependency on adler, which is unmaintained - contributing to mitigating RUSTSEC-2025-0056 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-17 20:13:04 +01:00
stevenhorsman	4158d9a94a	runtime-rs: update flate2 & backtrace Similar to #12075, bump flate2 and backtrace to remove the dependency on adler, which is unmaintained - contributing to mitigating RUSTSEC-2025-0056 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-17 20:13:04 +01:00
stevenhorsman	fe10db233c	runtime-rs: Remove libbacktrace feature from backtrace This feature was removed in https://github.com/rust-lang/backtrace-rs/pull/615 which shows that the implementation was removed over two years ago, so get rid of this feature, so we can move to newer versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-17 20:13:04 +01:00
stevenhorsman	398e7987cd	dragonball: update flate2 & backtrace Similar to #12075, bump flate2 and backtrace to remove the dependency on adler, which is unmaintained - contributing to mitigating RUSTSEC-2025-0056 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-17 20:13:04 +01:00
Steve Horsman	04c7d11689	Merge pull request #12044 from lifupan/fix_update_interface runtime: fix the issue of update interface error	2025-11-17 14:45:36 +00:00
Fupan Li	763a0d8675	runtime: fix the issue of update interface error Since the network device hotplug is an asynchronous operation, it's possible that the hotplug operation has returned but the network device isn't ready in the guest yet, thus it's better to retry this operation until the device is ready in the guest. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-11-17 13:58:36 +01:00
Steve Horsman	b3eb794662	Merge pull request #12098 from stevenhorsman/csi-kata-direct-volume-xz-0.5.15-bump csi-kata-directvolume: Bump xz module	2025-11-17 12:47:28 +00:00
Fabiano Fidêncio	75996945aa	kata-deploy: try-kata-values.yaml -> values.yaml This makes the user experience better, as the admin can deploy Kata Containers without having to download / set up any additional file. Of course, if the admin wants something more specific, examples are provided. Tests and documentation are updated to reflect this change. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-17 12:16:17 +01:00
Alex Lyn	71a9ecf9f8	Merge pull request #12095 from lifupan/fix_vcpu_number runtime-rs: fix the issue of wrong vcpu number	2025-11-17 19:11:48 +08:00
stevenhorsman	502a3ce3b6	csi-kata-directvolume: Bump xz module Bump github.com/ulikunitz/xz to v0.5.15, to remediate vulnerability GO-2025-3922 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-17 10:20:50 +00:00
Markus Rudy	b771bb6ed3	genpolicy: log requests as jsonlines The current format of genpolicy request logs looks a bit like JSON, but it does not parse out of the box and needs post-processing with sed, for example. This commit changes the log format to jsonlines[1], which is basically newline-delimited compact JSON values. Compared to standard JSON, this allows streaming output. The resulting file can be converted and processed programmatically, for example with `jq -s`. The fields are also adjusted to match the field names of TestRequest, so that the logged requests can be used immediately in tests. [1]: https://jsonlines.org/ Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-11-17 09:01:00 +01:00
Markus Rudy	eb6cf025b3	genpolicy: format testcases.json and sort by key This should allow keeping future diffs minimal. The files were formatted with `jq -S`, which should be used after future updates to the test case files. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-11-17 09:01:00 +01:00
Markus Rudy	851f8258af	genpolicy: move testcase request type out of struct Storing the request type outside the request object has two benefits: * The request JSON passed to the Rego engine matches more closely what would be passed by the agent (no `type` field). * If we want to update the requests, it's easier to insert them into a dedicated field, rather than inserting them and amending the type field. This is a first step towards programmatic updates of testcase files. This commit also adds the 'Request' suffix to the test case enum, such that we can use the 'ep' input for allow_request directly. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-11-17 09:01:00 +01:00
zhangchen.kidd	914063bcdd	runtime: documentation: Add virtio-blk support iothread comments in docs Add comments describing the "EnableIOThreads" flag as a switch for the virtio-blk driver (based on IndepIOThreads). Signed-off-by: zhangchen.kidd <zhangchen.kidd@jd.com>	2025-11-17 15:55:03 +08:00
zhangchen.kidd	9128112e3d	runtime: qemu: Add Independent IOThread support for virtio-blk Make hotplugged virtio-blk devices attach to independent IOThread 0 by default when EnableIOThreads and IndepIOThreads are enabled. Signed-off-by: zhangchen.kidd <zhangchen.kidd@jd.com>	2025-11-17 15:55:03 +08:00
zhangchen.kidd	fea954df7a	runtime: qemu: qmp: Add iothread args for QMP ExecutePCIDeviceAdd QEMU already supports device_add with iothread args. Give Kata the ability to hotplug PCI devices with IOThreads. Currently, only QEMU is supported as the hypervisor; it is not clear whether this works for stratovirt. Signed-off-by: zhangchen.kidd <zhangchen.kidd@jd.com>	2025-11-17 15:55:03 +08:00
zhangchen.kidd	af203b7dee	runtime: qemu: introduce setup iothread function Move the original virtio-scsi iothread and the new independent iothread setup into a dedicated method for handling the related logic. Signed-off-by: zhangchen.kidd <zhangchen.kidd@jd.com>	2025-11-17 15:55:03 +08:00
zhangchen.kidd	d20712aa9e	runtime: qemu: Add comments for virtio-scsi iothread args In the current implementation, only virtio-scsi uses this iothread path. Signed-off-by: zhangchen.kidd <zhangchen.kidd@jd.com>	2025-11-17 15:55:03 +08:00
zhangchen.kidd	f9d4829e77	runtime: qemu: Add indep_iothreads for QEMU hypervisor toml Add indep_iothreads args for QEMU related configuration toml. The default value is 0. Signed-off-by: zhangchen.kidd <zhangchen.kidd@jd.com>	2025-11-17 15:55:03 +08:00
zhangchen.kidd	c3d3684f81	runtime: Introduce independent IOThreads framework Introduce an independent IOThread framework for Kata Containers. What is indep_iothreads: This new feature introduces a way to pre-allocate IOThreads for the QEMU hypervisor (other hypervisors may be able to support it too). Independent IOThreads enable IO to be processed in a separate thread, which generally improves the performance of each module by keeping it out of the QEMU main loop. Why indep_iothreads is needed: In the Kata Containers implementation, many devices rely on the hotplug mechanism. The real workload container may not share the same lifecycle as the VM, and it may need to hotplug/unplug new disks or other devices without destroying the VM. So we can keep the IOThreads with the VM as an IOThread pool (some devices need multiple iothreads for performance, like virtio-blk vq-mapping), and hotplugged devices can attach to/detach from the IOThreads according to business needs. At the same time, QEMU also supports "x-blockdev-set-iothread" to change iothreads (but this requires stopping the VM for data safety). Current QEMU has many devices that support iothread: virtio-blk, virtio-scsi, virtio-balloon, monitor, colo-compare, etc. How it works: Add a new item named "indep_iothreads" to the hypervisor struct in toml. The default value is 0, and it reuses the original "enable_iothreads" as the switch. If "indep_iothreads" != 0 and "enable_iothreads" = true, it will add qmp object -iothread indepIOThreadsPrefix_No when the VM starts up. The first user is virtio-blk, which attaches to indep_iothread_0 by default when iothread is enabled for virtio-blk. Thanks Chen Signed-off-by: zhangchen.kidd <zhangchen.kidd@jd.com>	2025-11-17 15:55:01 +08:00
Fupan Li	c74a2650e9	runtime-rs: fix the issue of wrong vcpu number Commit `1f95d9401b` (runtime-rs: change representation of default_vcpus from i32 to f32) changed default_vcpus to a floating-point value. When the vCPU number is less than 1.0, directly converting the floating-point number to an integer truncates it to 0. Therefore, it needs to be rounded up before being converted to an integer. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-11-17 10:09:51 +08:00
Alex Lyn	daca7b268b	genpolicy: Make cpath compatible with both runtime-rs and runtime-go Update the `cpath` variable in the policy template to support the optional `/passthrough` subpath used by runtime-rs. This ensures that mount source path validation works correctly for both runtime implementations. By changing `cpath` to include the `(?:/passthrough)?` regular expression fragment, we make the `/passthrough` segment optional. The updated `cpath`: `/run/kata-containers/shared/containers(?:/passthrough)?` This single regex pattern now correctly matches both: 1.`/run/kata-containers/shared/containers/<sandbox-id>/...` (runtime-go) 2.`/run/kata-containers/shared/containers/passthrough/<sandbox-id>/...` (runtime-rs) This elegantly resolves the compatibility issue without needing to add separate or conditional logic to the policy rules, making the policy more robust and maintainable. Fixes: #12063 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-17 09:36:19 +08:00
Fabiano Fidêncio	2e000129a9	kata-deploy: tests: Add example values files for easy Kata deployment Add three example values files to make it easier for users to try out different Kata Containers configurations: - try-kata.values.yaml: Enables all available shims - try-kata-tee.values.yaml: Enables only TEE/confidential computing shims - try-kata-nvidia-gpu.values.yaml: Enables only NVIDIA GPU shims These files use the new structured configuration format and serve as ready-to-use examples for common deployment scenarios. Also update the README.md to document these example files and how to use them. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
Fabiano Fidêncio	8717312599	tests: Migrate helm_helper to use new structured configuration Update the helm_helper function in gha-run-k8s-common.sh to use the new structured configuration format instead of the legacy env.* format. All possible settings have been migrated to the structured format: - HELM_DEBUG now sets root-level 'debug' boolean - HELM_SHIMS now enables shims in structured format with automatic architecture detection based on shim name - HELM_DEFAULT_SHIM now sets per-architecture defaultShim mapping - HELM_EXPERIMENTAL_SETUP_SNAPSHOTTER now sets snapshotter.setup array - HELM_ALLOWED_HYPERVISOR_ANNOTATIONS now sets per-shim allowedHypervisorAnnotations - HELM_SNAPSHOTTER_HANDLER_MAPPING now sets per-shim containerd.snapshotter - HELM_AGENT_HTTPS_PROXY and HELM_AGENT_NO_PROXY now set per-shim agent proxy settings - HELM_PULL_TYPE_MAPPING now sets per-shim forceGuestPull/guestPull settings - HELM_EXPERIMENTAL_FORCE_GUEST_PULL now sets per-shim forceGuestPull/guestPull The test helper automatically determines supported architectures for each shim (e.g., qemu-se supports s390x, qemu-cca supports arm64, qemu-snp/qemu-tdx support amd64, etc.) and applies per-shim settings to the appropriate shims based on HELM_SHIMS. Only HELM_HOST_OS remains in legacy env.* format as it doesn't have a structured equivalent yet. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
Fabiano Fidêncio	aa89fda7fc	kata-deploy: Document new structured configuration and deprecation Add comprehensive documentation for the new structured configuration format, including: - Migration guide from legacy env.* format - List of deprecated fields with removal timeline (2 releases) - Examples of the new structured format - Explanation of key benefits - Backward compatibility notes The documentation makes it clear that the legacy format is deprecated but will continue to work during the transition period. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
Fabiano Fidêncio	119893b8e8	kata-deploy: Add backward compatibility for legacy env.* configuration This commit adds backward compatibility support to ensure existing configurations using the legacy env.* format continue to work. The helper functions now check for legacy env.* values first, and only fall back to the new structured format if legacy values are not set. This allows for gradual migration without breaking existing deployments. Backward compatibility is maintained for: - env.shims, env.shims_* (per architecture) - env.defaultShim, env.defaultShim_* (per architecture) - env.allowedHypervisorAnnotations - env.snapshotterHandlerMapping_* (per architecture) - env.pullTypeMapping_* (per architecture) - env.agentHttpsProxy, env.agentNoProxy - env._experimentalSetupSnapshotter - env._experimentalForceGuestPull_* (per architecture) - env.debug Legacy env vars (SHIMS, DEFAULT_SHIM, etc.) are still set in the DaemonSet when using the old format to maintain full compatibility. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
Fabiano Fidêncio	ae3fb45814	kata-deploy: Introduce structured configuration format for shims This commit introduces a new structured configuration format for configuring Kata Containers shims in the Helm chart. The new format provides: - Per-shim configuration with enabled/supportedArches - Per-shim snapshotter, guest pull, and agent proxy settings - Architecture-aware default shim configuration - Root-level debug and snapshotter setup configuration All shims are disabled by default and must be explicitly enabled. This provides better type safety and clearer organization compared to the legacy env.* string-based format. The templates are updated to use the new structure exclusively. Backward compatibility will be added in a follow-up commit. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
Fabiano Fidêncio	e85d584e1c	kata-deploy: script: Fix FOR_ARCH handling As some of the global vars can be empty, we should actually check their _FOR_ARCH version instead. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
Fabiano Fidêncio	397289c67c	kata-deploy: script: Handle {https,no}_proxy per shim As we're making the values.yaml more user friendly, we actually have to handle the https_proxy and no_proxy entries per shim, instead of having this globally available, as this will only affect images being pulled inside the guest (as in, when using TEE variations of the shims). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
Fabiano Fidêncio	f62d9435a2	runtimeclasses: firecracker is not a valid one At least not for now, and it was mistakenly added to the list. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
nheinemans-asml	3380458269	kata-deploy: Add daemonsets to the RBAC Add missing rules which are necessary for dealing with daemonsets, as kata-deploy now checks for the NFD daemonset as part of its script. fixes #12083 Signed-off-by: nheinemans-asml <97238218+nheinemans-asml@users.noreply.github.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-14 17:16:58 +01:00
Simon Kaegi	716c55abdd	kernel: adds nft bridging and filtering support for IPv4 and IPv6 Adds a practical set of kernel config used by docker-in-docker and kind for network bridging and filtering. It also includes the matching IPv6 support to allow tools like kind that require IPv6 network policies to work out of the box. This support includes: - nftables reject and filtering support for inet/ipv4/ipv6 - Bridge filtering for container-to-container traffic - IPv6 NAT, filtering, and packet matching rules for network policies - VXLAN and IPsec crypto support for network tunneling - TMPFS POSIX ACL support for filesystem permissions The configs are organized across fragment files: - common/fs.conf: TMPFS ACL support - common/crypto.conf: IPsec/VXLAN crypto algorithms - common/network.conf: VXLAN, IPsec ESP, nftables bridge/ARP/netdev - common/netfilter.conf: IPv6 netfilter stack and nftables advanced features Fixes: #11886 Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>	2025-11-14 15:57:47 +01:00
Dan Mihai	5cc1024936	ci: k8s: AUTO_GENERATE_POLICY for coco-dev Re-enable AUTO_GENERATE_POLICY for coco-dev Hosts, unless PULL_TYPE is "experimental-force-guest-pull", or the caller specified a different value for AUTO_GENERATE_POLICY. Auto-generated Policy has been disabled accidentally and recently for these Hosts, by a GHA workflow change. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-14 15:53:34 +01:00
Dan Mihai	73ad83e1cc	genpolicy: update workaround for guest pull Don't skip anymore parsing the pause container image when using the recently updated AKS pause container handling - i.e. when pause_container_id_policy == "v2". This was the easiest CI fix for guest pull + new AKS given the current tests. When adding new UID/GID/AdditionalGids tests in the future, these workarounds might need additional updates. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-14 15:53:34 +01:00
Steve Horsman	7bcb971398	Merge pull request #12075 from burgerdev/genpolicy-archived-deps retire `adler` dependency	2025-11-14 14:51:47 +00:00
Steve Horsman	1d0d066869	Merge pull request #12069 from Amulyam24/static-checks-ppc github: run agent checks for Power on ppc64le instead of ubuntu-24.04-ppc64le	2025-11-14 10:18:37 +00:00
Markus Rudy	dd59131924	runtime-rs: update flate2 to 1.1.5 The update removes the deprecated adler crate from our dependencies. In addition, we're switching to the default backend (miniz_oxide), which is a pure Rust implementation and thus much more portable. The performance impact is negligible, because flate2 is only used for initdata decompression, which is limited to a couple of MiB anyway. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-11-14 11:11:44 +01:00
Markus Rudy	3949492f19	genpolicy: update flate2 to 1.1.5 The update removes the deprecated adler crate from our dependencies. In addition, we're switching to the default backend (miniz_oxide), which is a pure Rust implementation and thus much more portable. The performance impact is acceptable for a developer tool. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-11-14 11:10:29 +01:00
Steve Horsman	0ab71771ab	Merge pull request #11447 from kata-containers/runtime-rs-qemu-coco-dev-config Runtime rs qemu coco dev config	2025-11-13 19:12:57 +00:00
stevenhorsman	1ef3e3b929	ci: Switch gatekeeper auth header The GitHub API suggests that `Authorization: Bearer <YOUR-TOKEN>` is the way to set the auth token, but it also mentions that `token` should work, so it's unclear if this will help much, but it shouldn't harm. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-13 19:01:21 +01:00
stevenhorsman	b7abcc4c37	tests: Fix wildcard skip in k8s-cpu-ns The formatting wasn't quite right, so the `qemu-coco-dev-runtime-rs` hypervisor wasn't skipping this test Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-13 14:21:05 +00:00
Alex Lyn	bda6bbcad3	runtime-rs: Set `static_sandbox_resource_mgmt` to true within nontee Introduce a flag `DEFSTATICRESOURCEMGMT_COCO` for setting static sandbox resource management with default true. And then set it to the item of `static_sandbox_resource_mgmt` in configuration. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-13 14:18:43 +00:00
stevenhorsman	b51af53bc7	tests/k8s: call teardown_common in some policy tests The teardown_common will print the description of the running pods, kill them all and print the system's syslogs afterwards. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-13 14:18:43 +00:00
Alex Lyn	efc6aee4f6	runtime-rs: Support agent policy Support agent policy within runtime-rs. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-13 14:18:43 +00:00
stevenhorsman	79082171ca	workflows: Add Delete AKS cluster timeout When testing this branch, on several occasions the Delete AKS cluster step has hung for multiple hours, so add a timeout to prevent this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-13 14:18:43 +00:00
stevenhorsman	0335012824	tests/k8s: Enable tests for qemu-coco-dev-runtime-rs Add the runtime class to the non-tee tests and enable it to run in the test code Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-13 14:18:43 +00:00
stevenhorsman	a1ddd2c3dd	kata-deploy: Add kata-qemu-coco-dev-runtime-rs runtime class Add the runtime class and shim references for the new non-tee runtime-rs class Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-13 14:18:43 +00:00
Alex Lyn	64da581f6e	kata-types: Support create_container_timeout set within configuration Since it aligns with the create_container_timeout definition in runtime-go, we need to set the value in configuration.toml in seconds, not milliseconds. We must also convert it to milliseconds when the configuration is loaded for request_timeout_ms. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-13 14:18:43 +00:00
stevenhorsman	af2c2d9d00	runtime-rs: Add qemu-coco-dev-runtime-rs Create non-tee runtime class for runtime-rs qemu CoCo development without requiring TEE hardware. Based on the qemu-runtime-rs config, but with updated guest image, kernel and shared_fs Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-13 14:18:43 +00:00
Amulyam24	b32b54c4af	github: do not run agent checks for Power on ubuntu-24.04-ppc64le The new environment of Power runners for agent checks is causing two test case failures related to selinux and inode, which need further understanding and are most likely due to the environment change rather than the agent. Fall back to running agent checks on the original ppc64le self-hosted runners. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2025-11-13 15:56:43 +05:30
Gao Xiang	657c4406cd	runtime: Add preliminary support for EROFS native rwlayers So that the writable data will be written to a separate storage instead of tmpfs in the guest. Note that a cleaner way should use the new containerd custom mount type, but I don't have time for this for now. For more details, see: https://github.com/containerd/containerd/blob/v2.2.0/docs/snapshotters/erofs.md#quota-support Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2025-11-13 09:55:06 +01:00
Steve Horsman	92758a17fe	Merge pull request #12078 from kata-containers/switch-to-ubuntu-24.04-arm-runner workflows: Switch to ubuntu-24.04-arm runner	2025-11-12 16:35:52 +00:00
stevenhorsman	ba56a2c372	workflows: Switch to ubuntu-24.04-arm runner As the arm 22.04 runner isn't working at the moment, let's test the 24.04 version to see if that is better. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-12 15:37:09 +00:00
Fabiano Fidêncio	a04cdbc40f	tests: Enforce qemu-coco-dev for experimental_force_guest_pull The fact that we were not explicitly setting the VMM was leading to us testing with the default runtime class (qemu). :-/ Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-12 16:07:05 +01:00
Wainer Moschetta	e31313ce9e	Merge pull request #11030 from ldoktor/webhook2 tools.kata-webhook: Add support for only-filter	2025-11-12 11:21:23 -03:00
Hyounggyu Choi	2dec247a54	Merge pull request #12038 from lifupan/fix_smaller-memeory runtime-rs: fix the issue of hot-unplug memory smaller	2025-11-12 11:22:04 +01:00
dependabot[bot]	c715d8648c	build(deps): bump oras-project/setup-oras from 1.2.2 to 1.2.4 Bumps [oras-project/setup-oras](https://github.com/oras-project/setup-oras) from 1.2.2 to 1.2.4. - [Release notes](https://github.com/oras-project/setup-oras/releases) - [Commits](`5c0b487ce3...22ce207df3`) --- updated-dependencies: - dependency-name: oras-project/setup-oras dependency-version: 1.2.4 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2025-11-12 09:45:27 +00:00
Markus Rudy	2c8d0688f2	Merge pull request #12068 from katexochen/p/full-controllers genpolicy: support full DeploymentSpec, JobSpec; cleanup CronJobSpec	2025-11-12 10:35:38 +01:00
Fabiano Fidêncio	6d3c20bc45	riscv: Introduce its own nightly tests By doing this, those interested in RISC-V support can still have good visibility of its state, without the extra noise in our CI. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-12 09:46:17 +01:00
Zvonko Kaiser	d783e59b42	Merge pull request #12055 from fidencio/topic/coco-bump-trustee versions: Bump Trustee	2025-11-12 02:48:16 -05:00
dependabot[bot]	edacdcb0bc	build(deps): bump github.com/opencontainers/selinux in /src/runtime Bumps [github.com/opencontainers/selinux](https://github.com/opencontainers/selinux) from 1.12.0 to 1.13.0. - [Release notes](https://github.com/opencontainers/selinux/releases) - [Commits](https://github.com/opencontainers/selinux/compare/v1.12.0...v1.13.0) --- updated-dependencies: - dependency-name: github.com/opencontainers/selinux dependency-version: 1.13.0 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-11-11 23:15:40 +01:00
Steve Horsman	1954dfe349	Merge pull request #12071 from stevenhorsman/update-required-test-docker-and-stratovirt ci: Remove stratovirt & docker tests from required	2025-11-11 21:19:25 +00:00
Zvonko Kaiser	76e4e6bc24	Merge pull request #12061 from Apokleos/correct-unexpected-cap tests: Correct unexpected capability for policy failure test	2025-11-11 12:20:33 -05:00
Fabiano Fidêncio	d82eb8d0f1	ci: Drop docker tests We have had those tests broken for months. It's time to get rid of those. NOTE that we could easily revert this commit and re-add those tests as soon as we find someone to maintain and be responsible for such integration. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-11 17:02:02 +01:00
stevenhorsman	8b5df4d360	ci: Remove stratovirt & docker tests from required As stratovirt CI was removed in #12006 we should remove the jobs from required. Also the docker tests have been commented out for months, and we are considering removing them, so clean this file up. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-11 15:38:51 +00:00
Steve Horsman	4b33000c56	Merge pull request #12067 from Apokleos/fix-guest-emptydir runtime-rs: Fix several incorrect settings with guest empty dir.	2025-11-11 15:21:31 +00:00
Lukáš Doktor	ca91073d83	tools.kata-webhook: Add support for only-filter Sometimes it's hard to enumerate all blacklisted namespaces, so let's add a regular-expression-based only-filter to allow specifying namespaces that should be mutated. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-11-11 15:21:15 +01:00
dependabot[bot]	281f69a540	build(deps): bump github.com/containerd/containerd in /src/runtime Bumps [github.com/containerd/containerd](https://github.com/containerd/containerd) from 1.7.27 to 1.7.29. - [Release notes](https://github.com/containerd/containerd/releases) - [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md) - [Commits](https://github.com/containerd/containerd/compare/v1.7.27...v1.7.29) --- updated-dependencies: - dependency-name: github.com/containerd/containerd dependency-version: 1.7.29 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-11-11 14:23:47 +01:00
Paul Meyer	ec6896e96b	genpolicy: remove non-existing field from CronJobSpec There is no backoffLimit on CronJobSpec, also no additional fields. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-11-11 11:12:48 +01:00
Paul Meyer	258aed3cd3	genpolicy: support full JobSpec Based on https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.33/#job-v1-batch The JOB_COMPLETION_INDEX env will be set if completionMode is "indexed". Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-11-11 11:12:48 +01:00
Paul Meyer	f0ffaa9a6b	genpolicy: support full DeploymentSpec The added fields are relevant only to the controller, so they should not impact security and consequently aren't of interest for policies. Added according to https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.33/#deployment-v1-apps Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-11-11 11:07:18 +01:00
Alex Lyn	79d1a6ed8f	runtime-rs: Correct the mount type for emptydir with local storage Previously, Mount.type was set to `bind`, which is wrong; for local storage, the Mount type should be `local`. This commit corrects the type to "local". Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-11 17:09:33 +08:00
Alex Lyn	935ecf2765	runtime-rs: Fix disable_guest_empty_dir parameters order The disable_guest_empty_dir parameter was passed in the wrong position, so the bool value was incorrect and produced a wrong result. This commit corrects the parameter order. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-11 16:59:00 +08:00
Fabiano Fidêncio	9d6f6bac37	agent-ctl: Bump image-rs version Bump to the same version of CoCo Guest Components. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-11 08:08:24 +01:00
Fabiano Fidêncio	a5629a5a6f	versions: Bump coco-guest-components Usual bump before a release that will be consumed by Confidential Containers. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-11 08:08:24 +01:00
Fabiano Fidêncio	2d2b0de160	tests: kbs: Try to get the pod logs on deployment failure As this helps immensely to figure out what went wrong with the deployment. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-11 08:08:24 +01:00
Fabiano Fidêncio	58df06d90e	versions: Bump Trustee This is a bump pre-release, which brings several fixes and some improvements related to initData, and NVIDIA's remote verifier. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-11 08:08:05 +01:00
Alex Lyn	c225cba0e6	tests: Correct unexpected capability for policy failure test The test case designed to verify policy failures due to an "unexpected capability" was misconfigured. It was using "CAP_SYS_CHROOT" as the unexpected capability to be added. This configuration was flawed for two main reasons: 1.Incorrect Syntax: Kubernetes Pod specs expect capability names without the "CAP_" prefix (e.g., "SYS_CHROOT", not "CAP_SYS_CHROOT"). This made the test case's premise incorrect from a K8s API perspective. 2.Part of Default Set: "SYS_CHROOT" is already included in the `default_caps` list for a standard container. Therefore, adding it would not trigger a policy violation, defeating the purpose of the "unexpected capability" test. Furthermore, a related issue was observed where a malformed capability like "CAP_CAP_SYS_CHROOT" was being generated, causing parsing failures in the `oci-spec-rs` library. This was a symptom of incorrect string manipulation when handling capabilities. This commit corrects the test by selecting "SYS_NICE" as the unexpected capability. "SYS_NICE" is a more suitable choice because: - It is a valid Linux capability. - It is relatively harmless. - It is not part of the default capability set defined in `genpolicy-settings.json`. By using "SYS_NICE", the test now accurately simulates a scenario where a Pod requests a legitimate but non-default capability, which the policy (generated from a baseline Pod without this capability) should correctly reject. This change fixes the test's logic and also resolves the downstream `oci-spec-rs` parsing error by ensuring only valid capability names are processed. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-11 14:06:30 +08:00
Alex Lyn	9aaf41a71b	Merge pull request #11985 from Apokleos/policy-caps-rs genpolicy: Correct caps matcher for runtime-rs	2025-11-11 11:08:11 +08:00
Alex Lyn	29fe46bc06	genpolicy: Correct caps matcher for runtime-rs Detected a format mismatch in OCI Spec Capabilities fields between `runtime-rs` (no `CAP_` prefix) and `runtime-go` (with `CAP_` prefix). This introduces a normalization of caps in match_caps(p_caps, i_caps). This ensures robust and consistent processing of Capabilities regardless of whether the OCI Spec originates from `runtime-rs` or `runtime-go`. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-11 10:03:54 +08:00
Dan Mihai	f78584e868	Merge pull request #12048 from manuelh-dev/mahuber/bb-build deploy: Improve busybox build	2025-11-10 11:32:07 -08:00
Alex Lyn	7423eb7a30	agent: Support both virtio-blk and virtio-scsi devices for initdata Currently, the initdata module only detects virtio-blk devices (/dev/vd) when searching for the initdata block device. However, when using virtio-scsi, the devices appear as /dev/sd in the guest, causing the initdata detection to fail. This commit extends the device detection logic to support both device types: - virtio-blk devices: /dev/vda, /dev/vdb, etc. - virtio-scsi devices: /dev/sda, /dev/sdb, etc. This commit aims to address the issue of the initdata device not being found when using virtio-scsi. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-10 18:03:23 +01:00
dependabot[bot]	f699f097f3	build(deps): bump github.com/opencontainers/runc in /src/runtime Bumps [github.com/opencontainers/runc](https://github.com/opencontainers/runc) from 1.2.6 to 1.2.8. - [Release notes](https://github.com/opencontainers/runc/releases) - [Changelog](https://github.com/opencontainers/runc/blob/v1.2.8/CHANGELOG.md) - [Commits](https://github.com/opencontainers/runc/compare/v1.2.6...v1.2.8) --- updated-dependencies: - dependency-name: github.com/opencontainers/runc dependency-version: 1.2.8 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-11-10 15:43:48 +01:00
Fabiano Fidêncio	92226d0a19	tests: nvidia: Be prepared for TDX Thankfully there's only one piece that's still SNP specific (for the supported TEEs). Let's adjust it so we can have an easy and smooth execution when adding a TDX CI machine. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-10 13:01:30 +01:00
Fabiano Fidêncio	4d314e8676	tests: nvidia: nims: Adjust to CC There are several changes needed in order to get this test working with CC, and yet we still are skipping it. Basically, we need to: * Pull an authenticated image inside the guest, which requires: * Using Trustee to release the credential * We still depend on a PR to be merged on Trustee side * https://github.com/confidential-containers/trustee/pull/1035 * We still depend on a Trustee bump (including the PR above) on our side Apart from those changes, I ended up "duplicating" the tests by adding a "-tee" version of those, which already have: * The proper kbs annotations set up * Dropped host mounts * Increases the memory needed Last but not least, as "bats" probably means "being a terrible script", I had to re-arrange a few things otherwise the tests would not even run due to bats-isms that I am sincerely not able to pin-point. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-10 13:01:30 +01:00
Fabiano Fidêncio	8cedd96d54	tests: nvidia: k8s: Enforce experimental_force_guest_pull We added the tests using virtio-9p as we knew it'd require incremental changes to be able to use any kind of guest-pull method. Now, as in the coming commits we'll be actually ensuring that guest-pull works and is in use, we can enforce the experimental_force_guest_pull usage for the nvidia cases. Note: We're using experimental_force_guest_pull instead of nydus-snapshotter due to stability concerns with the snapshotter. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-10 13:01:30 +01:00
Fabiano Fidêncio	464764c7e0	tests: nvidia: kbs: Ensure KBS_INGRESS=nodeport I missed doing this during the KBS deployment setup. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-10 13:01:30 +01:00
Manuel Huber	a5cd7235cb	runtime: Align nvidia TEEs enable_annotations with TEEs It was just missed when adding those configurations. Signed-off-by: Manuel Huber <manuelh@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-10 13:01:30 +01:00
Fabiano Fidêncio	e85cf83573	k8s: tests: Fix default for EXPERIMENTAL_FORCE_GUEST_PULL It takes either a shim name or "", but we were treating this (thankfully only in this specific file) as a boolean. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-10 13:01:30 +01:00
Manuel Huber	8b39468b36	tests: nvidia: Logging for NIM Adjust output to the setup_file and teardown_file behavior. With this, we will be able to observe relevant logging rather than adding to the output variable. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-10 13:01:30 +01:00
Fabiano Fidêncio	812191c1f3	tests: nvidia: Do not deploy NFD on nvidia-gpu cases As it'll come from the GPU Operator for now. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-10 13:01:30 +01:00
Pavel Mores	74f9fdb11f	runtime-rs: remove hardcoding of SEV physical address reduction Previous commit enabled getting the physical address reduction from processor but just stored it for later use. This commit adds handling of the value to ProtectionDevice and enables the QEMU driver to use it. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-11-10 13:01:03 +01:00
Pavel Mores	6f9178d290	runtime-rs: get SEV params using CPUID and store them in SevSnpDetails An implementation of cbitpos acquisition is supplied that was missing so far. We also get the physical address reduction value from the same source (CPUID Fn8000_001f function). This has been hardcoded at 1 so far, following the Go runtime example, but it's better to get it from the processor. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-11-10 13:01:03 +01:00
Greg Kurz	5810279edf	Merge pull request #12008 from microsoft/saulparedes/allow_priv webhook: allow privileged containers	2025-11-10 11:13:41 +01:00
Zvonko Kaiser	df58972d41	Merge pull request #12051 from microsoft/danmihai1/agent-version agent: update version.rs when VERSION file changed	2025-11-09 20:34:58 -05:00
Fabiano Fidêncio	37d4eb0b77	ci: nvidia: Ensure K8S_TEST_HOST_TYPE=baremetal So the proper cleanups are performed in case something goes awry in a previous run. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-09 10:51:33 +01:00
Dan Mihai	7b10f4c72a	agent: update version.rs when VERSION file changed - version.rs gets generated from version.rs.in - version.rs.in contains values read from VERSION - so version.rs (and maybe other Agent files too) must be re-generated when the VERSION file changes Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-08 17:53:09 +00:00
Alex Lyn	83b0a59215	Merge pull request #12046 from Apokleos/disable-guest-emptydir Disable guest emptydir	2025-11-08 11:54:15 +08:00
Dan Mihai	df7ee2dd38	ci: k8s: AUTO_GENERATE_POLICY for cbl-mariner Auto-generate policy on cbl-mariner Hosts if the user didn't specify an AUTO_GENERATE_POLICY value. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-08 00:00:09 +01:00
Dan Mihai	53acb74f26	genpolicy: adapt to new AKS pause container behavior The new image reference has changed to mcr.microsoft.com/oss/v2/kubernetes/pause:3.6 from mcr.microsoft.com/oss/kubernetes/pause:3.6. The new image uses UID=0, GID=0 by default, while the older image had UID=65535, GID=65535. There is a new pause_container_id_policy field in genpolicy-settings.json, informing genpolicy about the way AdditionalGids gets updated - "v1" for the older behavior and "v2" for the newer AKS version: - When using v1, the default value of AdditionalGids is {65535}. - When using v2, the default value of AdditionalGids is {}. UID=65535 and GID=65535 are still hard-coded by default in genpolicy-settings.json. We might be able to remove/ignore these fields in the future, if we stop relying on policy::KataSpec::get_process_fields to use these fields. A new CI function adapt_common_policy_settings_for_aks() changes the pause container UID, GID, pause_container_id_policy, and image ref settings values when testing on AKS Hosts - i.e., when testing coco-dev or mariner Hosts. The genpolicy workarounds for the unexpected behavior with guest pull enabled have been improved to use the current container's GID instead of hard-coding GID=0 as the guest pull default. Also, AdditionalGids gets updated when the current container's GID is changing, instead of always changing the AdditionalGids at the very end of policy::AgentPolicy::get_container_process(), when the relevant evolution of the GID value was no longer available. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-08 00:00:09 +01:00
Dan Mihai	1f784bb770	genpolicy: improve policy generation comments Make it easier to understand the source of the UID/GID/AdditionalGids values from the container in the auto-generated policy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-08 00:00:09 +01:00
Dan Mihai	969b8e0fb8	genpolicy: more detailed UID/GID debug logs Add more details to code paths handling UID/GID values, for easier debugging. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-08 00:00:09 +01:00
Dan Mihai	cacd37ee6e	tests: genpolicy: restore test settings for non-Coco configMap These settings got broken recently because the non-CoCo tests were disabled for unrelated reasons. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-08 00:00:09 +01:00
Manuel Huber	caff6df827	deploy: Improve busybox build Parallelize busybox builds to build a bit faster and create the build directory prior to Docker execution, which on my environment, helps with permission issues when building busybox without the kata-containers/build directory existing beforehand. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-07 10:09:57 -08:00
Alex Lyn	23024876b2	runtime-rs: Use the configurable disable_guest_empty_dir Replace the hardcoded value of disable_guest_empty_dir with the real value, which comes from the configuration. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-07 19:52:11 +08:00
Alex Lyn	382924bdf3	kata-sys-util: Introduce a sandbox annotation for disable guest emptydir A sandbox annotation that determines if it should create Kubernetes emptyDir mounts on the guest filesystem. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-07 19:48:42 +08:00
Alex Lyn	720a229579	kata-types: Introduce disable guest emptydir flag It determines whether Kubernetes emptyDir mounts are created on the guest filesystem. If enabled, the runtime will not create Kubernetes emptyDir mounts on the guest filesystem. Instead, emptyDir mounts will be created on the host and shared via virtio-fs. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-07 19:45:55 +08:00
Fabiano Fidêncio	03e06fdf4d	tests: nvidia: Deploy Trustee Let's ensure Trustee is deployed as some of the tests rely on images that live behind authentication. /o\ The approach taken here to deploy Trustee is exactly the same one taken on the other CoCo tests, apart from an env var passed to ensure we're using the NVIDIA remote verifier (which will come in handy very very soon). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-07 12:32:11 +01:00
Pavel Mores	841fee28da	runtime-rs: add a helper to run external command and capture its output This isn't really related to remote hypervisor though it was useful for its debugging. It's a small helper I've been using regularly during development for quite some time that I think might be useful more broadly. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-11-07 10:49:14 +01:00
Pavel Mores	72c704b287	runtime-rs: make error reporting for CreateVM a bit more explicit A naked ttrpc error with no context turns out to be rather hard to understand or even notice in log. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-11-07 10:49:14 +01:00
Pavel Mores	45d8141edc	runtime-rs: remote hv needs neither image nor initrd specified in config The remote hypervisor launches no VM, it just instructs the Cloud API Adaptor to do so, therefore it has no need for an image or initrd to boot from and should be exempt from the mandate for one or the other to be specified. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-11-07 10:49:14 +01:00
Pavel Mores	80ef102a00	runtime-rs: fix scoping of the remote hv Hypervisor service The go runtime's .proto file - which is also used by the Cloud API Adaptor - puts the Hypervisor service into the "hypervisor" package. runtime-rs has to do the same to avoid an "unimplemented" error. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-11-07 10:49:14 +01:00
Alex Lyn	d5e2071869	Merge pull request #11921 from Apokleos/enhance-copyfile2 runtime-rs: Add support LocalStorage for emptyDir within nontee cases	2025-11-07 16:58:39 +08:00
Fabiano Fidêncio	a591cda466	gatekeeper: Adjust the nvidia gpu test name With the change made to the matrix when the CC GPU runner was added, there was a change in the job name (@sprt saw that coming, but I didn't). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-06 16:28:33 +01:00
Manuel Huber	c6dc176a03	tests: nvidia: cc: Enable NIMs tests Same deal as the previous commit, just enabling the tests here, with the same list of improvements that we will need to go through in order to get it working in a perfect way. Signed-off-by: Manuel Huber <manuelh@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-06 16:28:33 +01:00
Manuel Huber	8ca77f2655	tests: nvidia: cc: Run CUDA vectorAdd tests on CC mode While the primary goal of this change is to detect regressions to the NVIDIA SNP GPU scenario, various improvements to reflect a more realistic CC setting are planned in subsequent changes, such as: * moving away from the overlayfs snapshotter * disabling filesystem sharing * applying a pod security policy * activating the GPUs only after attestation * using a refined approach for GPU cold-plugging without requiring annotations * revisiting pod timeout and overhead parameters (the podOverhead value was increased due to CUDA vectorAdd requiring about 6Gi of podOverhead, as well as the inference and embedqa requiring at least 12Gi, respectively, 14Gi of podOverhead to run without invoking the host's oom-killer. We will revisit this aspect after addressing points 1. and 2.) Signed-off-by: Manuel Huber <manuelh@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-06 16:28:33 +01:00
Manuel Huber	25ce0afd52	kata-deploy: Allow the CDI annotation for CC GPU cases For the nvidia-gpu-snp and nvidia-gpu-tdx we must set containerd to allow the CDI annotation to be passed down. This solution may become obsolete soon enough, but the cleanest way to have it properly working is by adding it here (even if we remove it before the next release). Signed-off-by: Manuel Huber <manuelh@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-06 16:28:33 +01:00
Manuel Huber	c91edf884b	runtimeclasses: nvidia: Bump TEE podOverhead It's been noticed that as more RAM is needed to run the CC tests, we also need to update the podOverhead of the NVIDIA CC runtime classes to avoid getting OOM Killed. Signed-off-by: Manuel Huber <manuelh@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-06 16:28:33 +01:00
Fupan Li	bfe8da6c8a	tests: disable the qemu-runtime-rs cpu hotplug test Since there's something wrong with the cpu hotplug on qemu-runtime-rs, disable this test temporarily. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-11-06 21:37:01 +08:00
Fupan Li	3b1bfea609	runtime-rs: fix the issue of hot-unplug memory smaller It should do nothing instead of returning an error when hot-unplugging memory to a size smaller than the statically plugged memory size. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-11-06 18:19:55 +08:00
Fupan Li	aac2a37ff5	runtime-rs: enable pselect6 syscall for dragonball seccomp Since nerdctl's network hook calls the pselect6 syscall via xtables-nft-multi, add it to the seccomp whitelist. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-11-06 11:17:57 +01:00
Hyounggyu Choi	ff429072b6	Merge pull request #11924 from BbolroC/fix-static-checks-actionspz ci: Fix failing static checks to enable IBM actionspz - Z specific	2025-11-06 09:04:04 +01:00
Zvonko Kaiser	fce6a75899	Merge pull request #12027 from fidencio/topic/kata-deploy-make-ALLOWED_HYPERVISOR_ANNOTATIONS-per-arch kata-deploy: Add per arch ALLOWED_HYPERVISOR_ANNOTATIONS	2025-11-05 18:20:14 -05:00
Manuel Huber	d8953f67c5	ci: Onboard another NVIDIA machine Let's add a new NVIDIA machine, which later on will be used for CC related tests. For now the current tests are skipped in the CC capable machine. Signed-off-by: Manuel Huber <manuelh@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-05 23:23:08 +01:00
Fabiano Fidêncio	b2ee64a2d6	kata-deploy: scripts: Ensure we don't add duplicated values Let's now make sure that we don't add duplicated values to any of our entries, making the script as sane as possible for sequential runs. Vibed with Cursor's help! Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-05 19:48:24 +01:00
Fabiano Fidêncio	78ae79d153	kata-deploy: scripts: Add helper functions to avoid duplicated items Let's add some helper functions, not yet used, to avoid adding duplicated items. This idea is an expansion of Choi's idea to avoid setting duplicated items, and it'll help on making the whole script idempotent on sequential runs. Vibed with Cursor's help! Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-05 19:48:24 +01:00
Fabiano Fidêncio	f773368d93	kata-deploy: Add per arch ALLOWED_HYPERVISOR_ANNOTATIONS I know, this does not simplify things much for now, but it has a good intent in the background and will serve as a base for making the kata-deploy helm chart more user friendly. With that said, let's add ALLOWED_HYPERVISOR_ANNOTATIONS per arch, while adding support to set something like "qemu:foo,bar clh:bar foobar barfoo". Why? Because in the future we'll have a better way to set this per shim (and the shim is per arch ...). More details of what we'll do in the future are being discussed here: https://github.com/kata-containers/kata-containers/issues/12024 Anyways, the variables are DELIBERATELY not exposed to the chart for now, as those will be later on when addressing the issue mentioned above. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-05 19:45:34 +01:00
Fabiano Fidêncio	66e133e096	kata-deploy: Add missing runtimeClasses When the runtimeClasses were added, as part of `7cfa826804`, the firecracker runtimeClass ended up missing from the dictionary. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-05 19:07:28 +01:00
Anton Ippolitov	23c46b8a00	docs: Update devmapper containerd plugin name The Firecracker installation docs had an outdated containerd configuration for the devmapper plugin. This commit updates the instructions so that they are compatible with more recent versions of containerd. Signed-off-by: Anton Ippolitov <anton.ippolitov@datadoghq.com>	2025-11-05 18:42:29 +01:00
Fabiano Fidêncio	ace9cf942d	tests: guest-pull: Fix names When added, I've mistakenly used the wrong test-type name, which is now fixed and should be enough to trigger the tests correctly. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-05 18:21:48 +01:00
Hyounggyu Choi	4ee2037974	GHA: Run runtime tests on self-hosted runners for P/Z On IBM actionspz P/Z runners, the following error was observed during runtime tests: ``` host system doesn't support vsock: stat /dev/vhost-vsock: no such file or directory ``` Since loading the vsock module on the fly is not permitted, this commit moves the runtime tests back to self-hosted runners for P/Z. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-11-05 16:35:04 +00:00
Hyounggyu Choi	32da38273a	agent/tests: Skip if kernel module is not found On IBM actionspz Z runners, the following error occurs when running `modprobe`: ``` modprobe: FATAL: Module bridge not found in directory /lib/modules/6.8.0-85-generic ``` Additionally, there are no files under `/lib/modules`, for example: ``` total 0 drwxr-xr-x 1 root root 0 Aug 5 13:09 . drwxr-xr-x 1 root root 2.0K Oct 1 22:59 .. ``` This commit skips the `test_load_kernel_module` test if the module is not found or if running `modprobe` is not permitted. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-11-05 16:35:04 +00:00
Hyounggyu Choi	075de4dc62	agent/tests: Skip test if error is EACCES (permission denied) On IBM actionspz Z runners, write operations on network interfaces are not allowed, even for the root user. This commit skips the `add_update_addresses` test if the operation fails with EACCES (-13, permission denied). Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-11-05 16:35:04 +00:00
Hyounggyu Choi	3f84b623a3	agent/tests: Skip RNG reseeding test on restricted environments On IBM actionspz Z runners, the ioctl system call is not allowed even for the root user. There is likely an additional security mechanism (such as AppArmor or seccomp) in place on Ubuntu runners. This commit introduces a new helper, `is_permission_error()`, which skips the test if ioctl operations in `reseed_rng()` are not permitted. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-11-05 16:35:04 +00:00
Hyounggyu Choi	c2abc4da34	agent/tests: Use detected filesystem for baremounted points The IBM actionspz Z runners mount /dev as tmpfs, while other systems use devtmpfs. This difference causes an assertion failure for test_already_baremounted. This commit sets the detected filesystem for bare-mounted points as the expected value. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-11-05 16:35:04 +00:00
Hyounggyu Choi	faa048893d	agent/tests: Handle error messages differently based on root filesystem The root filesystem for IBM actionspz Z runners is `btrfs` instead of `ext4`. The error message differs when an unprivileged user tries to perform a bind mount. This commit adjusts the handling of error messages based on the detected root filesystem type. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-11-05 16:35:04 +00:00
Fupan Li	0df6c795d8	runtime-rs: disable the default static resource management Since qemu and cloud-hypervisor now support cpu & memory hotplug, disable the static resource management for qemu and cloud-hypervisor by default. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-11-05 16:59:13 +01:00
Fupan Li	02ecab40e4	tests: disable the cpu hotplug test for coco dev runtime Since qemu-coco-dev-runtime-rs and qemu-coco-dev disable cpu & memory hotplug by enabling static_sandbox_resource_mgmt, we should disable the cpu hotplug test for those two runtimes. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-11-05 16:59:13 +01:00
Fupan Li	1fc05491a2	tests: enable the cpu hotplug test for dragonball etc Since qemu, cloud-hypervisor and dragonball now support cpu hotplug on runtime-rs, enable the cpu hotplug test in CI. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-11-05 16:59:13 +01:00
Fabiano Fidêncio	0a0de4e6e3	Revert "tests: Do not enable NFD on s390x" This reverts commit `c75a46d17f`, as NFD now publishes an s390x image (and also a ppc64le one). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-05 16:06:33 +01:00
Alex Lyn	8f0dd4c44b	runtime-rs: Introduce disable_guest_empty_dir flag This commit introduces the configuration flag `disable_guest_empty_dir` to control the placement of Kubernetes emptyDir volumes. By default, the value is set to `false`, maintaining the current behavior of creating emptyDirs within the guest VM. When set to `true`, emptyDirs will be created on the host filesystem. This is essential for scenarios where users need to share data between the host and the guest VM via an emptyDir. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-05 15:05:45 +08:00
Alex Lyn	205c3dac44	runtime-rs: Add rprivate and rw options for memory emptyDir mounts When handling a memory-based emptyDir, the runtime creates a tmpfs mount inside the guest VM. The previous implementation just supports mount options with only "rbind", which does not explicitly guarantee the desired mount propagation behavior. This commit hardens the mounting process by explicitly adding the `rprivate` and `rw` mount flags. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-05 15:05:45 +08:00
Alex Lyn	fac9c795c6	runtime-rs: Add 'local' volume to support k8s emptyDir This commit introduces the 'local' volume, which is specifically designed to create and manage Kubernetes emptyDir volumes directly within the VM's sandbox directory. The core functionality ensures that local volume can be handled correctly in handle volume procedure. This capability is essential for allowing containers to leverage the storage backend for shared volumes. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-05 15:05:45 +08:00
Alex Lyn	1696968eb1	runtime-rs: Implement 'local' storage type for k8s emptyDir volumes This commit implements the new 'local' storage type, enabling Kubernetes emptyDir volumes to be created and managed directly inside the Kata VM (in the sandbox directory). The 'local' type instructs the kata-agent to provision the empty directory within the VM. This approach allows containers to share storage inside the VM, which is especially useful in CoCo emptyDir scenarios. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-05 15:05:22 +08:00
Alex Lyn	b58a53bfa4	kata-sys-util: Improve handling of Kubernetes emptyDir volumes Separated the checks for tmpfs and disk-based emptyDirs from an `if-else if` block into two distinct `if` statements. This clarifies the logic by treating each volume type detection as an independent task. Additionally, updated the type for disk-based emptyDirs to the more semantically accurate `KATA_K8S_LOCAL_STORAGE_TYPE`. This allows for more specific handling downstream, distinguishing them from generic host path mounts. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-05 14:59:21 +08:00
Alex Lyn	c39c6f1ae4	kata-sys-utils: Correct the judgement of logic of host emptyDir In fact, emptyDir is usually not found in the proc mounts, so the previous implementation failed. Based on the related implementation within runtime-go, Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-05 14:59:21 +08:00
Alex Lyn	f278616bf7	kata-types: Introduce a new storage type of "local" This introduces a new storage type: local. The local storage type tells kata-agent to create an empty directory with the LocalStorage handler in the sandbox directory within the VM. It also aligns with runtime-go's `KataLocalDevType = "local"`. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-05 14:59:21 +08:00
Manuel Huber	1561d7fbba	runtime: Clear outer CDI annotations Pod annotations from the outer runtime are being used for cold-plugging CDI devices. We need to ensure that these annotations don't leak into the inner runtime for which specific container (sibling) annotations are being created. Without this change, the inner runtime receives both annotations, leading to failing CDI injection as an outer runtime annotation observed in the guest translates to an unresolvable CDI device, for example, cdi.k8s.io/gpu: "nvidia.com/pgpu=0". Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-04 23:18:00 +01:00
Fabiano Fidêncio	1dfbb14093	tests: Stop testing on stratovirt Stratovirt has been failing for a considerable amount of time, with no sign of anyone watching it or actively working on a fix. With this we also stop building and shipping stratovirt as part of our release as we cannot test it. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-04 10:22:46 +01:00
Fabiano Fidêncio	02f47d3f18	helm: uninstall: Take nodeSelector into consideration As we're already doing for the install part, but this bit was missed during review. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-04 09:29:35 +01:00
Fabiano Fidêncio	5b01eaf929	tests: Align kata-deploy helm's uninstall Let's use the same method both on the kata-deploy and k8s tests. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-04 09:29:35 +01:00
Fabiano Fidêncio	4293cdf846	tests: Add stability tests for experimental-force-guest-pull A few weeks ago we tested nydus-snapshotter with this approach, and we DID find issues with it. Now, let's also test this with `experimental_force_guest_pull`. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-04 09:02:19 +01:00
Dan Mihai	6a4c336ca0	Merge pull request #12016 from microsoft/danmihai1/early-wait-abort tests: k8s: reduce test time for unexpected CreateContainerRequest errors	2025-11-03 12:04:56 -08:00
Fabiano Fidêncio	3107533953	tests: Adjust to runtimeClass creation by the chart It's just a follow-up on the previous commit where we move away from the runtimeClass creation inside the script, and instead we do it using the chart itself. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-03 17:32:18 +01:00
Fabiano Fidêncio	12f3b206eb	Revert "kata-deploy: Allow setting the default runtime class name" This reverts commit `be05e1370c`, which is not a problem as we never released such option. Conflicts: tools/packaging/kata-deploy/helm-chart/README.md Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-03 17:32:18 +01:00
Fabiano Fidêncio	7cfa826804	kata-deploy: Let helm deal with runtimeClass creation We had this logic inside the script when we didn't use the helm chart. However, this only makes the shim script more convoluted for no reason. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-03 17:32:18 +01:00
Fabiano Fidêncio	14039c9089	golang: Update to 1.24.9 In order to fix: ``` === Running govulncheck on containerd-shim-kata-v2 === Vulnerabilities found in containerd-shim-kata-v2: === Symbol Results === Vulnerability #1: GO-2025-4015 Excessive CPU consumption in Reader.ReadResponse in net/textproto More info: https://pkg.go.dev/vuln/GO-2025-4015 Standard library Found in: net/textproto@go1.24.6 Fixed in: net/textproto@go1.24.8 Vulnerable symbols found: #1: textproto.Reader.ReadResponse Vulnerability #2: GO-2025-4014 Unbounded allocation when parsing GNU sparse map in archive/tar More info: https://pkg.go.dev/vuln/GO-2025-4014 Standard library Found in: archive/tar@go1.24.6 Fixed in: archive/tar@go1.24.8 Vulnerable symbols found: #1: tar.Reader.Next Vulnerability #3: GO-2025-4013 Panic when validating certificates with DSA public keys in crypto/x509 More info: https://pkg.go.dev/vuln/GO-2025-4013 Standard library Found in: crypto/x509@go1.24.6 Fixed in: crypto/x509@go1.24.8 Vulnerable symbols found: #1: x509.Certificate.Verify #2: x509.Certificate.Verify Vulnerability #4: GO-2025-4012 Lack of limit when parsing cookies can cause memory exhaustion in net/http More info: https://pkg.go.dev/vuln/GO-2025-4012 Standard library Found in: net/http@go1.24.6 Fixed in: net/http@go1.24.8 Vulnerable symbols found: #1: http.Client.Do #2: http.Client.Get #3: http.Client.Head #4: http.Client.Post #5: http.Client.PostForm Use '-show traces' to see the other 9 found symbols Vulnerability #5: GO-2025-4011 Parsing DER payload can cause memory exhaustion in encoding/asn1 More info: https://pkg.go.dev/vuln/GO-2025-4011 Standard library Found in: encoding/asn1@go1.24.6 Fixed in: encoding/asn1@go1.24.8 Vulnerable symbols found: #1: asn1.Unmarshal #2: asn1.UnmarshalWithParams Vulnerability #6: GO-2025-4010 Insufficient validation of bracketed IPv6 hostnames in net/url More info: https://pkg.go.dev/vuln/GO-2025-4010 Standard library Found in: net/url@go1.24.6 Fixed in: net/url@go1.24.8 
Vulnerable symbols found: #1: url.JoinPath #2: url.Parse #3: url.ParseRequestURI #4: url.URL.Parse #5: url.URL.UnmarshalBinary Vulnerability #7: GO-2025-4009 Quadratic complexity when parsing some invalid inputs in encoding/pem More info: https://pkg.go.dev/vuln/GO-2025-4009 Standard library Found in: encoding/pem@go1.24.6 Fixed in: encoding/pem@go1.24.8 Vulnerable symbols found: #1: pem.Decode Vulnerability #8: GO-2025-4008 ALPN negotiation error contains attacker controlled information in crypto/tls More info: https://pkg.go.dev/vuln/GO-2025-4008 Standard library Found in: crypto/tls@go1.24.6 Fixed in: crypto/tls@go1.24.8 Vulnerable symbols found: #1: tls.Conn.Handshake #2: tls.Conn.HandshakeContext #3: tls.Conn.Read #4: tls.Conn.Write #5: tls.Dial Use '-show traces' to see the other 4 found symbols Vulnerability #9: GO-2025-4007 Quadratic complexity when checking name constraints in crypto/x509 More info: https://pkg.go.dev/vuln/GO-2025-4007 Standard library Found in: crypto/x509@go1.24.6 Fixed in: crypto/x509@go1.24.9 Vulnerable symbols found: #1: x509.CertPool.AppendCertsFromPEM #2: x509.Certificate.CheckCRLSignature #3: x509.Certificate.CheckSignature #4: x509.Certificate.CheckSignatureFrom #5: x509.Certificate.CreateCRL Use '-show traces' to see the other 27 found symbols Vulnerability #10: GO-2025-4006 Excessive CPU consumption in ParseAddress in net/mail More info: https://pkg.go.dev/vuln/GO-2025-4006 Standard library Found in: net/mail@go1.24.6 Fixed in: net/mail@go1.24.8 Vulnerable symbols found: #1: mail.AddressParser.Parse #2: mail.AddressParser.ParseList #3: mail.Header.AddressList #4: mail.ParseAddress #5: mail.ParseAddressList ``` Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-03 16:57:22 +01:00
Dan Mihai	c563ee99fa	tests: policy-rc: detect create container errors early During the ${wait_time} for an expected condition, if CreateContainerRequest was NOT expected to fail: detect possible CreateContainerRequest failures early and abort the wait. For example, before this change: not ok 1 Successful replication controller with auto-generated policy in 123335ms ok 2 Policy failure: unexpected container command in 14601ms ok 3 Policy failure: unexpected volume mountPath in 14443ms ok 4 Policy failure: unexpected host device mapping in 14515ms ok 5 Policy failure: unexpected securityContext.allowPrivilegeEscalation in 14485ms ok 6 Policy failure: unexpected capability in 14382ms ok 7 Policy failure: unexpected UID = 1000 in 14578ms After this change: not ok 1 Successful replication controller with auto-generated policy in 17108ms ok 2 Policy failure: unexpected container command in 14427ms ok 3 Policy failure: unexpected volume mountPath in 14636ms ok 4 Policy failure: unexpected host device mapping in 14493ms ok 5 Policy failure: unexpected securityContext.allowPrivilegeEscalation in 14554ms ok 6 Policy failure: unexpected capability in 15087ms ok 7 Policy failure: unexpected UID = 1000 in 14371ms Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-03 15:55:55 +00:00
Dan Mihai	319400dc0d	tests: policy-pvc: detect create container errors early During the ${wait_time} for an expected condition, if CreateContainerRequest was NOT expected to fail: detect possible CreateContainerRequest failures early and abort the wait. For example, before this change: not ok 1 Successful pod with auto-generated policy in 94852ms ok 2 Policy failure: unexpected device mount in 17807ms After this change: not ok 1 Successful pod with auto-generated policy in 35194ms ok 2 Policy failure: unexpected device mount in 21355ms Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-03 15:55:55 +00:00
Dan Mihai	1914fcb812	tests: policy-log: detect create container errors early During the ${wait_time} for an expected condition, if CreateContainerRequest was NOT expected to fail: detect possible CreateContainerRequest failures early and abort the wait. For example, before this change: not ok 1 Logs empty when ReadStreamRequest is blocked in 102257ms After this change: not ok 1 Logs empty when ReadStreamRequest is blocked in 17339ms Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-03 15:55:55 +00:00
Dan Mihai	a0bd9e02ca	tests: policy-job: detect create container errors early During the ${wait_time} for an expected condition, if CreateContainerRequest was NOT expected to fail: detect possible CreateContainerRequest failures early and abort the wait. For example, before this change: not ok 1 Successful job with auto-generated policy in 107111ms ok 2 Policy failure: unexpected environment variable in 7920ms ok 3 Policy failure: unexpected command line argument in 7874ms ok 4 Policy failure: unexpected emptyDir volume in 7823ms ok 5 Policy failure: unexpected projected volume in 7812ms ok 6 Policy failure: unexpected readOnlyRootFilesystem in 7903ms ok 7 Policy failure: unexpected UID = 222 in 7720ms After this change: not ok 1 Successful job with auto-generated policy in 10271ms ok 2 Policy failure: unexpected environment variable in 8018ms ok 3 Policy failure: unexpected command line argument in 7886ms ok 4 Policy failure: unexpected emptyDir volume in 7621ms ok 5 Policy failure: unexpected projected volume in 7843ms ok 6 Policy failure: unexpected readOnlyRootFilesystem in 7632ms ok 7 Policy failure: unexpected UID = 222 in 7619ms Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-03 15:55:55 +00:00
Dan Mihai	992c91371c	tests: policy-deployment-sc: detect create container errors early During the ${wait_time} for an expected condition, if CreateContainerRequest was NOT expected to fail: detect possible CreateContainerRequest failures early and abort the wait. For example, before this change: ok 1 Successful sc deployment with auto-generated policy and container image volumes in 14769ms ok 2 Successful sc with fsGroup/supplementalGroup deployment with auto-generated policy and container image volumes in 8384ms not ok 3 Successful sc deployment with security context choosing another valid user in 136149ms ok 4 Successful layered sc deployment with auto-generated policy and container image volumes in 8862ms ok 5 Policy failure: unexpected GID = 0 for layered securityContext deployment in 7941ms ok 6 Policy failure: malicious root group added via supplementalGroups deployment in 11612ms After: ok 1 Successful sc deployment with auto-generated policy and container image volumes in 15230ms ok 2 Successful sc with fsGroup/supplementalGroup deployment with auto-generated policy and container image volumes in 9364ms not ok 3 Successful sc deployment with security context choosing another valid user in 11060ms ok 4 Successful layered sc deployment with auto-generated policy and container image volumes in 9124ms ok 5 Policy failure: unexpected GID = 0 for layered securityContext deployment in 7919ms ok 6 Policy failure: malicious root group added via supplementalGroups deployment in 11666ms Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-03 15:55:55 +00:00
Dan Mihai	704ee76f1e	tests: policy-deployment-sc: reduced redundancy Call common function instead of copy/paste of three commands. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-03 15:55:55 +00:00
Dan Mihai	2cafb10a6a	tests: policy-pod: detect create container errors early During the ${wait_time} for an expected condition, if CreateContainerRequest was NOT expected to fail: detect possible CreateContainerRequest failures early and abort the wait. For example, before this change: not ok 1 Successful pod with auto-generated policy in 110801ms not ok 2 Able to read env variables sourced from configmap using envFrom in 94104ms not ok 3 Successful pod with auto-generated policy and runtimeClassName filter in 95838ms not ok 4 Successful pod with auto-generated policy and custom layers cache path in 110712ms ok 5 Policy failure: unexpected container image in 8113ms ok 6 Policy failure: unexpected privileged security context in 7943ms ok 7 Policy failure: unexpected terminationMessagePath in 11530ms ok 8 Policy failure: unexpected hostPath volume mount in 7970ms ok 9 Policy failure: unexpected config map in 7933ms not ok 10 Policy failure: unexpected lifecycle.postStart.exec.command in 112677ms ok 11 RuntimeClassName filter: no policy in 2302ms not ok 12 ExecProcessRequest tests in 93946ms not ok 13 Successful pod: runAsUser having the same value as the UID from the container image in 94003ms ok 14 Policy failure: unexpected UID = 0 in 8016ms ok 15 Policy failure: unexpected UID = 1234 in 7850ms After: not ok 1 Successful pod with auto-generated policy in 12182ms not ok 2 Able to read env variables sourced from configmap using envFrom in 10121ms not ok 3 Successful pod with auto-generated policy and runtimeClassName filter in 11738ms not ok 4 Successful pod with auto-generated policy and custom layers cache path in 26592ms ok 5 Policy failure: unexpected container image in 7742ms ok 6 Policy failure: unexpected privileged security context in 7949ms ok 7 Policy failure: unexpected terminationMessagePath in 7789ms ok 8 Policy failure: unexpected hostPath volume mount in 7887ms ok 9 Policy failure: unexpected config map in 7818ms not ok 10 Policy failure: unexpected 
lifecycle.postStart.exec.command in 9120ms ok 11 RuntimeClassName filter: no policy in 2081ms not ok 12 ExecProcessRequest tests in 9883ms not ok 13 Successful pod: runAsUser having the same value as the UID from the container image in 9870ms ok 14 Policy failure: unexpected UID = 0 in 11161ms ok 15 Policy failure: unexpected UID = 1234 in 7814ms Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-03 15:55:55 +00:00
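The "detect create container errors early" commits above share one pattern: poll for the expected condition, but stop as soon as a failure is seen instead of sleeping out the full `${wait_time}`. The real tests are bats/shell scripts. The Python sketch below only illustrates the control flow, and every name in it (`wait_for_condition`, `failure_detected`, `EarlyFailure`) is hypothetical rather than taken from the test suite.

```python
import time
from typing import Callable


class EarlyFailure(RuntimeError):
    """Raised when a failure is observed before the wait times out."""


def wait_for_condition(
    condition: Callable[[], bool],
    failure_detected: Callable[[], bool],
    wait_time: float,
    sleep_time: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `condition` for up to `wait_time` seconds.

    Returns True as soon as the condition holds and False on timeout.
    Raises EarlyFailure if `failure_detected` reports an error, for
    example a failed CreateContainerRequest, so the caller does not
    wait out the full timeout.
    """
    deadline = clock() + wait_time
    while clock() < deadline:
        if condition():
            return True
        if failure_detected():
            raise EarlyFailure("failure detected; aborting the wait early")
        sleep(sleep_time)
    return False
```

This matches the timings quoted in the commit messages: the failing "not ok" cases drop from roughly 90 to 140 seconds to roughly 10 to 35 seconds, because the wait is cut short rather than run to the timeout.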
Alex Lyn	897ecfb503	Merge pull request #12014 from fidencio/topic/release-ensure-helm-dependencies-update scripts: release: Run helm dependencies update	2025-11-03 16:34:17 +08:00
Fabiano Fidêncio	c539a9e90e	tests: k8s: parallel: Increase timeout We've seen a few cases where we fail the test due to timeout and when we print the pods we just see that they've been created. With that in mind, let's just increase the timeout a little bit. Example: ``` not ok 1 Parallel jobs in 6250ms (in test file k8s-parallel.bats, line 41) `kubectl wait --for=condition=Ready --timeout=$timeout pod -l jobgroup=${job_name}' failed No resources found in kata-containers-k8s-tests namespace. [bats-exec-test:71] INFO: k8s configured to use runtimeclass job.batch/process-item-test1 created job.batch/process-item-test2 created job.batch/process-item-test3 created NAME STATUS COMPLETIONS DURATION AGE process-item-test1 Running 0/1 0s process-item-test2 Running 0/1 0s process-item-test3 Running 0/1 0s error: no matching resources found No resources found in kata-containers-k8s-tests namespace. No resources found in kata-containers-k8s-tests namespace. DEBUG: system logs of node 'aks-nodepool1-25989463-vmss000000' since test start time (2025-11-01 16:39:03) -- No entries -- job.batch "process-item-test1" deleted job.batch "process-item-test2" deleted job.batch "process-item-test3" deleted ``` Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-01 18:09:37 +01:00
Fabiano Fidêncio	8a5ebd5d16	tests: k8s: run QoS tests on a bigger instance It's been failing to start quite regularly on the smaller instance. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-01 17:54:58 +01:00
Fabiano Fidêncio	157b2c32ce	scripts: release: Run helm dependencies update Otherwise we'll face issues like: ``` Error: found in Chart.yaml, but missing in charts/ directory: node-feature-discovery ``` Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-01 17:54:58 +01:00
Fabiano Fidêncio	c75a46d17f	tests: Do not enable NFD on s390x We're failing on the uninstall, which seems related to a bug in NFD itself, and I don't have access to an s390x machine to debug it. Let's skip the enablement for now and re-enable it once we've experimented with it further on s390x. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:30:13 +01:00
Fabiano Fidêncio	67e38e0f92	tests: Do not enable NFD on cbl-mariner As we're failing to install NFD on CBL Mariner, let's skip the enablement there, and re-enable it once we've experimented with it further there. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:30:13 +01:00
Fabiano Fidêncio	1bc873397b	tests: Use NFD as part of the tests As we have the ability to deploy NFD as a sub-chart of our chart, let's make sure we test it during our CI. We had to increase the timeout values, where we had timeouts set, to deploy / undeploy kata, as now NFD is also deployed / undeployed. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:30:13 +01:00
Fabiano Fidêncio	ebe15d154e	kata-deploy: Add NFD as a dependency Let's ensure that we add NFD as a weak dependency of the kata-deploy helm chart. What we're doing for now is leaving it up to the user / admin to enable it and, if enabled, we do an explicit check for virtualization support (x86_64 only for now). In case NFD is already deployed, we fail the installation (in case it's enabled on the kata-deploy helm chart) with a clear error message to the user. While I know that kata-remote DOES NOT require virtualization, I've left this out (with a comment for when we add a peer-pods dependency on kata-deploy) in order to simplify things for now, as kata-remote is not a deployed shim by default. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:30:13 +01:00
Fabiano Fidêncio	be05e1370c	kata-deploy: Allow setting the default runtime class name As Kata Containers can be consumed by other helm-charts, hard coding the default runtime class name to `kata` is not optimal. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:14:53 +01:00
Fabiano Fidêncio	820e6d6351	kata-deploy: Add more per-arch options All the options that take a specific shim as an argument MUST have specific per arch settings, as not all the shims are available for all the arches, leading to issues when setting up multi-arch deployments. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:14:53 +01:00
Zvonko Kaiser	94abe4fc00	osbuilder: nvrc: Consume NVRC release instead of building it Let's ensure that we consume NVRC releases straight from GitHub instead of building the binaries ourselves. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 12:10:20 +01:00
Zvonko Kaiser	69c76971f3	gpu: Handle VFIO and IOMMUFD A device path is either /dev/vfio/<num> or, in the IOMMUFD format, /dev/vfio/devices/vfio<num>. For the IOMMUFD format, strip the "vfio" prefix from the basename. Examples: /dev/vfio/123 - basename "123" - vfioNum = "123" - cdi.k8s.io/vfio123; /dev/vfio/devices/vfio123 - basename "vfio123" - strip prefix - vfioNum = "123" - cdi.k8s.io/vfio123. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-31 09:46:07 +01:00
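The path handling described in the commit above can be sketched as follows. This is a minimal Python illustration of the two examples in the message, not the runtime's actual code. The function name `vfio_cdi_name` and the digits-only validation are assumptions added for the sketch.

```python
import posixpath


def vfio_cdi_name(dev_path: str) -> str:
    """Map a VFIO device path to the CDI device name from the commit message.

    /dev/vfio/123             -> basename "123"     -> cdi.k8s.io/vfio123
    /dev/vfio/devices/vfio123 -> basename "vfio123" -> cdi.k8s.io/vfio123
    """
    base = posixpath.basename(dev_path)
    # IOMMUFD format: the basename carries a "vfio" prefix that is stripped.
    vfio_num = base[len("vfio"):] if base.startswith("vfio") else base
    # Assumption for this sketch: anything that is not a plain number is rejected.
    if not vfio_num.isdigit():
        raise ValueError(f"unexpected VFIO device path: {dev_path!r}")
    return f"cdi.k8s.io/vfio{vfio_num}"
```

Both input formats end up with the same `vfioNum`, so downstream code only has to deal with one naming scheme.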
Saul Paredes	26396881cf	webhook: allow privileged containers This allows us to test privileged containers when using the webhook. We can do this because kata-deploy sets privileged_without_host_devices = true for kata runtime by default. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-10-30 14:59:26 -07:00
Fabiano Fidêncio	e30e2b5f45	tests: k8s: Remove tests running on GitHub provided runner We have 2 tests running on GitHub provided runners: * devmapper * CRI-O - devmapper situation For devmapper, we're currently testing devmapper with s390x as part of one of its jobs. More than that, this test has been failing here due to a lack of space in the machine for quite some time, and no action was taken to bring it back either via GARM or some other way. With that said, let's rely on the s390x CI to test devmapper and avoid one extra failure on our CI by removing this one. - cri-o situation CRI-O is being tested with a fixed version of kubernetes that's already reached its EOL, and a CRI-O version that matches that k8s version. There have been attempts to raise issues, and also to provide a PR that does at least part of the work ... leaving the debugging part for the maintainers of the CI. However, there was no action on those from the maintainers. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-30 11:46:59 +01:00
Alex Lyn	fa521220a9	Merge pull request #11816 from jiuyi123/rs-vm-template-kata-ctl-merge kata-ctl: add factory subcommands for VM template management	2025-10-30 18:21:12 +08:00
ssc	551caad4b1	docs: add guide on VM templating usage in runtime-rs - Explained the concept and benefits of VM templating - Provided step-by-step instructions for enabling VM templating - Detailed the setup for using snapshotter in place of VirtioFS for template-based VM creation - Added performance test results comparing template-based and direct VM creation Signed-off-by: ssc <741026400@qq.com>	2025-10-30 15:18:31 +08:00
ssc	5a586e13a1	kata-ctl: add factory subcommands for VM template management - init: initialize the VM template factory - status: check the current factory status - destroy: clean up and remove factory resources These commands provide basic lifecycle management for VM templates. Signed-off-by: ssc <741026400@qq.com>	2025-10-30 10:27:17 +08:00
RuoqingHe	8878c46e8f	Merge pull request #11867 from spectator333/update-rust-vmm-deps dragonball: Bump kvm-ioctls to fix security issue	2025-10-30 00:17:29 +08:00
Siyu Tao	dd444d23b3	dragonball: Bump kvm-ioctls to fix security issue Use `ioctl_with_mut_ref` instead of `ioctl_with_ref` in the `create_device` method as it needs to write to the `kvm_create_device` struct passed to it, which was released in v0.12.1. Signed-off-by: Siyu Tao <taosiyu2024@163.com>	2025-10-29 14:03:29 +00:00
Steve Horsman	0e19a2bf91	Merge pull request #11993 from zvonkok/vectorAdd gpu: Add libs for CC	2025-10-29 13:42:34 +00:00
stevenhorsman	555926ea1a	libs: Fix formatting issue Fix the cargo fmt issues and then we can make the libs tests required again to avoid this regression happening again. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-10-29 13:13:50 +01:00
Steve Horsman	dbdd1009af	Merge pull request #11933 from kata-containers/topic/kata-deploy-nfd-dependency-part-I kata-deploy: Automatically deploy NodeFeatureRules for TEEs	2025-10-29 09:50:38 +00:00
Fabiano Fidêncio	103f80c7f5	readme: install: Drop outdated documentation kata-deploy helm chart is THE way to deploy kata-containers on kubernetes environments, and kubernetes environments are basically the only reliably tested deployment we have. For now, let's just drop documentation that is outdated / incorrect, and in the future let's ensure we update the linked docs, as we work on update / upgrade for the helm chart. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-29 09:41:57 +01:00
Zvonko Kaiser	5ff218823c	gpu: Remove unneeded libraries The libs in question were added when moving to developer.nvidia.com; after switching back to Ubuntu-only builds, they are no longer needed. Remove them to keep the rootfs as minimal as possible. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-29 08:03:36 +01:00
Zvonko Kaiser	6d9b4059f5	gpu: Add libs for CC In the case of CC we need additional libraries in the rootfs. Add them conditionally if type == confidential. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-29 08:03:36 +01:00
Xuewei Niu	55d181beb1	Merge pull request #11828 from jiuyi123/rs-vm-template-runtime-rs runtime-rs: introduce VM template lifecycle and integration	2025-10-29 14:03:46 +08:00
Xuewei Niu	8aca32dfa9	Merge pull request #11862 from StevenFryto/rootless_clh runtime-rs: supporting the CLH VMM process running in non-root mode	2025-10-29 13:31:53 +08:00
ssc	16e8cf1a09	runtime-rs: boot vm from template Add build_vm_from_template() that flips boot_from_template flag, wires factory.template_path/{memory,state} into the hypervisor config, and returns ready-to-use hypervisor & agent instances. When factory.template is enabled, VirtContainer bypasses normal creation and directly boots the VM by restoring the template through incoming migration, completing the "create → save → clone" loop. Fixes: #11413 Signed-off-by: ssc <741026400@qq.com>	2025-10-29 12:38:28 +08:00
ssc	550615285c	runtime-rs: add factory, template and vm modules for VM template lifecycle Introduced factory::FactoryConfig with init/destroy/status commands to manage template pools. Added template::Template to fetch, create and persist base VMs. Introduced vm::{VM, VMConfig} exposing create, pause, save, resume, stop, disconnect and migration helpers for sandbox integration. Extended QemuInner to execute QMP incoming migration, pause/resume and status tracking. Fixes: #11413 Signed-off-by: ssc <741026400@qq.com>	2025-10-29 12:38:28 +08:00
ssc	135c84b6cb	kata-types: add VM template and factory configuration Added new fields in Hypervisor struct to support VM template creation, template boot, memory and device state paths, shared path, and store paths. Introduced a Factory struct in config to manage template path, cache endpoint, cache number, and template enable flag. Integrated Factory into TomlConfig for runtime configuration parsing. Fixes: #11413 Signed-off-by: ssc <741026400@qq.com>	2025-10-29 11:49:08 +08:00
stevenfryto	2ceadc5fa3	runtime-rs: supporting the CLH VMM process running in non-root mode This change enables running the Cloud Hypervisor VMM as a non-root user when the rootless flag is set to true in the configuration. Fixes: #11414 Signed-off-by: stevenfryto <sunzitai_1832@bupt.edu.cn>	2025-10-29 01:55:10 +00:00
stevenfryto	2ddbae3aa6	runtime-rs: pass the tuntap fds down to Cloud Hypervisor Pass the file descriptors of the tuntap device to the Cloud Hypervisor VMM process so that the process can open the device without cap_net_admin. Signed-off-by: stevenfryto <sunzitai_1832@bupt.edu.cn>	2025-10-29 01:55:10 +00:00
Fabiano Fidêncio	59883a2d99	actions: Remove unused USING_NFD There's no reason to keep the env var / input as it's never been used and now kata-deploy detects automatically whether NFD is deployed or not. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-28 21:24:27 +01:00
Fabiano Fidêncio	f9825b4e6e	kata-deploy: Automatically deploy NodeFeatureRules for TEEs When the NodeFeatureRule CRD is detected kata-deploy will: * Create the specific NodeFeatureRules for the x86_64 TEEs * Adapt the TEEs runtime classes to take into account the number of keys available in the system when spawning the podsandbox. Note, we still do not have NFD as a sub-dependency of the helm chart, and I'm not even sure if we will have. However, it's important to integrate better with the scenarios where the NFD is already present. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-28 21:24:27 +01:00
Manuel Huber	8dc78057d6	ci: Refactor NVIDIA NIM test Change NIM bats file logic to allow skipping test cases which require multiple GPUs. This can be helpful for test clusters where there is only one node with a single GPU, or for local test environments with a single-node cluster with a single GPU. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-10-28 19:12:16 +01:00
Manuel Huber	be32b77baf	ci: Add NVIDIA CUDA vectoradd test This change adds a CUDA vectoradd test case and makes enabling NVRC tracing optional and idempotent. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-10-28 19:12:16 +01:00
Fabiano Fidêncio	a164693e1a	release: Bump version to 3.22.0 Bump VERSION and helm-chart versions Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-28 16:28:18 +01:00
Steve Horsman	1b46cf43c4	Merge pull request #11989 from Amulyam24/actionpz-ppc64le revert: Enable new ibm runners for ppc64le	2025-10-28 12:09:03 +00:00
Amulyam24	c603094584	revert: Enable new ibm runners for ppc64le Temporarily disables the new runners for building artifacts jobs. Will be re-enabled once they are stable. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2025-10-28 17:09:26 +05:30
Hyounggyu Choi	7d2fe5e187	revert: Enable new ibm runners for s390x This partially reverts `8dcd91c` for the s390x because the CI jobs are currently blocking the release. The new runners will be re-introduced once they are stable and no longer impact critical paths. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-10-28 11:11:51 +01:00
Fabiano Fidêncio	754e832cfa	kata-deploy: Allow passing shims / defaultShim per arch This allows us to do a full multi-arch deployment, as the user can easily select which shim can be deployed per arch, as some of the VMMs are not supported on all architectures, which would lead to a broken installation. Now, by passing shims per arch, we can easily have a heterogeneous deployment where, for instance, we can set qemu-se-runtime-rs for s390x, qemu-cca for aarch64, and qemu-snp / qemu-tdx for x86_64 and call all of those a default kata-confidential ... and have everything working with the same deployment. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-27 22:42:37 +01:00
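The per-arch selection described in the commit above amounts to a lookup keyed by architecture. The Python sketch below uses only the architecture-to-shim pairs named in the commit message. The `SHIMS_PER_ARCH` table and `shims_for` helper are hypothetical and do not reflect the helm chart's actual values schema.

```python
from typing import Dict, List

# Architecture -> shims, using only the example pairs from the commit message.
SHIMS_PER_ARCH: Dict[str, List[str]] = {
    "s390x": ["qemu-se-runtime-rs"],
    "aarch64": ["qemu-cca"],
    "x86_64": ["qemu-snp", "qemu-tdx"],
}


def shims_for(arch: str) -> List[str]:
    """Return the shims to deploy on a node of the given architecture."""
    try:
        return SHIMS_PER_ARCH[arch]
    except KeyError:
        # Refusing an unknown arch avoids installing a shim whose VMM
        # is not supported there.
        raise ValueError(f"no shims configured for architecture {arch!r}") from None
```

A single deployment can then serve a mixed cluster: each node resolves its own architecture to the shims that work on it.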
Greg Kurz	ffdc80733a	Merge pull request #11966 from zvonkok/gpu-cc-fix gpu: rootfs fixes	2025-10-27 10:18:13 +01:00
Alex Lyn	418d5f724e	Merge pull request #11971 from lifupan/fupan_blk_ratelimit runtime-rs: Support disk rate limiter for dragonball	2025-10-27 17:12:47 +08:00
Alex Lyn	f86ac595a8	Merge pull request #11973 from Apokleos/enhance-oci-spec runtime-rs: Enhancements for items within OCI Spec	2025-10-27 16:15:00 +08:00
Alex Lyn	690dad5528	runtime-rs: Ensure complete cleanup of stale Device Cgroups The previous procedure failed to reliably ensure that all unused Device Cgroups were completely removed, a failure consistently verified by CI tests. This change introduces a more robust and thorough cleanup mechanism. The goal is to prevent previous issues—likely stemming from improper use of Rust mutable references—that caused the modifications to be ineffective or incomplete. This ensures a clean environment and reliable CI test execution. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-10-27 12:47:48 +08:00
Alex Lyn	25ab615da5	Merge pull request #11913 from Apokleos/dedicated-error-rs CI: Add dedicated expected error message for runtime-rs	2025-10-27 10:47:07 +08:00
Zvonko Kaiser	39848e0983	gpu: rootfs fixes Build only from Ubuntu repositories; do not mix them with developer.nvidia.com. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Update tools/osbuilder/rootfs-builder/nvidia/nvidia_chroot.sh Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-10-26 19:36:55 +01:00
stevenhorsman	aec0ceb860	gatekeeper: Update mariner tests name In https://github.com/kata-containers/kata-containers/pull/11972 the auto-generate-policy: yes matrix parameter was removed, which changes the name of the test, so sync this change in required-tests.yaml Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-10-25 17:51:31 +02:00
Kevin Zhao	e2dbe87a99	tests: Fix cca test failure on arm64 and other architectures Fix the wrong test with appendProtectionDevice on arm64 Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-10-25 13:54:35 +02:00
dependabot[bot]	99ae3607dc	build(deps): bump astral-tokio-tar in /src/tools/agent-ctl Bumps [astral-tokio-tar](https://github.com/astral-sh/tokio-tar) from 0.5.5 to 0.5.6. - [Release notes](https://github.com/astral-sh/tokio-tar/releases) - [Changelog](https://github.com/astral-sh/tokio-tar/blob/main/CHANGELOG.md) - [Commits](https://github.com/astral-sh/tokio-tar/compare/v0.5.5...v0.5.6) --- updated-dependencies: - dependency-name: astral-tokio-tar dependency-version: 0.5.6 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-10-25 13:53:24 +02:00
Dan Mihai	61ee4d7f8b	Merge pull request #11951 from burgerdev/watchable genpolicy: allow non-watchable ConfigMaps	2025-10-24 08:38:55 -07:00
Steve Horsman	ac601ecd45	Merge pull request #11964 from Amulyam24/k8s-ppc64le github: migrate k8s job to a different runner on ppc64le	2025-10-24 15:55:59 +01:00
Dan Mihai	ac3ea973ee	Merge pull request #11958 from microsoft/danmihai1/policy-tests-upstream5 tests: k8s: auto-generate policy for additional tests	2025-10-24 07:18:00 -07:00
Amulyam24	9876cbffd6	github: migrate k8s job to a different runner on ppc64le Migrate the k8s job to a different runner and use a long running cluster instead of creating the cluster on every run. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2025-10-24 18:20:11 +05:30
Steve Horsman	5713072385	Merge pull request #11974 from fidencio/topic/payload-after-build-upload-latest-charts actions: Push a `0.0.0-dev` chart package to the registries	2025-10-24 13:13:02 +01:00
Alex Lyn	e539432a91	CI: Add dedicated expected error message for runtime-rs Runtime-rs has its own dedicated error message, so we need to handle it separately. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-10-24 20:08:59 +08:00
Steve Horsman	60022c9556	Merge pull request #11972 from microsoft/danmihai1/no-mariner-policy gha: no policy for cbl-mariner during ci	2025-10-24 12:03:52 +01:00
Fabiano Fidêncio	ebc1d64096	actions: Push a `0.0.0-dev` chart package to the registries This will immensely help projects consuming the kata-deploy helm chart to use configuration options added during the development cycle that are waiting for a release to be out ... allowing very early tests of the stack. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-24 11:44:27 +02:00
Alex Lyn	91db25ef02	runtime-rs: Reset capabilities for exec processes By default, `kubectl exec` inherits some capabilities from the container, which could pose a security risk in a confidential environment. This change modifies the agent policy to strictly enforce that any process started via `ExecProcessRequest` has no Linux capabilities. This prevents potential privilege escalation within an exec session, adhering to the principle of least privilege. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-10-24 15:42:17 +08:00
Alex Lyn	2de6fa520d	runtime-rs: Reset ApparmorProfile with None value In CoCo cases, runtime-go sets the ApparmorProfile to None, so we should align runtime-rs with runtime-go. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-10-24 15:40:45 +08:00
Dan Mihai	b8c1215d99	gha: no policy for cbl-mariner during ci Temporarily disable the auto-generated Agent Policy on Mariner hosts, to work around the new test failures on these hosts. When re-enabling auto-generated policy in the future, that would be better achieved with a tests/integration/kubernetes/gha-run.sh change. Those changes are easier to test compared with GHA YAML changes. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-10-24 04:00:36 +00:00
Fupan Li	9fda9905a7	runtime-rs: Support disk rate limiter for dragonball This PR adds code that passes disk rate limiter parameters to the dragonball VMM. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-10-24 10:39:53 +08:00
Markus Rudy	acc7974602	genpolicy: allow non-watchable ConfigMaps If a ConfigMap has more than 8 files it will not be mounted watchable [1]. However, genpolicy assumes that ConfigMaps are always mounted at a watchable path, so containers with large ConfigMap mounts fail verification. This commit allows mounting ConfigMaps from watchable and non-watchable directories. ConfigMap mounts can't be meaningfully verified anyway, so the exact location of the data does not matter, except that we stay in the sandbox data dirs. [1]: `0ce3f5fc6f/docs/design/inotify.md (L11-L21)` Fixes: #11777 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-10-23 15:45:17 +02:00
Fabiano Fidêncio	94adc58342	tests: Ensure helm secret for kata-deploy installation is cleaned up Every now and then, in case a failure happens, helm leaves the secret behind without cleaning it up, leading to issues in the consecutive runs. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-23 11:15:13 +02:00
Fabiano Fidêncio	12a515826d	tools: Install Golang from a reliable mirror (follow-up) Aurélien has moved to a reliable mirror for our tests, but we missed that our tools Dockerfiles could benefit from the same change, which is added now. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-23 11:15:13 +02:00
Fabiano Fidêncio	560425f31f	build: kernel: Bump version to trigger signed builds for arm64 GPU Although we saw this happening, we expected it to NOT happen ... As the kernel is not signed, but we expect it to be (the cached version), then we're bailing. :-/ Let's ensure a full rebuild of kernels happen and we'll be good from that point onwards. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-23 11:15:13 +02:00
Zvonko Kaiser	0b11190fcf	gpu: Add Arm64 kernel signing Adapt the working amd64 workflow to arm64. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-22 21:05:32 +02:00
Mikko Ylinen	1beda258b8	qemu: nvidia: tdx: add quote-generation-socket for attestation to work Add TDX QGS quote-generation-socket TDX QEMU object params for attestation to work in NVGPU+TDX environment. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-10-22 21:01:35 +02:00
Hyounggyu Choi	2c805900a4	Merge pull request #11891 from stevenhorsman/signature-tests-with-initdata tests/k8s: Add initdata variants of signature verification and registry authentication tests	2025-10-22 20:27:26 +02:00
Fabiano Fidêncio	ba912e6a84	kata-deploy: Adapt nydus installation to MULTI_INSTALL_SUFFIX By doing this we can ensure that more than one instance of nydus-snapshotter can be running inside the cluster, which is super useful for doing A-B "upgrades" (where we install a new version of kata-containers + nydus on B, while A is still running, and then only uninstall A after making sure that B is working as expected). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-22 20:25:03 +02:00
Fupan Li	5615c9af84	Merge pull request #11722 from RuoqingHe/2025-08-25-move-mem-agent-to-libs libs: Move mem-agent into libs workspace	2025-10-22 11:23:33 +02:00
Fabiano Fidêncio	ded336405f	kata-deploy: All qemu variants use .hypervisors.qemu.* We've been wrongly trying to set up the `${shim}` (qemu-snp, for instance) as the hypervisor name in the kata-containers configuration file, leading to `tomlq` breaking, as all the .hypervisors.qemu* shims are tied to the `qemu` hypervisor; this happens regardless of the shim having a different name, or the hypervisor being experimental or not. ```sh $ grep "hypervisor.qemu" src/runtime/config/configuration- src/runtime/config/configuration-qemu-cca.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu-coco-dev.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu-nvidia-gpu-snp.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu-nvidia-gpu-tdx.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu-nvidia-gpu.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu-se.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu-snp.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu-tdx.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu.toml.in:[hypervisor.qemu] $ grep "hypervisor.qemu" src/runtime-rs/config/configuration- src/runtime-rs/config/configuration-qemu-runtime-rs.toml.in:[hypervisor.qemu] src/runtime-rs/config/configuration-qemu-se-runtime-rs.toml.in:[hypervisor.qemu] ``` Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-22 10:23:12 +02:00
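The fix in the commit above comes down to resolving every QEMU-based shim name to the single `[hypervisor.qemu]` table that the grep output shows. kata-deploy itself is a shell script driving `tomlq`. The Python helper below, `hypervisor_table`, is a hypothetical illustration of the mapping only, and it deliberately covers nothing but the QEMU-based shims mentioned in the commit.

```python
def hypervisor_table(shim: str) -> str:
    """Return the TOML table name holding the hypervisor settings for a shim.

    Every QEMU-based shim (qemu, qemu-snp, qemu-tdx, qemu-se-runtime-rs, ...)
    is configured under [hypervisor.qemu], whatever the shim is called.
    """
    if shim == "qemu" or shim.startswith("qemu-"):
        return "hypervisor.qemu"
    # Other hypervisors are outside the scope of this sketch.
    raise ValueError(f"not a QEMU-based shim: {shim!r}")
```

Using the shim name directly (for example `hypervisor.qemu-snp`) would point at a table that does not exist in any of the listed configuration files.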
Ruoqing He	000f707205	libs: mem-agent: Add missing #[cfg(test)] The `tests` module inside the `memcg` module should be gated behind `test`; add `#[cfg(test)]` so those tests work properly. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	831f3ab616	libs: mem-agent: Skip tests require root Some mem-agent tests require root privileges; use `skip_if_not_root` to skip them when they are not run as root. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	ac539baeaa	libs: Ignore clippy `precedence` and `identity_op` Ignoring `precedence` and `identity_op` clippy warning suggested by rust 1.85.1 for now. ```console error: operator precedence can trip the unwary --> mem-agent/src/compact.rs:273:61 \| 273 \| ... total_free_movable_pages += count * 1 << order; \| ^^^^^^^^^^^^^^^^^^ help: consider parenthesizing your expression: `(count * 1) << order` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#precedence = note: `-D clippy::precedence` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::precedence)]` Checking kata-types v0.1.0 (/root/riscv/kata-containers/src/libs/kata-types) error: this operation has no effect --> mem-agent/src/compact.rs:273:61 \| 273 \| ... total_free_movable_pages += count * 1 << order; \| ^^^^^^^^^ help: consider reducing it to: `count` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#identity_op = note: `-D clippy::identity-op` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::identity_op)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	4dec1a32eb	libs: Allow clippy `type_complexity` Prefixing with `#[allow(clippy::type_complexity)]` to silence this warning, the return type is documented in comments. ```console error: very complex type used. Consider factoring parts into `type` definitions --> mem-agent/src/mglru.rs:184:6 \| 184 \| ) -> Result<HashMap<String, (usize, HashMap<usize, MGenLRU>)>> { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#type_complexity = note: `-D clippy::type-complexity` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::type_complexity)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	241e6db237	libs: Fix clippy `absurd_extreme_comparisons` Manually fix `absurd_extreme_comparisons` clippy warning by testing equality against 0 as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: this comparison involving the minimum or maximum element for this type contains a case that is always true or always false --> mem-agent/src/psi.rs:62:8 \| 62 \| if reader \| ________^ 63 \| \| .read_line(&mut first_line) 64 \| \| .map_err(\|e\| anyhow!("reader.read_line failed: {}", e))? 65 \| \| <= 0 \| \|____________^ \| = help: because `0` is the minimum value for this type, the case where the two sides are not equal never occurs, consider using `reader .read_line(&mut first_line) .map_err(\|e\| anyhow!("reader.read_line failed: {}", e))? == 0` instead = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#absurd_extreme_comparisons = note: `#[deny(clippy::absurd_extreme_comparisons)]` on by default ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	495e012160	libs: Fix clippy `redundant_field_names` Manually fix `redundant_field_names` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: redundant field names in struct initialization --> mem-agent/src/memcg.rs:441:13 \| 441 \| numa_id: numa_id, \| ^^^^^^^^^^^^^^^^ help: replace it with: `numa_id` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#redundant_field_names = note: `-D clippy::redundant-field-names` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::redundant_field_names)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	96c1175580	libs: Fix clippy `manual_strip` Manually fix `manual_strip` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: stripping a prefix manually --> mem-agent/src/mglru.rs:284:29 \| 284 \| u32::from_str_radix(&content[2..], 16) \| ^^^^^^^^^^^^^ \| note: the prefix was tested here --> mem-agent/src/mglru.rs:283:13 \| 283 \| let r = if content.starts_with("0x") { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_strip = note: `-D clippy::manual-strip` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::manual_strip)]` help: try using the `strip_prefix` method \| 283 ~ let r = if let Some(<stripped>) = content.strip_prefix("0x") { 284 ~ u32::from_str_radix(<stripped>, 16) \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	2dc0b14512	libs: Fix clippy `field_reassign_with_default` Manually fix `field_reassign_with_default` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: field assignment outside of initializer for an instance created with Default::default() --> mem-agent/src/memcg.rs:874:21 \| 874 \| numa_cg.numa_id = numa; \| ^^^^^^^^^^^^^^^^^^^^^^^ \| note: consider initializing the variable with `memcg::CgroupConfig { numa_id: numa, ..Default::default() }` and removing relevant reassignments --> mem-agent/src/memcg.rs:873:21 \| 873 \| let mut numa_cg = CgroupConfig::default(); \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#field_reassign_with_default = note: `-D clippy::field-reassign-with-default` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::field_reassign_with_default)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	b399ac7f6d	libs: Fix clippy `derivable_impls` Fix `derivable_impls` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: this `impl` can be derived --> mem-agent/src/memcg.rs:123:1 \| 123 \| / impl Default for CgroupConfig { 124 \| \| fn default() -> Self { 125 \| \| Self { 126 \| \| no_subdir: false, ... \| 132 \| \| } \| \|_^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#derivable_impls = note: `-D clippy::derivable-impls` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::derivable_impls)]` help: replace the manual implementation with a derive attribute \| 117 + #[derive(Default)] 118 ~ pub struct CgroupConfig { \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	55bafa257d	libs: Fix clippy `redundant_pattern_matching` Fix `redundant_pattern_matching` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: redundant pattern matching, consider using `is_some()` --> mem-agent/src/memcg.rs:595:40 \| 595 \| ... if let Some(_) = config_map.get_mut(path) { \| -------^^^^^^^--------------------------- help: try: `if config_map.get_mut(path).is_some()` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#redundant_pattern_matching = note: `-D clippy::redundant-pattern-matching` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::redundant_pattern_matching)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	a9f415ade5	libs: Fix clippy `needless_bool` Fix `needless_bool` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: this if-then-else expression returns a bool literal --> mem-agent/src/memcg.rs:855:17 \| 855 \| / if configs.is_empty() { 856 \| \| true 857 \| \| } else { 858 \| \| false 859 \| \| } \| \|_________________^ help: you can reduce it to: `configs.is_empty()` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_bool = note: `-D clippy::needless-bool` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::needless_bool)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	6959bc1b3c	libs: Fix clippy `for_kv_map` Fix `for_kv_map` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: you seem to want to iterate on a map's keys --> mem-agent/src/memcg.rs:822:43 \| 822 \| for (single_config, _) in &secs_map.cgs { \| ^^^^^^^^^^^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#for_kv_map help: use the corresponding method \| 822 \| for single_config in secs_map.cgs.keys() { \| ~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~ ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	702665ee8b	libs: Fix clippy `manual_map` Fix `manual_map` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: manual implementation of `Option::map` --> mem-agent/src/memcg.rs:375:21 \| 375 \| / if let Some(hmg) = hmg.get(&(numa_id as usize)) { 376 \| \| Some((numa_id, Numa::new(hmg, path, psi_path))) 377 \| \| } else { 378 \| \| None 379 \| \| } \| \|_____________________^ help: try: `hmg.get(&(numa_id as usize)).map(\|hmg\| (numa_id, Numa::new(hmg, path, psi_path)))` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_map = note: `-D clippy::manual-map` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::manual_map)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	b47a382d00	libs: Fix clippy `into_iter_on_ref` Fix `into_iter_on_ref` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: this `.into_iter()` call is equivalent to `.iter_mut()` and will not consume the `Vec` --> mem-agent/src/memcg.rs:1122:27 \| 1122 \| for info in infov.into_iter() { \| ^^^^^^^^^ help: call directly: `iter_mut` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#into_iter_on_ref = note: `-D clippy::into-iter-on-ref` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::into_iter_on_ref)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	2986eb3a78	libs: Fix clippy `legacy_numeric_constants` Fix `legacy_numeric_constants` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: usage of a legacy numeric constant --> mem-agent/src/compact.rs:132:47 \| 132 \| if self.config.compact_force_times == std::u64::MAX { \| ^^^^^^^^^^^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#legacy_numeric_constants help: use the associated constant instead \| 132 \| if self.config.compact_force_times == u64::MAX { \| ~~~~~~~~ ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	3d146a525c	libs: Fix clippy `single_component_path_imports` Fix `single_component_path_imports` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: this import is redundant --> mem-agent/src/mglru.rs:345:5 \| 345 \| use slog_term; \| ^^^^^^^^^^^^^^ help: remove it entirely \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#single_component_path_imports ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	b84a03e434	libs: Fix clippy `from_str_radix_10` Fix `from_str_radix_10` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: this call to `from_str_radix` can be replaced with a call to `str::parse` --> mem-agent/src/mglru.rs:29:14 \| 29 \| let id = usize::from_str_radix(words[1], 10) \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `words[1].parse::<usize>()` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#from_str_radix_10 = note: `-D clippy::from-str-radix-10` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::from_str_radix_10)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	ded6f2d116	libs: Fix clippy `needless_borrow` Fix `needless_borrow` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: this expression creates a reference which is immediately dereferenced by the compiler --> mem-agent/src/memcg.rs:1100:52 \| 1100 \| self.run_eviction_single_config(infov, &config)?; \| ^^^^^^^ help: change this to: `config` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	541436c82c	libs: Fix clippy `ptr_arg` Fix `ptr_arg` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: writing `&PathBuf` instead of `&Path` involves a new object where a slice will do --> mem-agent/src/memcg.rs:367:19 \| 367 \| psi_path: &PathBuf, \| ^^^^^^^^ help: change this to: `&Path` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_arg = note: requested on the command line with `-D clippy::ptr-arg` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	cdd94060f1	libs: Fix clippy `crate_in_macro_def` Fix `crate_in_macro_def` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: `crate` references the macro call's crate --> mem-agent/src/misc.rs:12:22 \| 12 \| slog::error!(crate::misc::sl(), "{}", format_args!($($arg)*)) \| ^^^^^ help: to reference the macro definition's crate, use: `$crate` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#crate_in_macro_def = note: `-D clippy::crate-in-macro-def` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::crate_in_macro_def)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	150aee088d	libs: Fix clippy `len_zero` Fix `len_zero` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: length comparison to zero --> mem-agent/src/memcg.rs:225:61 \| 225 \| let (keep, moved) = vec.drain(..).partition(\|c\| c.numa_id.len() > 0); \| ^^^^^^^^^^^^^^^^^^^ help: using `!is_empty` is clearer and more explicit: `!c.numa_id.is_empty()` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#len_zero ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	1a0935d35c	libs: Fix clippy `bool_assert_comparison` Fix `bool_assert_comparison` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: used `assert_eq!` with a literal bool --> mem-agent/src/memcg.rs:1378:9 \| 1378 \| assert_eq!(m.get_timeout_list().len() > 0, true); \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#bool_assert_comparison = note: `-D clippy::bool-assert-comparison` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::bool_assert_comparison)]` help: replace it with `assert!(..)` \| 1378 - assert_eq!(m.get_timeout_list().len() > 0, true); 1378 + assert!(m.get_timeout_list().len() > 0); \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	75171b0cb7	libs: Fix clippy `collapsible_else_if` Fix `collapsible_else_if` clippy warning as suggested by rust 1.85.1, since `mem-agent` is now a member of `libs` workspace. ```console error: this `else { if .. }` block can be collapsed --> mem-agent/src/agent.rs:205:16 \| 205 \| } else { \| ________________^ 206 \| \| if mas.refresh() { 207 \| \| continue; 208 \| \| } 209 \| \| } \| \|_________^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#collapsible_else_if = note: `-D clippy::collapsible-else-if` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::collapsible_else_if)]` help: collapse nested if block \| 205 ~ } else if mas.refresh() { 206 + continue; 207 + } \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	f605097daa	libs: Make `mem-agent` a member of `libs` workspace Add `mem-agent` to `libs` workspace and sort the members list. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	7bb28d8da7	libs: Move `mem-agent` into `src/libs` `mem-agent` now does not ship example binaries and serves as a library for `agent` to reference, so we move it into `libs` to better manage it. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	f0e223c535	mem-agent: Rename `mem-agent-lib` to `mem-agent` Rename `mem-agent-lib` to `mem-agent` before we move it into `src/libs`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Dan Mihai	d7176ffcc8	tests: k8s-sandbox-vcpus-allocation generated policy Auto-generate policy for k8s-sandbox-vcpus-allocation.bats. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-10-21 21:36:49 +00:00
Dan Mihai	25299bc2a9	tests: k8s-block-volume.bats generated policy Auto-generate policy for k8s-block-volume.bats. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-10-21 21:36:40 +00:00
Dan Mihai	02a8ec0f63	tests: k8s-measured-rootfs auto generated policy Generate Agent Policy for the pod from k8s-measured-rootfs.bats. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-10-21 21:36:27 +00:00
Zvonko Kaiser	1ff8b066c6	Merge pull request #11941 from fidencio/topic/kata-deploy-add-missing-helm-docs helm: Add missing documentation	2025-10-21 16:04:55 -04:00
Dan Mihai	ebaecbd3d6	Merge pull request #11949 from microsoft/danmihai1/optional-secret-volume genpolicy: allow optional secret volumes	2025-10-21 12:27:13 -07:00
Aurélien Bombo	d01fa478ad	Merge pull request #11948 from kata-containers/sprt/fix-go-download tests: Install Go from reliable mirror	2025-10-21 14:00:09 -05:00
Aurélien Bombo	89e976e413	Merge pull request #11955 from kata-containers/sprt/refresh-oidc-before-delete ci: Always refresh OIDC token before cluster deletion	2025-10-21 13:52:24 -05:00
Dan Mihai	f11853ab33	tests: k8s-optional-empty-secret.bats policy Auto-generate policy in k8s-optional-empty-secret.bats, now that genpolicy supports optional secret-based volumes. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-10-21 15:27:31 +00:00
Dan Mihai	346e1c1db6	genpolicy: allow optional secret volumes Don't reject Secret volumes defined as optional during policy generation. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-10-21 15:27:31 +00:00
Aurélien Bombo	785afb1dec	Merge pull request #11885 from kata-containers/sprt/block-dev-hostpath docs: Document behavior of `BlockDevice` hostPath, procfs, and sysfs mounts	2025-10-21 09:38:27 -05:00
Aurélien Bombo	b7f542443e	ci: Always refresh OIDC token before cluster deletion This forces OIDC token refresh even if the tests step failed, so that we also have proper credentials to delete the cluster in that case. I first noticed the original issue here: https://github.com/kata-containers/kata-containers/actions/runs/18659064688/job/53215379040?pr=11950 Fixes: #11953 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-21 09:35:52 -05:00
Fabiano Fidêncio	552378cf1e	helm: Add missing documentation We've recently added support for: * deploying and setting up a snapshotter, via _experimentalSetupSnapshotter * enabling experimental_force_guest_pull, via _experimentalForceGuestPull However, we never updated the documentation for those, thus let's do it now. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-21 16:20:21 +02:00
Greg Kurz	43455774ce	Merge pull request #11939 from ldoktor/ocp-helm-sudo ci.ocp: Install helm in local dir	2025-10-21 16:12:41 +02:00
Aurélien Bombo	93eef5b253	docs: Document behavior of procfs and sysfs mounts The claims in the doc come from #808 and #886. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-21 08:50:06 -05:00
Aurélien Bombo	033299e46d	docs: Document behavior of BlockDevice hostPath volumes This is a follow-up to #11832. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-21 08:50:06 -05:00
Aurélien Bombo	22aa27ff5e	tests: Install Go from reliable mirror Downloading Go from storage.googleapis.com fails intermittently with a 403 (see error below) so we switch to go.dev as referenced at https://go.dev/dl/. /tmp/install-go-tmp.Rw5Q4thEWr ~/work/kata-containers/kata-containers /usr/bin/go [install_go.sh:85] INFO: removing go version go1.24.9 linux/amd64 [install_go.sh:94] INFO: Download go version 1.24.6 % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0 100 298 100 298 0 0 2610 0 --:--:-- --:--:-- --:--:-- 2614 [install_go.sh:97] INFO: Install go gzip: stdin: not in gzip format tar: Child returned status 1 tar: Error is not recoverable: exiting now [install_go.sh:99] ERROR: sudo tar -C /usr/local/ -xzf go1.24.6.linux-amd64.tar.gz https://github.com/kata-containers/kata-containers/actions/runs/18602801597/job/53045072109?pr=11947#step:5:17 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-21 08:47:41 -05:00
Manuel Huber	af34308c83	gpu: remove version suffixes for imex and nscq This change ensures that the NVIDIA package repository is used as the source for nvidia-imex and libnvidia-nscq. The NVIDIA repository does not publish these packages with a -580 version suffix, which made us fall back to the packages from the Ubuntu repository. These two packages were recently updated by Ubuntu to depend on nvidia-kernel-common-580-server (this happened from version 580.82.07-0ubuntu1 to version 580.95.05-0ubuntu1). This conflicts with nvidia-kernel-common-580, which gets installed by nvidia-headless-no-dkms-580-open, thus causing a build failure. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-10-21 15:42:51 +02:00
Lukáš Doktor	5038578fba	ci.ocp: Install helm in local dir In CI, helm is not yet installed and we don't have root access. Let's use the current dir, which should be writable, and the --no-sudo option to install it. Note that when helm is already installed, this should not change anything and should simply use the system-wide installation. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-10-21 06:28:36 +02:00
Steve Horsman	947862f804	Merge pull request #11904 from manuelh-dev/mahuber/conf-rootfs-nv-guest-pull gpu: nvidia rootfs build with guest pull support	2025-10-17 16:08:05 +01:00
Steve Horsman	94b6a1d43e	Merge pull request #10664 from kevinzs2048/add-cca runtime-go \| kata-deploy: Add Arm CCA confidential Guest Support	2025-10-17 14:38:34 +01:00
Manuel Huber	4ad8c31b5a	gpu: build nv rootfs with guest pull support While the local-build folder's Makefile dependencies for the confidential nvidia rootfs targets already declare the pause image and coco-guest-components dependencies, the actual rootfs composition does not contain the pause image bundle and relevant certificates for guest pull. This change ensures the rootfs gets composed with the relevant files. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-10-16 09:20:49 -07:00
Aurélien Bombo	edbb4b633c	Merge pull request #11890 from microsoft/saulparedes/optional_initdata genpolicy: take path to initdata from command line if provided	2025-10-16 11:04:57 -05:00
Markus Rudy	d5cb9764fd	kata-types: use pretty TOML encoder for initdata TOML was chosen for initdata particularly for the ability to include policy docs and other configuration files without mangling them. The default TOML encoding renders string values as single-line, double-quoted strings, effectively depriving us of this feature. This commit changes the encoding to use `to_string_pretty`, and includes a test that verifies the desirable aspect of encoding: newlines are kept verbatim. Fixes: #11943 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-10-16 12:08:18 +02:00
Kevin Zhao	141070b388	Kata-deploy: Add kata-deploy set up for qemu-cca Support launching qemu-cca in Kata-deploy. Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-10-16 17:24:52 +08:00
Kevin Zhao	af919686ab	Kata-deploy: Add CCA firmware build support runtime: pass firmware to CCA Realm Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-10-16 17:24:45 +08:00
Kevin Zhao	16e91bfb21	kata-deploy: Add support for Arm CCA Qemu build The Qemu support is picked up from: https://git.codelinaro.org/linaro/dcap/qemu.git, branch: cca/2025-04-16 For more information on developing and testing the CCA software stack, see: https://linaro.atlassian.net/wiki/spaces/QEMU/pages/29051027459/Building+an+RME+stack+for+QEMU Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-10-16 17:24:08 +08:00
Seunguk Shin	c7d5f207f1	kata-deploy: support build confidential rootfs and initrd for CCA Also add cca-attester for coco-guest-component Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org> Co-authored-by: Seunguk Shin <seunguk.shin@arm.com>	2025-10-16 17:24:03 +08:00
Seunguk Shin	40dac78412	kata-deploy: support build confidential kernel and shim-v2 for CCA With Arm CCA support, building the runtime relies on the kernel's kvm.h headers. The required kernel headers are currently newer than the traditionally available ones, so we build the kernel headers first and then inject them into the shim-v2 build container. Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org> Co-authored-by: Seunguk Shin <seunguk.shin@arm.com>	2025-10-16 17:23:58 +08:00
Kevin Zhao	bfa7f2486d	runtime: Add Arm64 CCA confidential Guest Support This commit adds support for Arm CCA/RME to the golang runtime. The guest kernel has been supported since Linux 6.13. The host kernel on which Kata runs is picked from: https://gitlab.arm.com/linux-arm/linux-cca branch: cca-host/v8, which is currently very stable, has been under review for a while, and is expected to be merged this year. The Qemu support is picked up from: https://git.codelinaro.org/linaro/dcap/qemu.git, branch: cca/2025-05-28. The Qemu support will be merged upstream after CCA host support is officially available in the Linux kernel. For more information on developing and testing the CCA software stack, see: https://linaro.atlassian.net/wiki/spaces/QEMU/pages/29051027459/Building+an+RME+stack+for+QEMU Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-10-16 17:23:54 +08:00
stevenhorsman	9b086376a4	tests/k8s: Skip initdata tests on tdx The new initdata variants of the tests are failing on the tdx runner, so as discussed, skip them for now: Issue #11945 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-10-15 14:52:08 +01:00
stevenhorsman	09149407fd	tests/k8s: Delete k8s-initdata.bats Now that we have wider coverage of initdata testing in k8s-guest-pull-image-signature.bats, remove the old tests. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-10-15 14:52:08 +01:00
stevenhorsman	bdc0a3cf19	tests/k8s: Add initdata variant of registry creds tests Our current set of authenticated registry tests involves setting kernel_params to configure the image pull process, but as of kata-containers#11197 this approach is not the main way to set this configuration and the agent config has been removed. Instead we should set the configuration in the `cdh.toml` part of the initdata, so add new test cases for this. In future, when we have been through the deprecation process, we should remove the old tests. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-10-15 14:52:08 +01:00
stevenhorsman	7fbbd170ee	tests/k8s: Add initdata variants of oci signature tests Our current set of signature tests involves setting kernel_parameters to configure the image pull process, but as of https://github.com/kata-containers/kata-containers/pull/11197 this approach is not the main way to set this configuration and the agent config has been removed. Instead we should set the configuration in the `cdh.toml` part of the initdata, so add new test cases for this. In future, when we have been through the deprecation process, we should remove the old tests. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-10-15 14:52:08 +01:00
stevenhorsman	90ad5cd884	tests/k8s: Refactor initdata annotation Create a shared get_initdata method that injects a cdh image section, so we don't duplicate the initdata structure everywhere Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-10-15 14:52:08 +01:00
Fabiano Fidêncio	aa7e46b5ed	tests: Check the multi-snapshotter situation on containerd One problem that we've been having for a reasonable amount of time is containerd not behaving very well when we have multiple snapshotters. Although I'm adding this test with my "CoCo" hat in mind, the issue can happen easily with any other case that requires a different snapshotter (such as, for instance, firecracker + devmapper). With this in mind, let's do some stability tests, checking every hour a simple case of running a few pre-defined containers with runc, and then running the same containers with kata. This should be enough to put us in the situation where containerd gets confused about which snapshotter owns the image layers, and breaks on us (or does not break, showing us that this has been solved ...). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-15 13:35:43 +02:00
Manuel Huber	8221361915	gpu: Use variable to differentiate rootfs variants With this change we namespace the stage one rootfs tarball name and use the same name across all uses. This will help overcome several subtle local build problems. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-10-15 12:39:44 +02:00
Hyounggyu Choi	88c333f2a6	agent: Fix race in tests calling LinuxContainer::new() We fix the following error: ``` thread 'sandbox::tests::add_and_get_container' panicked at src/sandbox.rs:901:10: called `Result::unwrap()` on an `Err` value: Create cgroupfs manager Caused by: 0: fs error caused by: Os { code: 17, kind: AlreadyExists, message: "File exists" } 1: File exists (os error 17) note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace ``` by ensuring that the cgroup path is unique for tests run in the same millisecond. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-10-15 11:32:22 +02:00
Hyounggyu Choi	8412af919d	agent/netlink: Attempt to fix ARP and routes tests test_add_one_arp_neighbor ========================= We attempt to fix the following error: ``` thread 'netlink::tests::test_add_one_arp_neighbor' panicked at src/netlink.rs:1163:9: assertion `left == right` failed left: "" right: "192.0.2.127 lladdr 6a:92:3a:59:70:aa PERMANENT" ``` by adding a sleep to prepare_env_for_test_add_one_arp_neighbor() to wait for the kernel interfaces to settle. list_routes =========== We attempt to fix the following error (notice that the available devices contain "dummy_for_arp"): ``` thread 'netlink::tests::list_routes' panicked at src/netlink.rs:986:14: Failed to list routes: available devices: [Interface { device: "", name: "lo", IPAddresses: [IPAddress { family: v6, address: "127.0.0.1", mask: "8", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, IPAddress { family: v6, address: "169.254.1.1", mask: "31", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, IPAddress { family: v4, address: "2001:db8:85a3::8a2e:370:7334", mask: "128", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, IPAddress { family: v4, address: "::1", mask: "128", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }], mtu: 65536, hwAddr: "00:00:00:00:00:00", devicePath: "", type_: "", raw_flags: 0, special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, Interface { device: "", name: "enc0", IPAddresses: [IPAddress { family: v6, address: "10.249.65.4", mask: "24", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, IPAddress { family: v4, address: "fe80::4ff:fe57:b3e4", mask: "64", 
special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }], mtu: 1500, hwAddr: "02:00:04:57:B3:E4", devicePath: "", type_: "", raw_flags: 0, special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, Interface { device: "", name: "docker0", IPAddresses: [IPAddress { family: v6, address: "172.17.0.1", mask: "16", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, IPAddress { family: v4, address: "fe80::42:56ff:fe5c:d9f9", mask: "64", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }], mtu: 1500, hwAddr: "02:42:56:5C:D9:F9", devicePath: "", type_: "", raw_flags: 0, special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, Interface { device: "", name: "dummy_for_arp", IPAddresses: [IPAddress { family: v6, address: "192.0.2.2", mask: "24", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, IPAddress { family: v4, address: "fe80::f4f2:64ff:fe46:2b01", mask: "64", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }], mtu: 1500, hwAddr: "4A:73:DE:A3:07:64", devicePath: "", type_: "", raw_flags: 0, special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }] Caused by: 0: error looking up device 19888 1: Received a netlink error message No such device (os error 19) ``` by calling clean_env_for_test_add_one_arp_neighbor() at the start of the test. However this fix is uncertain: the original assumption for the fix was that the "dummy_for_arp" interface left over from test_add_one_arp_neighbor was the cause of the error. 
But (3) below shows that running list_routes in isolation while that interface is present is NOT enough to repro the error: 1. Running all tests + no clean_env in list_routes => list_routes FAILS (before this PR) 2. Running all tests + clean_env in list_routes => list_routes PASSES (after this PR) 3. Running only list_routes + dummy_for_arp present => list_routes PASSES (manual test, see below) ``` $ ip a l 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet 169.254.1.1/31 brd 169.254.1.1 scope global lo valid_lft forever preferred_lft forever inet6 2001:db8:85a3::8a2e:370:7334/128 scope global valid_lft forever preferred_lft forever inet6 ::1/128 scope host noprefixroute valid_lft forever preferred_lft forever 2: enc0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 02:00:01:02:e2:47 brd ff:ff:ff:ff:ff:ff inet 10.240.64.4/24 metric 100 brd 10.240.64.255 scope global dynamic enc0 valid_lft 159sec preferred_lft 159sec inet6 fe80::1ff:fe02:e247/64 scope link valid_lft forever preferred_lft forever 311: dummy_for_arp: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000 link/ether ee:79:66:3a:dc:bc brd ff:ff:ff:ff:ff:ff inet 192.0.2.2/24 scope global dummy_for_arp valid_lft forever preferred_lft forever inet6 fe80::4c2e:83ff:fe7d:ef00/64 scope link valid_lft forever preferred_lft forever $ sudo -E PATH=$PATH make test ../../utils.mk:162: "WARNING: s390x-unknown-linux-musl target is unavailable" Finished `test` profile [unoptimized + debuginfo] target(s) in 0.25s Running unittests src/main.rs (target/s390x-unknown-linux-gnu/debug/deps/kata_agent-b2b5b200deca712e) running 1 test test netlink::tests::list_routes ... ok test result: ok. 
1 passed; 0 failed; 0 ignored; 0 measured; 224 filtered out; finished in 0.00s ``` Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-10-15 11:32:22 +02:00
Paul Meyer	06ed957a45	virtcontainers: fix nydus cleanup on rootfs unmount This was discovered by @sprt in https://github.com/kata-containers/kata-containers/pull/10243#discussion_r2373709407. Checking for state.Fstype makes no sense as we know it is empty. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-10-15 09:22:51 +02:00
Zvonko Kaiser	10f8ec0c20	cdi: Add Crate remove Github Hash Use CDI exclusively from crates.io and not from a GH repository. Cargo can easily check if a new version is available and we can bump it far more easily if needed. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-15 09:22:20 +02:00
Greg Kurz	3507b2038e	Merge pull request #11936 from ldoktor/ocp-helm ci.ocp: Use helm to install kata	2025-10-14 18:22:28 +02:00
Lukáš Doktor	bdb0afc4e0	ci.ocp: Fix incorrectly quoted argument With the shellcheck fixes we accidentally quoted the "-n NAMESPACE" argument where we should have used an array instead, which led to oc treating it as a pod name and returning an error. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-10-14 17:59:33 +02:00
Lukáš Doktor	f891f340bc	ci.ocp: Use helm to install kata which is the current supported way to deploy kata-containers directly. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-10-14 17:59:33 +02:00
Aurélien Bombo	0c6fcde198	Merge pull request #11918 from fidencio/topic/builds-qemu-use-liburing-newer-than-2.2 builds: qemu: Use a liburing newer than 2.2	2025-10-14 10:17:16 -05:00
Steve Horsman	363701d767	Merge pull request #11915 from stevenhorsman/ibm-runner-followups-part-i ci: Add protobuf-compiler dependencies	2025-10-14 13:28:45 +01:00
Fabiano Fidêncio	2ad81c4797	build: qemu: Fix cache logic We need to ensure that any change on the Dockerfile (and its dir) leads to the build being retriggered, rather than using the cached version. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-14 12:17:43 +02:00
Fabiano Fidêncio	2f73e34e33	builds: qemu: Use a liburing newer than 2.2 Due to a potential regression introduced by: `984a32f17e (565f3835aaed6321caab4f7c4f8560a687f6000b_379_386)` Reported-by: Aurélien Bombo <abombo@microsoft.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-14 12:17:28 +02:00
stevenhorsman	8ce714cf97	ci: Add protobuf-compiler dependencies We are seeing more protoc related failures on the new runners, so try adding the protobuf-compiler dependency to these steps to see if it helps. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-10-14 10:58:58 +01:00
Fabiano Fidêncio	b0b0038689	versions: Bump QEMU to 10.1.1 QEMU 10.1.1 was released on October 8th, 2025, let's bump it on our side. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-13 23:52:01 +02:00
Fabiano Fidêncio	d46474cfc0	tests: Run apt-get update before installing a package Otherwise it'll just break. :-) Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-13 23:33:46 +02:00
Saul Paredes	ba7a5953c8	tests: k8s-policy-pod.bats: test unspecified initdata path use auto_generate_policy_no_added_flags, so we don't pass --initdata-path to genpolicy Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-10-13 10:47:53 -07:00
Saul Paredes	395f237fc2	tests: k8s: use default-initdata.toml when auto-generating policy - copy default-initdata.toml in create_tmp_policy_settings_dir, so it can be modified by other tests if needed - make auto_generate_policy use default-initdata.toml by default - add auto_generate_policy_no_added_flags, so it may be used by tests that don't want to use default-initdata.toml by default Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-10-13 10:47:53 -07:00
Saul Paredes	dfd269eb87	genpolicy: take path to initdata from command line if provided Otherwise use default initdata. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-10-13 10:47:53 -07:00
Fabiano Fidêncio	fb43d3419f	build: Fix nvidia kernel breakage On commit `9602ba6ccc`, from February this year, we've introduced a check to ensure that the files needed for signing the kernel build are present. However, we've noticed last week that there were a reasonable amount of wrong assumptions with the workflow. :-) Zvonko fixed the majority of those, but this bit was left and it'd cause breakages when using kernel that was cached ... although passing when building new kernels. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-13 19:28:40 +02:00
Fupan Li	8b06f3d95d	Merge pull request #11905 from Apokleos/coldplug-scsidev runtime-rs: Support virtio-scsi for initdata within non-TEE	2025-10-11 16:11:39 +08:00
Xuewei Niu	5acb6d8e13	Merge pull request #11863 from lifupan/fupan_blk_remove runtime-rs: add the block device hot unplug for clh	2025-10-11 10:31:48 +08:00
Aurélien Bombo	ff973a95c8	Merge pull request #11916 from zvonkok/fix-kernel-module-signing gpu: Fix kernel module signing	2025-10-10 17:17:08 -05:00
Zvonko Kaiser	b00013c717	kernel: Add KBUILD_SIGN_PIN pass through This is needed so the kernel setup picks up the correct config values from our fragments directories. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-10 15:45:34 -04:00
Zvonko Kaiser	37bd5e3c9d	gpu: Add kernel CONFIG check We need to make sure that the kernel we're using has the correct configs set, otherwise the module signing will not work. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-10 15:45:34 -04:00
Fabiano Fidêncio	e782d1ad50	ci: k8s: Test experimental_force_guest_pull Now that we have added the ability to deploy kata-containers with experimental_force_guest_pull configured, let's make sure we test it to avoid any kind of regressions. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-10 20:08:10 +02:00
Fabiano Fidêncio	1bc89d09ae	tests: Consider SNAPSHOTTER in the cluster name Otherwise we have no way to differentiate running tests on qemu-coco-dev with different snapshotters. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-10 20:08:10 +02:00
Fabiano Fidêncio	496e255ea2	build: Fix KBUILD_SIGN_PIN usage What was done in the past, trying to set the env var on the same step it'd be used, simply does not work. Instead, we need to properly set it through the `env` set up, as done now. We're also bumping the kata_config_version to ensure we retrigger the kernel builds. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-10 15:25:10 +02:00
Paul Meyer	5ae891ab46	versions: bump opa 1.6.0 -> 1.9.0 Bumping opa to latest release. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-10-10 10:58:51 +02:00
Steve Horsman	a570fdc0fd	Merge pull request #11909 from kata-containers/ibm-runners-test ci: Enable new ibm runners	2025-10-10 09:42:53 +01:00
stevenhorsman	8dcd91cf5f	ci: Enable new ibm runners We have some scalable s390x and ppc runners, so start to use them for build and test, to improve the throughput of our CI Signed-off-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-10-10 09:42:06 +01:00
Fabiano Fidêncio	06a3bbdd44	ci: k8s: coco: Add "Report tests" step For some reason we didn't have the "Report tests" step as part of the TEE jobs. This step immensely helps to check which tests are failing and why, so let's add it while touching the workflow. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-10 09:51:59 +02:00
Fabiano Fidêncio	a1f90fe350	tests: k8s: Unify k8s TEE tests There's no reason to have the code duplication between the SNP / TDX tests for CoCo, as those are basically using the same configuration nowadays. Note that for the TEEs case, as the nydus-snapshotter is deployed by the admin, once, instead of deploying it on every run ... I'm actually removing the nydus-snapshotter steps so we make it clear that those steps are not performed by the CI. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-10 09:51:59 +02:00
Alex Lyn	4c386b51d9	runtime-rs: Add support for handling virtio-scsi devices As virtio-scsi has been set as the default block device driver, the runtime also needs to handle the virtio-scsi info correctly, especially the SCSI address required by the kata-agent handling logic. Getting the scsi_addr and assigning it to the kata agent device id is enough, and this commit does just that. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-10-10 11:31:04 +08:00
Fupan Li	4002a91452	runtime-rs: add the block device hot unplug for clh Since runtime-rs supports block device hotplug when creating new containers, and the device should also be removed when the container stops, add block device unplug for clh. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-10-10 10:02:12 +08:00
Zvonko Kaiser	afbec780a9	Merge pull request #11903 from zvonkok/ppcie gpu: PPCIE support DGX like systems	2025-10-09 21:06:41 -04:00
Aurélien Bombo	a3a45429f6	Merge pull request #11865 from microsoft/danmihai1/nested-configmap-secret tests: k8s-nested-configmap-secret policy	2025-10-09 11:33:50 -05:00
Alex Lyn	b42ef09ffb	Merge pull request #11888 from spuzirev/main runtime: fix "num-queues expects uint64" error with virtio-blk	2025-10-09 20:21:32 +08:00
Xuewei Niu	2a43bf37ed	Merge pull request #11894 from M-Phansa/main runtime: fix device typo	2025-10-09 16:53:40 +08:00
Alex Lyn	a54d95966b	runtime-rs: Support virtio-scsi for initdata within non-TEE This commit introduces support for selecting `virtio-scsi` as the block device driver for QEMU during initial setup. The primary goal is to resolve a conflict in non-TEE environments: 1. The global block device configuration defaults to `virtio-scsi`. 2. The `initdata` device driver was previously designed and hardcoded to `virtio-blk-pci`. 3. This conflict prevented unified block device usage. By allowing `virtio-scsi` to be configured at cold boot, the `initdata` device can now correctly adhere to the global setting, eliminating the need for a hardcoded driver and ensuring consistent block device configuration across all supported devices (excluding rootfs). Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-10-09 15:52:33 +08:00
Xuewei Niu	5208ee4ec0	Merge pull request #11674 from was-saw/dragonball_seccomp runtime-rs: add seccomp support for dragonball	2025-10-09 15:01:15 +08:00
wangxinge	8e1b33cc14	docs: add document for seccomp This commit adds a document to use seccomp in runtime-rs Signed-off-by: wangxinge <wangxinge@bupt.edu.cn>	2025-10-09 13:25:17 +08:00
wangxinge	2abf6965ff	dragonball: add seccomp support for dragonball This commit modifies seccomp framework to support different restrictions for different threads. Signed-off-by: wangxinge <wangxinge@bupt.edu.cn>	2025-10-09 13:25:17 +08:00
wangxinge	bb6fb8ff39	runtime-rs: add seccomp support for dragonball The implementation of the seccomp feature in Dragonball currently has a basic framework. But the actual restriction rules are empty. This pull request includes the following changes: - Modify configuration files to relevant configuration files. - Modify seccomp framework to support different restrictions for different threads. - Add new seccomp rules for the modified framework. This commit primarily implements the changes 1 and 3 for runtime-rs. Fixes: #11673 Signed-off-by: wangxinge <wangxinge@bupt.edu.cn>	2025-10-09 13:25:17 +08:00
Zvonko Kaiser	91739d4425	gpu: PPCIE support DGX like systems For DGX like systems we need additional binaries and libraries, enable the Kata AND CoCo use-case. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Update tools/osbuilder/rootfs-builder/nvidia/nvidia_rootfs.sh Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-10-09 00:00:12 +00:00
Dan Mihai	364d3cded0	tests: k8s-nested-configmap-secret policy Add auto-generated agent policy in k8s-nested-configmap-secret.bats. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-10-08 23:37:54 +00:00
Sergei Puzyrev	62b12953c7	runtime: fix "num-queues expects uint64" error with virtio-blk Unneeded type-conversion was removed. Fixes #11887 Signed-off-by: Sergei Puzyrev <spuzirev@gmail.com>	2025-10-08 17:09:22 -05:00
Adeet Phanse	4e4f9c44ae	runtime: fix device typo Fix device typo in dragonball / runtime-rs / runtime. Signed-off-by: Adeet Phanse <adeet.phanse@mongodb.com>	2025-10-08 17:08:27 -05:00
Aurélien Bombo	d954932876	Merge pull request #11883 from kata-containers/sprt/zizmor-fixes3 ci: zizmor: Address all issues	2025-10-08 17:01:48 -05:00
Aurélien Bombo	07645cf58b	ci: actionlint: Address issues and set as required Address issues just introduced and set actionlint as a required by removing the path filter. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-08 16:55:27 -05:00
Aurélien Bombo	b3a551d438	ci: zizmor: Reestablish as required test We can re-require this now that we've addressed all the issues. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-08 16:55:27 -05:00
Aurélien Bombo	5a4ddb8c71	ci: zizmor: Fix all `template-injection` alerts Fix all instances of template injection by using environment variables as recommended by Zizmor, instead of directly injecting values into the commands. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-08 16:55:26 -05:00
Aurélien Bombo	7b203d1b43	ci: zizmor: Ignore `dangerous-triggers` audit for known safe usage The two ignored cases are strictly necessary for the CI to work today, and we have various security mitigations in place. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-08 16:55:08 -05:00
Aurélien Bombo	7afdfc7388	ci: zizmor: Disable `undocumented-permissions` audit There are 62 such warnings and addressing them would take quite a bit of time so just disable them for now. help[undocumented-permissions]: permissions without explanatory comments --> ./.github/workflows/release.yaml:71:7 \| 71 \| packages: write \| ^^^^^^^^^^^^^^^ needs an explanatory comment 72 \| id-token: write \| ^^^^^^^^^^^^^^^ needs an explanatory comment 73 \| attestations: write \| ^^^^^^^^^^^^^^^^^^^ needs an explanatory comment \| = note: audit confidence → High Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-08 16:55:08 -05:00
Aurélien Bombo	889ba0d5db	Merge pull request #11901 from kata-containers/sprt/remove-docs-url-check gha: Fix `docs-url-alive-check` workflow	2025-10-08 14:42:58 -05:00
Aurélien Bombo	ec81ea95df	gha: Add `workflow_dispatch` trigger to `docs-url-alive-check` We can't test this PR because the workflow needs this trigger, so adding this will allow testing future PRs. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-08 14:39:34 -05:00
Aurélien Bombo	4d760e64ae	gha: Fix docs-url-alive-check workflow The Go installation step was broken because the checkout action was checking out the code in a subdirectory: https://github.com/kata-containers/kata-containers/actions/runs/18265538456/job/51999316919 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-08 14:39:34 -05:00
Aurélien Bombo	476c827fca	Merge pull request #11878 from kata-containers/sprt/privileged-docs docs: Document `privileged_without_host_devices=false` as unsupported	2025-10-08 11:12:45 -05:00
Fabiano Fidêncio	dbb1eb959c	kata-deploy: Allow users to set experimental_force_guest_pull For those who are not willing to use the nydus-snapshotter for pulling the image inside the guest, let's allow them to set experimental_force_guest_pull, introduced by Edgeless, as part of our helm-chart. This option can be set as: _experimentalForceGuestPull: "qemu-tdx,qemu-coco-dev" Which would then ensure that the configuration for `qemu-tdx` and `qemu-coco-dev` would have the option enabled. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-08 17:43:09 +02:00
Fabiano Fidêncio	8c4bad68a8	kata-deploy: Remove kustomize yamls, rely on helm-chart only As the kata-deploy helm chart has been the only way we've been testing kata-containers deployment as part of our CI, it's time to finally get rid of the kustomize yamls and avoid us having to maintain two different methods (with one of those not being tested). Here I removed: * kata-deploy yamls and kustomize yamls * kata-cleanup yamls and kustomize yamls * kata-rbac yamls and kustomize yamls * README.md for the kustomize yamls was removed Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-08 16:54:19 +02:00
Fabiano Fidêncio	3418cedacc	ci: Add tests for erofs-snapshotter (for coco-qemu-dev) erofs-snapshotter can be used to leverage sharing the image from the host to the guest without the need of a shared filesystem (such as virtio-fs or virtio-9p). This case is ideal for Confidential Computing enabled on Kata Containers, and we can immensely benefit from this snapshotter, thus let's test it as soon as possible so we can find issues, report bugs, and ask for enhancement requests. There are at least a few things that we know for sure to be problematic now: * Policy has to be adjusted to the erofs-snapshotter * There is no support for signed nor encrypted images * Tests that use the KBS are disabled for now Even with the limitations, I do believe we should be testing the snapshotter, so we can team up and get those limitations addressed. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-08 10:34:09 +02:00
Fabiano Fidêncio	544f688104	tests: Add ability to deploy vanilla k8s with erofs As done in the previous commit, let's expand the vanilla k8s deployment to also allow the erofs host side configuration. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-08 10:34:09 +02:00
Fabiano Fidêncio	3ac6579ca6	tests: Add support for deploying vanilla k8s We already have support for deploying a few flavours of k8s that are required for different tests we perform. Let's also add the ability to deploy vanilla k8s, as that will be very useful in the next commits in this series. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-08 10:34:09 +02:00
Fabiano Fidêncio	aa9e3fc3d5	versions: Update containerd active / latest versions The active version is 2.1.x, and the latest is 2.2.0-beta.0. The latest is what we'll be using to test if the "to be released" version of containerd works well for our use-cases. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-08 10:34:09 +02:00
Fabiano Fidêncio	287db1865f	tests: Relax regex used to install containerd Let's make sure that we can get non-official releases as well, otherwise we won't be able to test a coming release of containerd, to know whether it solves issues that we face or not, before it's actually released. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-08 10:34:09 +02:00
Zvonko Kaiser	59b4e3d3f8	gpu: Add CONFIG_FW_LOADER to the kernel We need it for the newer CC kernel Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-08 10:01:27 +02:00
Zvonko Kaiser	7061f64db5	gpu: Fix confidential build NVRC introduced the confidential feature flag and we haven't updated the rootfs build to accommodate it. If rootfs_type==confidential, use --feature=confidential Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-08 10:01:27 +02:00
Zvonko Kaiser	2260f66339	gpu: Some fixes regarding the rootfs v580 With the 580 driver version we need new dependencies in the rootfs. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-08 10:01:27 +02:00
Dan Mihai	08272ab673	Merge pull request #11884 from kata-containers/sprt/priv-test tests/k8s: Add test for privileged containers	2025-10-07 19:18:06 -07:00
Szymon Klimek	8dc6b24e7d	kata-deploy: accept 25.10 as supported distro for TDX Canonical TDX release is not needed for vanilla Ubuntu 25.10 but GRUB_CMDLINE_LINUX_DEFAULT needs to contain `nohibernate` and `kvm_intel.tdx=1` Signed-off-by: Szymon Klimek <szymon.klimek@intel.com>	2025-10-07 23:41:52 +02:00
Dan Mihai	650863039b	tests: k8s-volume: auto-generate policy Auto-generate the agent policy, instead of using the insecure "allow all" policy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-10-07 23:35:06 +02:00
Dan Mihai	5ed76b3c91	tests: k8s-volume: retry failed exec Use grep_pod_exec_output to retry possible failing "kubectl exec" commands. Other tests have been hitting such errors during CI in the past. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-10-07 23:35:06 +02:00
Dan Mihai	6ab59453ff	genpolicy: better parsing of mount path Mount paths ending in '/' were not parsed correctly. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-10-07 23:35:06 +02:00
Dan Mihai	ba792945ef	genpolicy: additional mount_source_allows logging Make policy errors related to storage mount sources easier to debug. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-10-07 23:35:06 +02:00
Aurélien Bombo	6e451e3da0	tests/k8s: Add test for privileged containers This adds an integration test to verify that privileged containers work properly when deploying Kata with kata-deploy. This is a follow-up to #11878. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-07 09:59:05 -05:00
Fabiano Fidêncio	f994bacf6c	tests: coco: Use the new way to set up nydus snapshotter Let's rely on kata-deploy setting up the nydus snapshotter for us, instead of doing this with external code. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-07 10:32:46 +02:00
Fabiano Fidêncio	6f17125ea4	tests: Allow using the new way to deploy nydus-snapshotter This allows us to stop setting up the snapshotter ourselves, and just rely on kata-deploy to do so. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-07 10:32:46 +02:00
Fabiano Fidêncio	000c9cce23	kata-deploy: chart: Add `_experimentalSetupSnapshotter` Let's expose the EXPERIMENTAL_SETUP_SNAPSHOTTER script environment variable to our chart, allowing then users of our helm chart to take advantage of this experimental feature. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-07 10:32:46 +02:00
Fabiano Fidêncio	d6a1881b8b	kata-deploy: scripts: Allow setting up multiple snapshotters We may deploy in scenarios where we want to have both snapshotters set up, sometimes even for a simple test of which one behaves better. With this in mind, let's allow EXPERIMENTAL_SETUP_SNAPSHOTTER to receive a comma separated list of snapshotters, such as: ``` EXPERIMENTAL_SETUP_SNAPSHOTTER="erofs,nydus" ``` Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-07 10:32:46 +02:00
Fabiano Fidêncio	445af6c09b	kata-deploy: scripts: Allow deploying erofs-snapshotters Similarly to what's been done for the nydus-snapshotter, let's allow users to have erofs-snapshotter set up by simply passing: ``` EXPERIMENTAL_SETUP_SNAPSHOTTER="erofs". ``` Mind that erofs, although a built-in containerd snapshotter, has system dependencies that we will NOT install and it's up to the admin to do so. These dependencies are: * erofs-utils * fsverity * erofs module loaded Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-07 10:32:46 +02:00
Fabiano Fidêncio	4359c7b15d	tests: Ensure the nydus-snapshotter versions are aligned In the previous commit we added the assumption that the nydus-snapshotter version should be the same in two different places. Now, with this test, we ensure those will always be in sync. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-07 10:32:46 +02:00
Fabiano Fidêncio	2e0ce2f39f	kata-deploy: scripts: Allow deploying nydus-snapshotter Let's introduce a new EXPERIMENTAL_SETUP_SNAPSHOTTER environment variable that, when set, allows kata-deploy to put the nydus snapshotter in the correct place, and configure containerd accordingly. Mind, this is a stop gap till the nydus-snapshotter helm chart is ready to be used and behaving well enough to become a weak dependency of our helm chart. When that happens this code can be deleted entirely. Users can have nydus-snapshotter deployed and configured for the guest-pull use case by simply passing: ``` EXPERIMENTAL_SETUP_SNAPSHOTTER="nydus" ``` Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-07 10:32:46 +02:00
Fabiano Fidêncio	1e2c86c068	kata-deploy: scripts: Only add conf file to the imports once Otherwise we'd end up adding the file several times, which could lead to problems when removing the entry, leading to containerd not being able to start due to an import file not being present. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-07 10:32:46 +02:00
Fabiano Fidêncio	e1269afe8a	tests: Only use Authorization when GH_TOKEN is available The code, as it was, would lead to the following broken command: `--header "Authorization: Bearer: "` Let's only expand that part of the command if ${GH_TOKEN} is passed, otherwise we don't even bother adding it. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-07 10:32:46 +02:00
Dan Mihai	5e46f814dd	Merge pull request #11832 from kata-containers/sprt/dev-hostpath runtime: Simplify mounting guest devices when using hostPath volumes	2025-10-06 12:36:36 -07:00
Steve Horsman	0d58bad0fd	Merge pull request #11840 from kata-containers/dependabot/cargo/src/tools/agent-ctl/astral-tokio-tar-0.5.5 build(deps): bump astral-tokio-tar from 0.5.2 to 0.5.5 in /src/tools/agent-ctl	2025-10-06 09:35:56 +01:00
Aurélien Bombo	6ff78373cf	docs: Document `privileged_without_host_devices=false` as unsupported Document that privileged containers with privileged_without_host_devices=false are not generally supported. When you try the above, the runtime will pass all the host devices to Kata in the OCI spec, and Kata will fail to create the container for various reasons depending on the setup, e.g.: - Attempting to hotplug uninitialized loop devices. - Attempting to remount /dev devices on themselves when the agent had already created them as default devices (e.g. /dev/full). - "Conflicting device updates" errors. - And more... privileged_without_host_devices was originally created to support Kata [1][2] and lots of people are having issues when it's set to false [3]. [1] https://github.com/kata-containers/runtime/issues/1568 [2] https://github.com/containerd/cri/pull/1225 [3] https://github.com/kata-containers/kata-containers/issues?q=is%3Aissue%20%20in%3Atitle%20privileged Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-02 15:21:19 -05:00
Fabiano Fidêncio	300f7e686e	build: Fix initramfs build We have noticed in the CI that the `gen_init_cpio ...` was returning 255 and breaking the build. Why? I am not sure. When chatting with Steve, he suggested to split the command, so it'd be easier to see what's actually breaking. But guess what? There's no breakage when we split the command. So, let's try it out and see whether the CI passes after it. If someone is willing to educate us on this one, please, that would be helpful! :-) Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-02 20:58:22 +02:00
Zvonko Kaiser	2693daf503	gpu: Install dcgm export from the CUDA repo Do not use the repo to install the exporter, we rely on the version tested with Ubuntu <version> Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-02 18:05:13 +02:00
Zvonko Kaiser	56c6512781	gpu: Bump to noble and rearrange repos Moving the CUDA repo to the top for all essential packages and adding a repo priority favouring NVIDIA based repos. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-02 18:05:13 +02:00
Aurélien Bombo	eeecd6d72b	Merge pull request #11872 from kata-containers/sprt/rust-use-uninit agent/rustjail: Fix potentially uninitialized memory read in unsafe code	2025-10-02 10:39:25 -05:00
Manuel Huber	4b7c1db064	ci: Add test case for openvpn Introduce new test case which verifies that openvpn clients and servers can run as Kata pods and can successfully establish a connection. Volatile certificates and keys are generated by an initialization container and injected into the client and server containers. This scenario requires TUN/TAP support for the UVM kernel. Signed-off-by: Manuel Huber <mahuber@microsoft.com> Co-authored-by: Manuel Huber <manuelh@nvidia.com>	2025-10-02 11:40:49 +02:00
Manuel Huber	34ecb11b35	tests: ease add_allow_all_policy_to_yaml if case No need to die when a Kind that does not require a policy annotation is found in a pod manifest. Print an informational message instead. Signed-off-by: Manuel Huber <mahuber@microsoft.com>	2025-10-02 11:40:49 +02:00
Manuel Huber	e36f788570	kernel: add required configs for openvpn support Currently, use of openvpn clients/servers is not possible in Kata UVMs. The following error message can be expected: ERROR: Cannot open TUN/TAP dev /dev/net/tun: No such device (errno=19) To support openvpn scenarios using bridging and TAP, we enable various kernel networking config options. Signed-off-by: Manuel Huber <mahuber@microsoft.com>	2025-10-02 11:40:49 +02:00
Aurélien Bombo	a9fc501c08	check-spelling: Add hostPath to dictionary Manually added "hostPath" to main.txt then regenerated the dictionary with `./kata-spell-check.sh make-dict`. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-01 15:32:21 -05:00
Aurélien Bombo	c7a478662f	check-spelling: Run `make-dict` This simply ran `./kata-spell-check.sh make-dict` as documented in [1]. Unclear why it leads to changes - maybe it hadn't been run in a while. [1] https://github.com/kata-containers/kata-containers/tree/main/tests/cmd/check-spelling#create-the-master-dictionary-files Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-01 15:32:21 -05:00
Aurélien Bombo	5c21b1faf3	runtime: Simplify mounting guest devices when using hostPath volumes This change crystallizes and simplifies the current handling of /dev hostPath mounts with virtually no functional change. Before this change: - If a mount DESTINATION is in /dev and it is a non-regular file on the HOST, the shim passes the OCI bind mount as is to the guest (e.g. /dev/kmsg:/dev/kmsg). The container rightfully sees the GUEST device. - If the mount DESTINATION does not exist on the host, the shim relies on k8s/containerd to automatically create a directory (i.e. non-regular file) on the HOST. The shim then also passes the OCI bind mount as is to the guest. The container rightfully sees the GUEST device. - For other /dev mounts, the shim passes the device major/minor to the guest over virtio-fs. The container rightfully sees the GUEST device. After this change: - If a mount SOURCE is in /dev and it is a non-regular file on the HOST, the shim passes the OCI bind mount as is to the guest. The container rightfully sees the GUEST device. - The shim no longer relies on k8s/containerd to create missing mount directories. Instead it explicitly handles missing mount SOURCES, and treats them like the previous bullet point. - The shim no longer uses virtio-fs to pass /dev device major/minor to the guest, instead it passes the OCI bind mount as is. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-10-01 15:32:21 -05:00
Markus Rudy	285aaad13e	Merge pull request #11868 from burgerdev/serial-tests kata-sys-util: use a tempdir per test case	2025-10-01 14:34:18 +02:00
Markus Rudy	507a0e09f3	agent: use TEST-NET-1 addresses for netlink tests test_add_one_arp_neighbor modifies the root network namespace, so we should ensure that it does not interfere with normal network setup. Adding an IP to a device results in automatic routes, which may affect routing to non-test endpoints. Thus, we change the addresses used in the test to come from TEST-NET-1, which is designated for tests and usually not routable. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-10-01 09:00:52 +02:00
Markus Rudy	bbc006ab7c	agent: add debug info to netlink tests list_routes and test_add_one_arp_neighbor have been flaky in the past (#10856), but it's been hard to tell what exactly is going wrong. This commit adds debug information for the most likely problem in list_routes: devices being added/removed/modified concurrently. Furthermore, it adds the exit code and stderr of the ip command, in case it failed to list the ARP neighborhood. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-10-01 09:00:52 +02:00
Markus Rudy	43f6a70897	kata-sys-util: use a tempdir per test case Rust unit tests are executed concurrently [1], so sharing a directory of test files between test cases is prone to race conditions. This commit changes the pci_manager tests such that each test uses its own tempfile::tempdir, which provides nice isolation and obsoletes the need to manually clean up. [1]: https://doc.rust-lang.org/book/ch11-02-running-tests.html#running-tests-in-parallel-or-consecutively Fixes: #11852 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-10-01 09:00:52 +02:00
Aurélien Bombo	a3669d499a	agent/rustjail: Fix potentially uninitialized memory read in unsafe code The previous code only checked the result of with_nix_path(), not statfs(), thus leading to an uninitialized memory read if statfs() failed. No functional change otherwise. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-30 15:48:07 -05:00
Aurélien Bombo	20c60b21bd	Merge pull request #11839 from Sumynwa/sumsharma/agent-ctl-vm-container agent-ctl: Add fs sharing using virtio-fs when booting a pod vm.	2025-09-30 15:45:10 -05:00
Aurélien Bombo	7b2a7ca4d8	Merge pull request #11869 from burgerdev/cargo-fmt kata-sys-util: format mount.rs	2025-09-30 10:27:08 -05:00
Markus Rudy	a21a94a2e8	kata-sys-util: format mount.rs PR #11849 was merged before fixing a formatting issue. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-09-30 13:02:30 +02:00
Mikko Ylinen	6f45a7f937	runtime: config: allow TDX QGS port=0 `85f3391bc` added the support for TDX QGS port=0 but missed defaultQgsPort in the default config. defaultQgsPort overrides user provided tdx_quote_generation_service_socket_port=0. After this change, defaultQgsPort is not needed anymore since there's no default: any positive integer is OK and negative or unset value becomes a parse error. QEMUTDXQUOTEGENERATIONSERVICESOCKETPORT in the Makefile is used to provide a sane default when tdx_quote_generation_service_socket_port gets set in the configuration. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-09-30 09:47:05 +02:00
Xuewei Niu	ca11a7387d	Merge pull request #11636 from burgerdev/darwin-ci ci: add genpolicy build for Darwin	2025-09-30 13:52:39 +08:00
Aurélien Bombo	575381cb7e	Merge pull request #11846 from kata-containers/sprt/reinstate-mariner Revert "ci: temporarily avoid using the Mariner Host image"	2025-09-29 15:49:53 -05:00
Dan Mihai	4b308817bc	Merge pull request #11858 from microsoft/danmihai/policy-tests-upstream2 tests: k8s: auto-generate policy for additional tests	2025-09-29 13:39:22 -07:00
Aurélien Bombo	693a1461d2	tests: policy: Set oci_version to 1.2.0 for Mariner Mariner recently upgraded to containerd 2.0. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-29 12:14:51 -05:00
Aurélien Bombo	756f3a73df	Revert "ci: temporarily avoid using the Mariner Host image" This reverts commit `e8405590c1`. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-29 12:14:51 -05:00
Aurélien Bombo	c8fdb0e971	Merge pull request #11849 from shwetha-s-poojary/fix_ppc_mount_ut libs: Fix the test_parse_mount_options failure on ppc64le	2025-09-29 11:08:21 -05:00
Markus Rudy	369124b180	ci: build genpolicy on darwin genpolicy is a developer tool that should be usable on MacOS. Adding it to the darwin CI job ensures that it can still be built after changes. On an Apple M2, the output of `uname -m` is `arm64`, which is why a new case is needed in the arch_to_* functions. We're not going to cross-compile binaries on darwin, so don't install any additional Rust targets. Fixes: #11635 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-09-29 09:48:32 +02:00
Markus Rudy	369aed0203	kata-types: conditionally include safe-path Most of the kata-types code is reusable across platforms. However, some functions in the mount module require safe-path, which is Linux-specific and can't be used on other platforms, notably darwin. This commit adds a new feature `safe-path` to kata-types, which enables the functions that use safe-path. The Linux-only callers kata-ctl and runtime-rs enable this feature, whereas genpolicy only needs initdata and does not need the functions from the mount module. Using a feature instead of a target_os restriction ensures that the developer experience for genpolicy remains the same. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-09-29 09:48:32 +02:00
Sumedh Alok Sharma	c94e65e982	agent-ctl: Add fs sharing using virtio-fs when booting a pod vm. This commit adds changes to enable fs sharing between host/guest using virtio-fs when booting a pod VM for testing. This primarily enables sharing container rootfs for testing container lifecycle commands. Summary of changes is as below: - adds minimal virtiofsd code to start userspace daemon (based on `runtime-rs/crates/resource/src/share_fs`) - adds the virtiofs device to the test vm - prepares and mounts the container rootfs on host - modifies container storage & oci specs Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2025-09-29 07:20:42 +00:00
Markus Rudy	63515242c5	tests: fix shellcheck findings in install_rust.sh Fixing the shellcheck issues first so that they are not coupled to the subsequent commit introducing Darwin support to the script. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-09-28 12:01:23 +02:00
Zvonko Kaiser	c4e352f7ff	Merge pull request #11856 from zvonkok/gpu_guest_components gpu: Add libgcc for RUST libc=gnu builds	2025-09-26 18:27:16 -04:00
Dan Mihai	ef0f8723cf	tests: k8s-nginx-connectivity: auto-generated policy Auto-generate policy for nginx-deployment pods, instead of hard-coding the "allow all" policy. Note that the `busybox_pod` - created using `kubectl run` - still doesn't have an Init Data annotation, so it is using the default policy built into the Kata Guest rootfs image file. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-26 20:24:13 +00:00
Dan Mihai	8943f0d9b2	tests: k8s-liveness-probes: auto-generate policy Auto-generate agent policy in k8s-liveness-probes.bats, instead of using the non-confidential "allow all" policy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-26 20:23:12 +00:00
Dan Mihai	d9bc7e2b76	tests: k8s-credentials-secrets: auto-generate policy Auto-generate the agent policy for pod-secret-env.yaml, using "genpolicy -c inject_secret.yaml". Support for passing Secret specification files as "-c" arguments of genpolicy has been added when fixing #10033 with PR #10986. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-26 20:23:12 +00:00
Zvonko Kaiser	3743eb4cea	gpu: Add libgcc for RUST libc=gnu builds Since we cannot build all components with libc=musl and static RUSTFLAG we still need to ship libgcc for AA or other guest components. Without this change the guest components do not work and we see: /usr/local/bin/attestation-agent: error while loading shared libraries: libgcc_s.so.1: cannot open shared object file: No such file or directory Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-09-26 15:08:58 -04:00
Dan Mihai	32453a576f	Merge pull request #11845 from microsoft/danmihai/policy-tests-upstream tests: k8s: auto-generate policy for additional tests	2025-09-26 11:32:23 -07:00
Aurélien Bombo	f3293ed404	Merge pull request #11855 from kata-containers/sprt/zizmor-fixes2 gha: zizmor: fix "workflow or action definition without a name" error	2025-09-26 12:09:52 -05:00
Hyounggyu Choi	077aaa6480	Merge pull request #11854 from kata-containers/sprt/pipefail-lib tests/k8s: Add set -euo pipefail to lib.sh	2025-09-26 12:49:59 +02:00
Aurélien Bombo	433e59de1f	gha: zizmor: fix "workflow or action definition without a name" error This fixes that error everywhere by adding a `name:` field to all jobs that were missing it. We keep the same name as the job ID to ensure no disturbance to the required job names. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-25 23:34:40 -05:00
Aurélien Bombo	282e20bc37	tests/k8s: Add set -euo pipefail to lib.sh -o pipefail in particular ensures that exec_host() returns the right exit code. -u is also added for good measure. Note that $BATS_TEST_DIRNAME is set by bats so we move its usage inside the function. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-25 23:05:05 -05:00
Aurélien Bombo	d1f52728cc	Merge pull request #11853 from kata-containers/sprt/zizmor-fix gha: Run Zizmor without Advanced Security	2025-09-25 14:06:53 -05:00
Aurélien Bombo	0b40ad066a	gha: Set Zizmor check as non-required As a consequence of moving away from Advanced Security for Zizmor, it now checks the entire codebase and will error out on this PR and future. To be reverted once we address all Zizmor findings in a future PR. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-25 10:50:49 -05:00
Aurélien Bombo	2e033d0079	gha: Run Zizmor without Advanced Security This does not change the security of the analysis, this is just to work around zizmorcore/zizmor-action#43. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-25 10:50:41 -05:00
shwetha-s-poojary	c28ffac060	libs: Fix the test_parse_mount_options failure on ppc64le This PR fixes a test that failed on platforms like ppc64le due to a hardcoded mount option length. * Test was failing on ppc64le due to larger system page size (e.g., 65536 bytes) * Original test used a hardcoded 4097-byte string assuming 4KB page size * Replaced with MAX_MOUNT_PARAM_SIZE + 1 to reflect actual system limit Ensures test fails correctly across all architectures Fixes: #11852 Signed-off-by: shwetha-s-poojary <shwetha.s-poojary@ibm.com>	2025-09-25 19:56:51 +05:30
Greg Kurz	f6d352d088	Merge pull request #11835 from ldoktor/ocp-pp-revision ci.ocp: Avoid unsupported "git --revision"	2025-09-25 16:10:48 +02:00
Xuewei Niu	98446e7338	Merge pull request #11678 from StevenFryto/rootless_vmm runtime-rs: Add support for running the VMM in non-root mode	2025-09-25 22:03:25 +08:00
Aurélien Bombo	3ce7693a2d	Merge pull request #11851 from BbolroC/remove-comment-for-hadolint-dl3007 ci: Remove DL3007 ignore comment for base image	2025-09-25 09:03:07 -05:00
Xuewei Niu	46cbb2fb98	Merge pull request #11719 from whyeinstein/csi-kata-spdkvolume csi-kata-directvolume: Add basic SPDK volume support	2025-09-25 21:53:46 +08:00
Hyounggyu Choi	c961f70b7e	ci: Remove DL3007 ignore comment for base image The Hadolint warning DL3007 (pin the version explicitly) is no longer applicable. We have updated the base image to use a specific version digest, which satisfies the linter's requirement for reproducible builds. This commit removes the corresponding inline ignore comment. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-09-25 15:46:39 +02:00
Dan Mihai	fe5ee803a8	tests: k8s-sysctls.bats auto-generated policy Auto-generate policy in k8s-sysctls.bats, instead of hard-coding the "allow all" policy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-25 13:03:19 +00:00
Dan Mihai	9d3d3c9b0f	tests: k8s-pod-quota.bats auto-generated policy Auto-generate policy in k8s-pod-quota.bats, instead of hard-coding the "allow all" policy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-25 13:03:19 +00:00
Dan Mihai	0008ecd18b	tests: k8s-inotify.bats auto-generated policy Auto-generate policy for k8s-inotify.bats, instead of hard-coding the "allow all" policy. Fixes: #8889 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-25 13:03:19 +00:00
Dan Mihai	711e7b8014	tests: k8s-hostname.bats auto-generated policy Auto-generate policy for k8s-hostname.bats, instead of hard-coding the "allow all" policy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-25 13:03:19 +00:00
Dan Mihai	566e1abb09	tests: k8s-empty-dirs.bats generated policy Auto-generated policy for k8s-empty-dirs.bats, instead of hard-coding the "allow all" policy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-25 13:03:19 +00:00
stevenfryto	9e33888f06	runtime-rs: supporting the QEMU VMM process running in non-root mode This change enables running the QEMU VMM as a non-root user when the rootless flag is set to true in the configuration. Signed-off-by: stevenfryto <sunzitai_1832@bupt.edu.cn>	2025-09-25 19:30:29 +08:00
stevenfryto	bde6eb7c3a	runtime-rs: add generic support for running the VMM in non-root mode This commit introduces generic support for running the VMM in rootless mode in runtime-rs: 1. Detect whether the VMM is running in rootless mode. 2. Before starting the VMM process, create a non-root user and launch the VMM with that user’s UID and GID; also add the KVM user's group ID to the VMM process's supplementary groups so the VMM process can access /dev/kvm. 3. Set up the rootless directory under /run/user/<uid>, and turn some path variables into functions that return the path with the rootless directory prefix when running in rootless mode. Fixes: #11414 Signed-off-by: stevenfryto <sunzitai_1832@bupt.edu.cn>	2025-09-25 19:30:29 +08:00
why	5d76811c8a	csi-kata-directvolume: Add basic SPDK volume support Introduce initial implementation for SPDK-backed CSI volumes, allowing basic create and delete operations with vhost-user-blk integration. Signed-off-by: why <1206176262@qq.com>	2025-09-25 19:29:50 +08:00
Xuewei Niu	319237e447	Merge pull request #11848 from BbolroC/pin-alpine-to-stable-digest GHA: Pin Alpine to 3.20 for tee-unencrypted image	2025-09-25 19:29:22 +08:00
Hyounggyu Choi	e9653eae6e	GHA: Pin Alpine to 3.20 for tee-unencrypted image We recently hit the following error during build: ``` RUN ssh-keygen -t ed25519 -f /etc/ssh/ssh_host_ed25519_key -P "" OpenSSL version mismatch. Built against 3050003f, you have 30500010 ``` This happened because `alpine:latest` moved forward and the `ssh-keygen` binary in the base image was compiled against a newer OpenSSL version that is not available at runtime. Pinning the base image to the stable release (3.20) avoids the mismatch and ensures consistent builds. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-09-25 11:49:04 +02:00
Steve Horsman	0a9e730f54	Merge pull request #11847 from Sumynwa/sumsharma/agent-ctl-ci-fix tests: agent-ctl: Fix cleanup for testing with qemu	2025-09-25 10:37:45 +01:00
Sumedh Alok Sharma	1be3785fa0	tests: agent-ctl: Fix cleanup for testing with qemu This change fixes the cleanup logic for the qmp.sock & console.sock files when running tests in a VM booted with qemu, and no longer assumes any path for them. Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2025-09-25 07:30:17 +00:00
Fupan Li	7c58ec7daa	Merge pull request #11833 from kata-containers/sprt/rust-io-bug agent/rustjail: Fix double free in TTY handling	2025-09-25 10:03:45 +08:00
Fupan Li	79f51ab237	runtime-rs: set the default block driver as virtio-scsi for qemu Change the default block driver to virtio-scsi. Since the latest qemu's commit: https://gitlab.com/qemu-project/qemu/-/commit/984a32f17e8dab0dc3d2328c46cb3e0c0a472a73 brings a bug for virtio-blk-pci with io_uring mode at line: https://gitlab.com/qemu-project/qemu/-/commit/984a32f17e8dab0dc3d2328c46cb3e0c0a472a73#ce8eeb01f8b84f8cb8d3c35684d473fe1ee670f9_345_352 In order to avoid this issue, change the default block driver to virtio-scsi. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-24 14:49:53 +02:00
Wainer Moschetta	0bdc462bed	Merge pull request #11841 from microsoft/danmihai1/test-timing-info tests: k8s: add test duration information	2025-09-24 08:17:54 -03:00
Fupan Li	362c177b3d	Merge pull request #11843 from Apokleos/remove-initdata-anno runtime-rs: Remove InitData annotation from OCI Spec	2025-09-24 18:25:37 +08:00
Alex Lyn	62c936b916	runtime-rs: Use the updated OCI Spec annotation as the argument As the OCI Spec annotation has been updated by adding or removing items, we should use the updated annotation as the passed argument. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-24 13:04:51 +08:00
Alex Lyn	9eca015d73	runtime-rs: Remove InitData annotation from OCI Spec This commit removes the InitData annotation from the OCI Spec's annotations. Similar to the Policy annotation, InitData is now exclusively handled and transmitted to the guest via the sandbox's init data mechanism. Removing this redundant and potentially large annotation simplifies the OCI Spec and streamlines the guest initialization process. This change aligns the handling of InitData with existing practices within runtime-go. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-24 09:32:13 +08:00
Aurélien Bombo	dedd833cdd	agent: Add note about future breaking change in nix Tracked in #11842. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-23 16:23:54 -05:00
Aurélien Bombo	ecb22cb3e3	agent/rustjail: Fix double free in TTY handling The repro below would show this error in the logs (in debug mode only): fatal runtime error: IO Safety violation: owned file descriptor already closed The issue was that the `pseudo.slave` file descriptor was being owned by multiple variables simultaneously. When any of those variables would go out of scope, they would close the same file descriptor, which is undefined behavior. To fix this, we clone: we create a new file descriptOR that refers to the same file descriptION as the original. When the cloned descriptor is closed, this affects neither the original descriptor nor the description. Only when the last descriptor is closed does the kernel clean up the description. Note that we purposely consume (not clone) the original descriptor with `child_stdin` as `pseudo` is NOT dropped automatically. Repro ----- Prerequisites: - Use Rust 1.80+. - Build the agent in debug mode. $ cat busybox.yaml apiVersion: v1 kind: Pod metadata: name: busybox spec: containers: - image: busybox:latest name: busybox runtimeClassName: kata $ kubectl apply -f busybox.yaml pod/busybox created $ kubectl exec -it busybox -- sh error: Internal error occurred: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "e6c602352849647201860c1e1888d99ea3166512f1cc548b9d7f2533129508a9": cannot enter container 76a499cbf747b9806689e51f6ba35e46d735064a3f176f9be034777e93a242d5, with err ttrpc: closed Fixes: #11054 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-23 16:23:50 -05:00
Dan Mihai	38a28b273a	Merge pull request #11814 from charludo/main genpolicy: match sandbox name by regex	2025-09-23 14:14:11 -07:00
Dan Mihai	e9f69ce321	tests: k8s: add test duration information Log how much time "kubectl get pods" and each test case are taking, just in case that will reveal unusually slow test clusters, and/or opportunities to improve tests. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-23 19:24:38 +00:00
stevenhorsman	c2b0650491	release: Bump version to 3.21.0 Bump VERSION and helm-chart versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-23 20:59:00 +02:00
dependabot[bot]	e24e564eb7	build(deps): bump astral-tokio-tar in /src/tools/agent-ctl Bumps [astral-tokio-tar](https://github.com/astral-sh/tokio-tar) from 0.5.2 to 0.5.5. - [Release notes](https://github.com/astral-sh/tokio-tar/releases) - [Changelog](https://github.com/astral-sh/tokio-tar/blob/main/CHANGELOG.md) - [Commits](https://github.com/astral-sh/tokio-tar/compare/v0.5.2...v0.5.5) --- updated-dependencies: - dependency-name: astral-tokio-tar dependency-version: 0.5.5 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-09-23 17:46:48 +00:00
Fabiano Fidêncio	bfc54d904a	agent: Fix format issues In the previous commit we've added some code that broke `cargo fmt -- --check` without even noticing, as the code didn't go through the CI process (due to it being a security advisory). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-09-23 16:47:39 +02:00
Steve Horsman	3e67f92e34	Merge commit from fork Fix malicious host can circumvent initdata verification on TDX	2025-09-23 13:31:29 +01:00
Alex Lyn	a9ec8ef21f	kata-types: remove trailing slash from DEFAULT_KATA_GUEST_SANDBOX_DIR Trailing slash in DEFAULT_KATA_GUEST_SANDBOX_DIR caused double slashes in mount_point (e.g. "/run/kata-containers/sandbox//shm"), which failed OPA strict equality checks against policy mount_point. Removing it aligns generated paths with policy and fixes CreateSandboxRequest denial. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-23 14:01:22 +02:00
Steve Horsman	bcd0c0085c	Merge pull request #11821 from mythi/coco-guest-update Confidential containers version updates	2025-09-23 12:45:38 +01:00
Mikko Ylinen	5cb1332348	build: enable nvidia-attester for coco-guest-components coco-guest-components tarball is used as is for both vanilla coco rootfs and the nvidia enabled rootfs. nvidia-attester can be built without nvml so make it globally enabled for coco-guest-components. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-09-23 12:38:32 +03:00
Mikko Ylinen	e878d4a90a	versions: bump guest-components and trustee for CoCo v0.16.0 Pick the latest CoCo components targeted for the next release. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-09-23 12:38:32 +03:00
Charlotte Hartmann Paludo	2cea32cc23	genpolicy: match sandbox name by regex `allow_interactive_exec` requires a sandbox-name annotation, however this is only added for pods by genpolicy. Other pod-generating resources have unpredictable sandbox names. This patch instead uses a regex for the sandbox name in genpolicy, based on the specified metadata and following Kubernetes' naming logic. The generated regex is then used in the policy to correctly match the sandbox name. Fixes: #11823 Signed-off-by: Charlotte Hartmann Paludo <git@charlotteharludo.com> Co-authored-by: Paul Meyer <katexochen0@gmail.com> Co-authored-by: Markus Rudy <mr@edgeless.systems>	2025-09-23 10:31:58 +02:00
Lukáš Doktor	5c14d2956a	ci.ocp: Avoid unsupported "git --revision" The git version in CI doesn't support "git clone --revision"; work around it by using fetch directly. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-09-23 09:29:06 +02:00
Fupan Li	a27009012c	Merge pull request #11834 from Apokleos/fix-initdata-whitespace CI: Keep base64 output of initdata annotation is a single line	2025-09-23 15:16:35 +08:00
Alex Lyn	4e793d635e	Merge pull request #11736 from kata-containers/enhance-copyfile runtime-rs: Enhance copyfile when sharedfs is disabled	2025-09-23 14:15:44 +08:00
Alex Lyn	f254eeb0e9	CI: Keep base64 output on a single line This commit addresses an issue where base64 output, when used with a default configuration, would introduce newlines, causing decoding to fail in the runtime. The fix ensures base64 output is a single, continuous line using the -w0 flag. This guarantees the encoded string is a valid Base64 sequence, preventing potential runtime errors caused by invalid characters. Note: when you use the base64 command without any parameters, it typically adds newlines to the output automatically, usually every 76 chars. In contrast, base64 -w0 explicitly tells the command not to add any newlines (-w for wrap, and 0 for a width of zero), which results in a continuous string with no whitespace. This is a critical distinction because if you pass a Base64 string with newlines to a runtime, it may be treated as an invalid string, causing the decoding process to fail. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-23 11:58:53 +08:00
Fupan Li	72a0f5daec	Merge pull request #11794 from Sumynwa/sumsharma/clh_netdev_hotplug_pciinfo runtime: clh: Add pci path for hotplugged network endpoints	2025-09-23 09:57:57 +08:00
Dan Mihai	02ace265d9	Merge pull request #11827 from microsoft/danmihai1/exec-retries tests: k8s: retry kubectl exec	2025-09-22 17:14:50 -07:00
Hyounggyu Choi	16c2dd7c96	Merge pull request #11769 from Apokleos/enhance-blockdev Enhance block device AIO mode	2025-09-22 14:01:38 +02:00
Alex Lyn	5dd36c6c0f	runtime-rs: Correctly set permission and mode for dir when copying files Correctly set the directory's permissions and mode. This update ensures: The dir_mode field of CopyFileRequest is set to DIR_MODE_PERMS (equivalent to Go's 0o750 | os.ModeDir), which is primarily used for the top-level directory creation permissions. The file_mode field now directly uses metadata.mode() (equivalent to Go's st.Mode) for the target entry. This change aims to resolve potential permission issues or inconsistencies during directory and file creation within the guest environment by precisely matching the expected mode propagation of the Kata agent. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-22 17:59:57 +08:00
Greg Kurz	0f5511962c	Merge pull request #11638 from ldoktor/ocp-peer-pods ci.ocp: More debug output and tweaks	2025-09-22 11:57:46 +02:00
Alex Lyn	429133cedb	runtime-rs: Introduce shared FS volume management in VolumeResource The core purpose of introducing volume_manager to VolumeResource is to centralize the management of shared file system volumes. By creating a single VolumeManager instance within VolumeResource, all shared file volumes are managed by one central entity. This single volume_manager can accurately track the references of all ShareFsVolume instances to the shared volumes, ensuring correct reference counting, proper volume lifecycle management, and preventing issues like volumes being overwritten. This new design ensures that all shared volumes are managed by a central entity, which: (1) Guarantees correct reference counting. (2) Manages the volume lifecycle correctly, avoiding issues like volumes being overwritten. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-22 15:03:41 +08:00
Alex Lyn	90c99541da	runtime-rs: Integrate VolumeManager into ShareFsVolume lifecycle This commit integrates the new `VolumeManager` into the `ShareFsVolume` lifecycle. Instead of directly copying files, `ShareFsVolume::new` now uses the `VolumeManager` to get a guest path and determine if the volume needs to be copied. It also updates the `cleanup` function to release the volume's reference count, allowing the `VolumeManager` to manage its state and clean up resources when no longer in use. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-22 15:03:27 +08:00
Alex Lyn	e73daa2f14	runtime-rs: Add sandbox level volume manager within non-sharedfs This commit introduces a new `VolumeManager` to track the state of shared volumes, including their reference count and its corresponding container ids. The manager's goal is to handle the lifecycle of shared filesystem volumes, including: (1) Volume State Tracking: Tracks the mapping from host source paths to guest destination paths. (2) Reference Counting: Manages reference counts for each volume, preventing premature cleanup when multiple containers share the same source. (3) Deterministic guest paths: Generates unique guest paths using random string to avoid naming conflicts. (4) Improved Management: Provides a centralized way to handle volume creation, copying, and release, including aborting file watchers when volumes are no longer in use. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-22 14:45:16 +08:00
Mikko Ylinen	28ab972b3f	agent-ctl: bump image-rs pull image-rs from CoCo guest-components that is targeted for CoCo v0.16.0. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-09-22 08:31:58 +03:00
Alex Lyn	313c7313f0	runtime-rs: Refactor code to improve copyfile logic and readability This commit refactors the `CopyFile` related code to streamline the logic for creating guest directories and make the code structure clearer. Its main goal is to improve the overall maintainability and facilitate future feature extensions. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-22 11:30:47 +08:00
Alex Lyn	f36377070a	runtime-rs: Enhance Copyfile to ensure existing contents synchronized This commit is designed to perform a full sync before starting monitoring to ensure that files which exist before monitoring starts are also synced. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-22 11:30:35 +08:00
Alex Lyn	2f5319675a	runtime-rs: Set native aio mode for initdata block device This commit updates the configuration for the initdata block device to use the BlockDeviceAio::Native mode. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-22 10:13:44 +08:00
Alex Lyn	5ca403b5d9	runtime-rs: Allow per-device AIO mode configuration for block devices This commit enhances control over block device AIO modes via hotplug. Previously, hotplugging block devices was set with default AIO mode (io_uring). Even if users reset the AIO mode in the configuration file, the changes would not be correctly applied to individual block devices. With this update, users can now explicitly configure the AIO mode for hot-plugging block devices via the configuration, and those settings will be correctly applied. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-22 10:13:44 +08:00
Alex Lyn	425e93a9b8	runtime-rs: Get more block device info within Device Manager We need more information about the block device, so replace the original method get_block_driver with get_block_device_info and return its BlockDeviceInfo. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-22 10:13:44 +08:00
Xuewei Niu	50ffa0fbfd	Merge pull request #11495 from Caspian443/temp-selinux runtime-rs: align SELinux feature with runtime-go (#9866)	2025-09-21 17:12:37 +08:00
Caspian443	2221b76b67	runtime-rs: Add selinux support for hypervisor - read selinux_label from OCI spec in sandbox - set selinux_label in preparevm and startvm in hypervisor Fixes: [#9866](https://github.com/Caspian443/kata-containers/issues/9866) Signed-off-by: Caspian443 <scrisis843@gmail.com>	2025-09-21 13:59:17 +08:00
Caspian443	a658db8746	runtime-rs: hypervisor: add SELinux support functions - Add disable_selinux and selinux_label fields to hypervisor for SELinux support. - Implement related SELinux support functions. Fixes: #9866 Signed-off-by: Caspian443 <scrisis843@gmail.com>	2025-09-21 13:59:17 +08:00
Xuewei Niu	04948c616e	Merge pull request #11830 from zvonkok/gpu-lts gpu: Add correct latest driver per default	2025-09-21 13:58:34 +08:00
Zvonko Kaiser	e6f12d8f86	gpu: Add latest driver per default Let's make sure that we use the latest driver for CI and release. There was a sort step missing. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-09-20 23:50:35 +00:00
Fabiano Fidêncio	54e8081222	qemu: Fix submodules location change The submodule change led to a breakage on our build of QEMU. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-09-20 22:12:27 +02:00
Lukáš Doktor	346ebd0ff9	ci.ocp: Allow to set CAA_IMAGE we might want to provide different CAA_IMAGE (repo) to reproduce issues. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-09-20 06:57:54 +02:00
Lukáš Doktor	bf90ccaf75	ci.ocp: Allow to set/provide PP_IMAGE_ID to be able to test with older or custom peer-pod image. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-09-20 06:57:54 +02:00
Lukáš Doktor	b7143488d9	ci.ocp: Allow to set CAA TAG to allow re-running with older CAA tag for bisection/reproduction. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-09-20 06:57:54 +02:00
Lukáš Doktor	12c5e0f33f	ci.ocp: Log more details on failure recently we got ErrImagePull, having more details should help analyzing issues. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-09-20 06:57:54 +02:00
Lukáš Doktor	7565c881e6	ci.ocp: Log variables in bash-friendly format this should simplify copy&paste of the values from logs. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-09-20 06:57:54 +02:00
Lukáš Doktor	a300b6b9a9	ci.ocp: Allow to set operator/caa commits this can help reproducing or bisecting issues related to operator/caa versions. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-09-20 06:57:53 +02:00
Dan Mihai	524bf66cbc	tests: k8s-credentials-secrets: retry on exec error Retry after "kubectl exec" failure, instead of aborting the test immediately. Example of recent error: https://github.com/kata-containers/kata-containers/actions/runs/17828061309/job/50693999052?pr=11822 not ok 1 Credentials using secrets (in test file k8s-credentials-secrets.bats, line 59) `kubectl exec $pod_name -- "${pod_exec_command[@]}" \| grep -w "username"' failed Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-19 17:45:05 +00:00
Dan Mihai	01c7949bfd	tests: k8s-number-cpus: retry on kubectl exec error Retry after "kubectl exec" failure, instead of aborting the test immediately. Example of recent error: https://github.com/kata-containers/kata-containers/actions/runs/17813996758/job/50644372056 not ok 1 Check number of cpus ... error: Internal error occurred: error sending request: Post "https://10.224.0.4:10250/exec/kata-containers-k8s-tests/cpu-test/c1?command=sh&command=-c&command= cat+%!F(MISSING)proc%!F(MISSING)cpuinfo+%!C(MISSING)grep+processor%!C(MISSING)wc+-l&error=1&output=1": EOF Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-19 17:44:55 +00:00
Dan Mihai	91c3804959	tests: k8s: add container_exec_with_retries() Add container_exec_with_retries(), useful for retrying if needed commands similar to: kubectl exec <pod_name> -c <container_name> -- <command> Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-19 17:42:59 +00:00
Dan Mihai	eec6c8b0c4	tests: k8s: retry after kubectl exec error Some of the k8s tests were already retrying if `kubectl exec` succeeded but produced empty output. Perform the same retries on `kubectl exec` error exit code too, instead of aborting the test immediately. Example of recent exec error: https://github.com/kata-containers/kata-containers/actions/runs/17813996758/job/50644372056 not ok 1 Check number of cpus ... error: Internal error occurred: error sending request: Post "https://10.224.0.4:10250/exec/kata-containers-k8s-tests/cpu-test/c1?command=sh&command=-c&command= cat+%!F(MISSING)proc%!F(MISSING)cpuinfo+%!C(MISSING)grep+processor%!C(MISSING)wc+-l&error=1&output=1": EOF Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-19 15:43:39 +00:00
Hyounggyu Choi	0fb40eda12	Merge pull request #11822 from BbolroC/runtime-no-hotplug-ibm-sel-s390x runtime: Set maxmem to initialmem on s390x when memory hotplug is disabled	2025-09-18 17:31:01 +02:00
Hyounggyu Choi	d90e785901	runtime: Set maxmem to initialmem on s390x when memory hotplug is disabled On s390x, QEMU fails if maxmem is set to 0: ``` invalid value of maxmem: maximum memory size (0x0) must be at least the initial memory size ``` This commit sets maxmem to the initial memory size for s390x when hotplug is disabled, resolving the error while still ensuring that memory hotplug remains off. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-09-18 14:05:33 +02:00
Mikko Ylinen	49fbd6e7af	runtime: qemu: disable memory hotplug for ConfidentialGuests The setting '-m xM,slots=y,maxmem=zM' where maxmem is from the host's memory capacity is failing with confidential VMs on hosts having 1T+ of RAM. slots/maxmem are necessary for setups where the container memory is hotplugged to the VM during container creation based on createContainer info. This is not the case with CoCo since StaticResourceManagement is enabled and memory hotplug flows have not been checked. To avoid unexpected errors with maxmem, disable slots/maxmem in case ConfidentialGuest is requested. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-09-17 23:43:36 +02:00
Dan Mihai	ca244c7265	Merge pull request #11753 from Apokleos/fix-anno runtime-rs: Fix annotations within runtime-rs to pass the agent policy check	2025-09-16 16:42:26 -07:00
Dan Mihai	e2992b51ad	tests: k8s-job debug information Log the output of "kubectl logs", to hopefully help understand test failures similar to: https://github.com/kata-containers/kata-containers/actions/runs/17709473340/job/50326984605?pr=11753 not ok 1 Run a job to completion (in test file k8s-job.bats, line 37) `kubectl logs "$pod_name" \| grep "$pi_number"' failed Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-16 22:36:31 +02:00
Dan Mihai	8854e69e28	tests: k8s-empty-dirs debug information Log the output of "kubectl logs", to hopefully help understand test failures similar to: https://github.com/kata-containers/kata-containers/actions/runs/17709473340/job/50326984613?pr=11753 not ok 2 Empty dir volume when FSGroup is specified with non-root container (from function `assert_equal' in file k8s-empty-dirs.bats, line 16, in test file k8s-empty-dirs.bats, line 65) `assert_equal "1001" "$uid"' failed Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-16 22:36:31 +02:00
Fabiano Fidêncio	96108006f2	agent: Panic on errors accessing the attestation agent binary Let's make sure that whenever we try to access the attestation agent binary, we only proceed with the startup in case: * the binary is found (CoCo case) * the binary is not present (non-CoCo case) In case of any error that's not `NotFound`, we should simply abort as that could mean a potential tampering with the binary (which would be reported as an EIO). Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-09-16 21:35:00 +02:00
Fabiano Fidêncio	d056fb20fe	initramfs: Enforce --panic-on-corruption for veritysetup Let's enforce an error on veritysetup in case there's any tampering with the rootfs. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-09-16 21:35:00 +02:00
Alex Lyn	bc1170ba0c	runtime-rs: Add bundle_path annotation within oci spec Add the OCI bundle path annotation to store the bundle's path. As it'll be checked within the agent policy, we need to add it to pass agent policy validation. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-16 21:31:02 +02:00
Alex Lyn	71ddbac56d	runtime-rs: Correctly set CONTAINER_TYPE_KEY within OCI Spec annotation With the help of `update_ocispec_annotations`, we'll add the container type key "io.katacontainers.pkg.oci.container_type" and its corresponding type: "pod_sandbox" when it's the pause container and "pod_container" when it's any other container. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-16 21:31:02 +02:00
Alex Lyn	a47c0cdf66	kata-types: Introduce a helper to update oci spec annotations It updates OCI annotations by removing specified keys and adding new ones. This function creates a new `HashMap` containing the updated annotations, ensuring that the original map remains unchanged. It is optimized for performance by pre-allocating the necessary capacity and handling removals and additions efficiently. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-16 21:31:02 +02:00
Alex Lyn	9992e1c416	kata-types: Export `POD_CONTAINER` and `POD_SANDBOX` constants as public To enable access to the constants `POD_CONTAINER` and `POD_SANDBOX` from other crates, their visibility has been updated to public. This change addresses the previous limitation of restricted access and ensures these values can be utilized across the codebase. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-16 21:31:02 +02:00
Alex Lyn	95585d818f	runtime-rs: Add sandbox annotation of nerdctl network namespace Add the nerdctl network namespace annotation, "nerdctl/network-namespace", to let nerdctl know which namespace to use when calling the selected CNI plugin. As it'll be checked within the agent policy, we need to add it to pass agent policy validation. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-16 21:31:00 +02:00
Dan Mihai	bc75f6a158	Merge pull request #11783 from billionairiam/agenttypo kata-agent: Rename misleading variable in config parsing	2025-09-16 11:07:17 -07:00
Fabiano Fidêncio	e31a06d51d	kata-manager: Handle zst unpacking On `63f6dcdeb9` we added the support to download either a .xz or a .zst tarball file. However, we missed adding the code to properly unpack a .zst tarball file. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-09-16 19:16:14 +02:00
Fabiano Fidêncio	4265beb081	tools: agent-ctl: Fix unresolved ch import agent-ctl's make check has been failing with: ``` Checking kata-agent-ctl v0.0.1 (/home/ubuntu/runner/_layout/_work/kata-containers/kata-containers/src/tools/agent-ctl) error[E0432]: unresolved import `hypervisor::ch` --> src/vm/vm_ops.rs:10:5 \| 10 \| ch::CloudHypervisor, \| ^^ could not find `ch` in `hypervisor` \| note: found an item that was configured out --> /home/ubuntu/runner/_layout/_work/kata-containers/kata-containers/src/runtime-rs/crates/hypervisor/src/lib.rs:30:9 \| 30 \| pub mod ch; \| ^^ note: the item is gated here --> /home/ubuntu/runner/_layout/_work/kata-containers/kata-containers/src/runtime-rs/crates/hypervisor/src/lib.rs:26:1 \| 26 \| / #[cfg(all( 27 \| \| feature = "cloud-hypervisor", 28 \| \| any(target_arch = "x86_64", target_arch = "aarch64") 29 \| \| ))] \| \|___^ ``` Let's just make sure that we include ch conditionally as well. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-09-16 18:44:33 +02:00
Fupan Li	4a92fc1129	runtime-rs: add the sandbox's shm volume support Docker containers support specifying the shm size using the --shm-size option and support sandbox-level shm volumes, so we've added support for shm volumes. Since Kubernetes doesn't support specifying the shm size, it typically uses a memory-based emptydir as the container's shm, and its size can be specified. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-16 16:32:41 +02:00
Fupan Li	d48c542a52	runtime-rs: Support Firecracker disk rate limiter This PR adds code that passes disk limiter parameters from KC configuration to Firecracker. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-16 16:27:58 +02:00
Fupan Li	e0caeb32fc	runtime-rs: move the rate limiter to hypervisor config Since the rate limiter will be shared by cloud-hypervisor, firecracker, etc., move it from clh's config to the hypervisor config crate, which is shared by the other VMMs. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-16 16:27:58 +02:00
Fupan Li	73e31ea19a	runtime-rs: add the block devices io limit support Given that Rust-based VMMs like cloud-hypervisor, Firecracker, and Dragonball naturally offer user-level block I/O rate limiting, I/O throttling has been implemented to leverage this capability for these VMMs. This PR specifically introduces support for cloud-hypervisor. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-16 16:27:58 +02:00
Steve Horsman	ac74ef4505	Merge pull request #11801 from Apokleos/blk-sharerw runtime-rs: Enable share-rw=true when hotplug block device within qemu	2025-09-16 14:55:57 +01:00
Sumedh Alok Sharma	3443ddf24d	runtime: clh: Add pci path for hotplugged network endpoints This commit introduces changes to parse the PciDeviceInfo received in response payload when adding a network device to the VM with cloud hypervisor. When hotplugging a network device for a given endpoint, it rightly sets the PciPath of the plugged-in device in the endpoint. In calls like virtcontainers/sandbox.go:AddInterface, the later call to agent sends the pci info for uevents (instead of empty value) to rightly update the interfaces instead of failing with `Link not found` Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2025-09-16 12:45:57 +00:00
Alex Lyn	e9a5de35e8	runtime-rs: Enable share-rw=true when hotplug block device within qemu Support for the share-rw=true parameter has been added. While this parameter is essential for maintaining data consistency across multiple QEMU instances sharing a backend disk image, its implementation also serves to standardize parameters with the block device hotplug functionality in kata-runtime/qemu. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-16 10:55:29 +01:00
Fupan Li	df852b77b5	Merge pull request #11799 from Apokleos/fix-virtual-volume-type runtime-rs: Bugfix for kata virtual volume overlay fstype	2025-09-16 09:38:07 +08:00
Dan Mihai	489b677927	Merge pull request #11732 from microsoft/saulparedes/init_data_policy_support genpolicy: add init data support	2025-09-15 15:45:57 -07:00
Fabiano Fidêncio	8abfef358a	tests: Only run docker tests with one VMM Docker tests have been broken for a while and should be removed if we cannot maintain those. For now, though, let's limit it to run only with one hypervisor and avoid wasting resources for no reason. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-09-15 23:03:04 +02:00
Fabiano Fidêncio	dce6f13da8	tests: Only run devmapper tests with QEMU devmapper tests have been failing for a while. It's been breaking on the kata-deploy deployment, which is most likely related to Disk Pressure. Removing files was not enough to get the tests to run, so we'll just run those with QEMU as a way to test fixes. Once we get the test working, we can re-enable the other VMMs, but for now let's just not waste resources for no reason. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-09-15 23:02:33 +02:00
Saul Paredes	e3e406ff26	tests: remove add_allow_all_policy_to_yaml call from helper func add_allow_all_policy_to_yaml now also sets the initdata annotation. So don't overwrite the initdata annotation that was previously set by create_coco_pod_yaml_with_annotations. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-09-15 11:40:29 -07:00
Saul Paredes	cc73b14e26	docs: update policy docs Update policy docs to use initdata annotation and encoding Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-09-15 11:40:29 -07:00
Saul Paredes	b5352af1ee	tests: update tests that manually set policy Use new initdata annotation instead Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-09-15 11:40:29 -07:00
Saul Paredes	2d8c3206c7	gha: allow cbl-mariner to test using initdata annotation Allow "cc_init_data" hypervisor annotation. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-09-15 11:40:29 -07:00
Saul Paredes	5d124523f8	runtime: add initdata support in clh Prepare the initdata image and mount it as a block device. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-09-15 11:40:21 -07:00
Saul Paredes	252d4486f1	runtime: delete initdata annotation Delete annotation from OCI spec and sandbox config. This is done after the optional initdata annotation value has been read. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-09-15 11:34:26 -07:00
Saul Paredes	af41f5018f	runtime: share initdata setup code Move setup code such that it can be used by other hypervisors. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-09-15 11:34:26 -07:00
Saul Paredes	a427537914	genpolicy: add initdata support Encode policy inside initdata and encode as annotation (base64(gzip(toml))). Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-09-15 11:34:26 -07:00
Saul Paredes	10de56a749	kata-types: expose encode and decode initdata helper methods These methods can be used by other components, such as genpolicy. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-09-15 11:34:26 -07:00
Mikko Ylinen	86fe419774	versions: update kernel-confidential to Linux v6.16.7 update to the latest available v6.16 stable series kernel for CoCo. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-09-15 20:29:22 +02:00
Steve Horsman	fab828586b	Merge pull request #11771 from stevenhorsman/attempt-crio-1.34.0-bump runtime: Bump cri-o to latest	2025-09-15 17:31:13 +01:00
Alex Tibbles	fa6e4981a1	versions: bump ovmf edk2 version Update ovmf to latest release. Includes CVE-2024-38805 fix. EDK2 changelogs for releases since edk2-stable202411: https://github.com/tianocore/edk2/releases/tag/edk2-stable202508 https://github.com/tianocore/edk2/releases/tag/edk2-stable202505 https://github.com/tianocore/edk2/releases/tag/edk2-stable202502 Signed-off-by: Alex Tibbles <alex@bleg.org>	2025-09-15 15:38:33 +02:00
stevenhorsman	dc64d256bf	runtime: Bump cri-o to latest Bump cri-o to 1.34.0 to try and remediate security advisories CVE-2025-0750 and CVE-2025-4437. Note: Running ``` go get github.com/cri-o/cri-o@v1.34.0 ``` seems to bump a lot of other go modules, hence the size of the vendor diff Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-15 14:29:06 +01:00
stevenhorsman	16dd1de0ab	kata-monitor: Update deprecated use of grpc functions In google.golang.org/grpc v1.72.0, `DialContext` is deprecated, so switch to use `NewClient` instead. `grpc.WithBlock()` is deprecated and not recommended, so remove it. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-15 14:29:06 +01:00
stevenhorsman	b9ff5ffc21	kata-monitor: Replace use of deprecated expfmt.FmtText In `github.com/prometheus/common v0.62.0` expfmt.FmtText is deprecated, so replace with `expfmt.NewFormat(expfmt.TypeTextPlain)`. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-15 14:29:06 +01:00
stevenhorsman	7f86b967d1	runtime: Replace use of deprecated expfmt.FmtText In `github.com/prometheus/common v0.62.0` expfmt.FmtText is deprecated, so replace with `expfmt.NewFormat(expfmt.TypeTextPlain)`. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-15 14:29:06 +01:00
stevenhorsman	62ed86d1aa	runtime: Update deprecated use of grpc.Dial In google.golang.org/grpc v1.72.0, `Dial`, is deprecated, so switch to use `NewClient` instead Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-15 14:29:06 +01:00
stevenhorsman	334340aa18	runtime: Update removed methods In selinux v1.12.0, `label.SetProcessLabel` was removed, to be replaced by `selinux.SetExecLabel` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-15 14:29:06 +01:00
Fabiano Fidêncio	ad7e60030a	tests: k8s: kata-deploy: Remove unnecessary dirs to free up space This is following Steve's suggestion, based on what's been done on cloud-api-adaptor. The reason we're doing it here is because we've seen pods being evicted due to disk pressure. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-09-15 15:27:54 +02:00
Fabiano Fidêncio	60ba121a0d	kata-deploy: nit: Fix test name Just add a "is" there as it was missing. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-09-15 15:27:54 +02:00
Fabiano Fidêncio	d741544fa6	kata-deploy: Don't fail if the runtimeclass is already deleted I've hit this when using a machine with slow internet connection, which took ages to download the kata-cleanup image, and then helm timed out in the middle of the cleanup, leading to the cleanup job being restarted and then bailing with an error as the runtimeclasses that kata-deploy tries to delete were already deleted. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-09-15 15:27:54 +02:00
Fupan Li	679cdeadc8	runtime: fix the issue clh resize vcpu failed Since the cloud hypervisor's resize vCPU is an asynchronous operation, it's possible that the previous resize operation hasn't completed when the request is sent, causing the current call to return an error. Therefore, several retries can be performed to avoid this error. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-15 14:29:25 +02:00
Alex Tibbles	66a3d4b4a2	versions: bump kernel to 6.12.47 Update LTS kernel to latest. Signed-off-by: Alex Tibbles <alex@bleg.org>	2025-09-15 14:19:48 +02:00
Alex Tibbles	710c117a24	version: Bump QEMU to v10.1.0 A minor release of QEMU is out, so update to it for fixes and features. QEMU changelog: https://wiki.qemu.org/ChangeLog/10.1 Notes: * AVX support is not an option to be enabled / disabled anymore. * Passt requires Glibc 2.40.+, which means a dependency on Ubuntu 25.04 or newer, thus we're disabling it. Signed-off-by: Alex Tibbles <alex@bleg.org>	2025-09-15 14:19:25 +02:00
stevenhorsman	e3aa973995	versions(deps): Bump slab versions prior to 0.4.10 Although versions of slab prior to 0.4.10 don't have a security vulnerability, we can bump them all to keep things in sync Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-15 09:48:03 +02:00
stevenhorsman	9c0fcd30c5	ci: Add slab to dependabot groups Add slab, so that in future the different component bumps are all done together Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-15 09:48:03 +02:00
stevenhorsman	924051c652	genpolicy: Bump slab crate to 0.4.11 Bump versions to remediate CVE-2025-55159 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-15 09:48:03 +02:00
stevenhorsman	8fb4332d42	agent-ctl: Bump slab crate to 0.4.11 Bump versions to remediate CVE-2025-55159 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-15 09:48:03 +02:00
dependabot[bot]	84bcf34c75	build(deps): bump slab from 0.4.10 to 0.4.11 in /src/runtime-rs Bumps [slab](https://github.com/tokio-rs/slab) from 0.4.10 to 0.4.11. - [Release notes](https://github.com/tokio-rs/slab/releases) - [Changelog](https://github.com/tokio-rs/slab/blob/master/CHANGELOG.md) - [Commits](https://github.com/tokio-rs/slab/compare/v0.4.10...v0.4.11) --- updated-dependencies: - dependency-name: slab dependency-version: 0.4.11 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-15 09:48:03 +02:00
Fabiano Fidêncio	60790907ef	clh: Update to v48.0 release ``` Experimental fw_cfg Device Support This feature enables passing configuration data and files, such as VM boot configurations (kernel, kernel cmdline, e820 memory map, and ACPI tables), from the host to the guest. (#7117) Experimental ivshmem Device Support Support for inter-VM shared memory has been added. For more information, please refer to the ivshmem documentation. (#6703) Firmware Boot Support on riscv64 In addition to direct kernel boot, firmware boot support has been added on riscv64 hosts. (#7249) Increased vCPU Limit on x86_64/kvm The maximum number of supported vCPUs on x86_64 hosts using KVM has been raised from 254 to 8192. (#7299) Improved Block Performance with Small Block Sizes Performance for virtio-blk with small block sizes (16KB and below) is enhanced via submitting async IO requests in batches. (#7146) Faster VM Pause Operation The VM pause operation now is significantly faster particularly for VMs with a large number of vCPUs. (#7290) Updated Documentation on Windows Guest Support Our Windows documentation now includes instructions to run Windows 11 guests, in addition to Windows Server guests. (#7218) Policy on AI Generated Code We will decline any contributions known to contain contents generated or derived from using Large Language Models (LLMs). Details can be found in our contributing documentation. (#7162) Removed SGX Support The SGX support has been removed, as announced in the deprecation notice two release cycles ago. (#7093) Notable Bug Fixes Seccomp filter fixes with glibc v2.42 (#7327) Various fixes related to (#7331, #7334, #7335) ``` From https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v48.0 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-09-15 08:30:18 +02:00
Fupan Li	4dc21aa966	Merge pull request #11766 from Apokleos/fix-create_container_timeout kata-types: Support create_container_timeout set within configuration	2025-09-15 10:19:58 +08:00
Alex Lyn	7874505249	Merge pull request #11782 from Apokleos/enhance-policy-rs genpolicy: Enhance policy rule for runtime-rs scenarios	2025-09-15 10:07:14 +08:00
Alex Lyn	e3d6cb8547	Merge pull request #11716 from lifupan/fupan_main runtime-rs: make the virtio-blk use the pci bus as default	2025-09-15 09:49:40 +08:00
Alex Lyn	7062a769b7	genpolicy: Exclude cgroup namespace from namespace validation Exclude 'cgroup' namespace from namespace checks during `allow_linux` validation. This complements the existing exclusion of the 'network' namespace. runtime-rs has specific cgroup namespace configurations, and excluding the cgroup namespace from policy validation ensures parity between the runtime-rs and runtime-go implementations. This allows focusing validation on critical namespaces like PID, IPC, and MNT, while avoiding potential policy mismatches caused by the different cgroup namespace management in runtime-rs. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-14 17:24:06 +08:00
Alex Lyn	12a9ad56b4	genpolicy: Normalize namespace type for mount/mnt compatibility Add `normalize_namespace_type()` function to map "mount" (case-insensitive) to "mnt" while keeping other values unchanged. This ensures namespace comparisons treat "mount" and "mnt" as equivalent. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-14 17:24:06 +08:00
Alex Lyn	ebdfbd3120	genpolicy: Make comparison order-independent and accept CAP_X/X - Use set comparison to ignore ordering differences when matching capabilities. - Add normalization to strip "CAP_" prefix to support both CAP_XXX and XXX formats. This makes capability matching more robust against different ordering and naming formats. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-14 17:23:58 +08:00
Alex Lyn	04dedda6ed	runtime-rs: Bugfix for kata virtual volume overlay fstype The previous configuration with overlayfs is incorrect, which causes the agent policy validation to fail. It also differs from runtime-go's configuration. In this patch, we correct the fstype to overlay and align with runtime-go on this matter. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-14 16:38:09 +08:00
Fupan Li	d073af4e64	dragonball: fix the issue of missing unregister doorbell It should unregister the doorbell resources once the device was reset. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-14 09:08:26 +08:00
Fupan Li	2844a6f938	runtime-rs: sync hotunplug the block devices for dragonball When hot-removing a block device, the kernel must first unmount the device and then destroy it on the VM. Therefore, a prepare_remove_block_device procedure must be added to wait for the kernel to unmount the device before destroying it on the VM. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-14 09:08:26 +08:00
Fupan Li	6e5fe96ed1	dragonball: sync remove the block devices When hot-removing a block device, the kernel must first remove the device and then destroy it on the VM. Therefore, a prepare_remove_block_device procedure must be added to wait for the kernel to unmount the device before destroying it on the VM. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-14 09:08:26 +08:00
Fupan Li	c80ddd3fd9	runtime-rs: make virtio-blk use the pci bus as default Since Dragonball's MMIO bus only supports legacy interrupts, while the PCI bus supports MSIX interrupts, to improve performance for block devices, virtio-blk devices are set to PCI bus mode by default. We tested virtio-blk's performance using fio with the following command: fio -filename=./test -direct=1 -iodepth 32 -thread -rw=randrw -rwmixread=50 -ioengine=libaio -bs=4k -size=10G -numjobs=4 -group_reporting -name=mytest With the legacy interrupt, the io test result is as below: read : io=20485MB, bw=195162KB/s, iops=48790, runt=107485msec write: io=20475MB, bw=195061KB/s, iops=48765, runt=107485msec Once switched to the msix interrupt, the io test result is as below: read : io=20485MB, bw=260862KB/s, iops=65215, runt= 80414msec write: io=20475MB, bw=260727KB/s, iops=65181, runt= 80414msec We get a 34% performance improvement. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-14 09:08:26 +08:00
Fupan Li	2dd172c5b6	dragonball: Add the pci bus support for virtio-blk Added support for PCI buses for virtio-blk devices. This commit adds support for PCI buses for both cold-plugged and hot-plugged virtio-blk devices. Furthermore, during hot-plugging, support is added for synchronous waiting for hot-plug completion. This ensures that multiple devices can be hot-plugged successfully without causing upcall busy errors. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-14 09:08:26 +08:00
Fupan Li	3c3823f2e4	dragonball: refactoring the pci system manager In order to support the pci bus for virtio devices, move the pci system manager from vfio manager to device manager, thus it can be shared by both of vfio and virtio pci devices. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-14 09:08:26 +08:00
Fupan Li	59273e8b2d	dragonball: add the msix interrupt support Add the msix notify support for virtio queues. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-14 09:08:26 +08:00
Fupan Li	7de6455742	dragonball: add the pci bus support for virtio Add the pci bus support for virtio devices. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-14 09:08:26 +08:00
Dan Mihai	34925ae740	Merge pull request #11795 from microsoft/danmihai1/snp-annotations runtime: snp: enable CoCo annotations	2025-09-12 14:23:54 -07:00
Dan Mihai	60beb5236d	runtime: snp: enable CoCo annotations Use @DEFENABLEANNOTATIONS_COCO@ in configuration-qemu-snp.toml, for consistency with the tdx and coco-dev configuration files. k8s-initdata.bats was failing during CI on SNP without this change, because the cc_init_data annotation was disabled. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-12 15:38:33 +00:00
RuoqingHe	a011d2132f	Merge pull request #11775 from RuoqingHe/fix-test_execute_hook libs: Fix unit tests under non-root user	2025-09-12 08:03:05 +08:00
Aurélien Bombo	760b465bb0	Merge pull request #11788 from kata-containers/sprt/zizmor-branch ci: Run Zizmor on pushes to any branch	2025-09-11 11:52:06 -05:00
Aurélien Bombo	11655ef029	ci: Run Zizmor on pushes to any branch This runs Zizmor on pushes to any branch, not just main. This is useful for: 1. Testing changes in feature branches with the manually-triggered CI. 2. Forked repos that may use a different name than "main" for their default branch. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-11 09:33:25 -05:00
Ruoqing He	f6e93c2094	libs: Fix test_get_uds_with_sid_with_zero Test case for `get_uds_with_sid` with an empty run directory would not hit the 0 match arm, i.e. "sandbox with the provided prefix {short_id:?} is not found", because `get_uds_with_sid` will try to create the directory with provided short id before detecting `target_id`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-09-11 02:04:54 +00:00
Ruoqing He	b10e5a2250	libs: Fix test_get_uds_with_sid_ok Preset directory `kata98654sandboxpath1` will produce more than one `target_id` in `get_uds_with_sid`, which causes test to fail. Remove that directory to make this test work. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-09-11 02:04:54 +00:00
Ruoqing He	efeba0b8ed	libs: Detect guest protection before testing `test_arch_guest_protection_*` test cases get triggered simultaneously, which is impossible for a single machine to pass. Modify tests to detect the protection file before proceeding. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-09-11 02:04:54 +00:00
Ruoqing He	a9ba18d48c	libs: Fix test_execute_hook test Case 4 of `test_execute_hook` would fail because `args` could not be empty, while by providing `build_oci_hook` with `vec![]` would result in empty args at execution stage. Modify `build_oci_hook` to set args as `None` when empty vector is provided. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-09-11 02:04:54 +00:00
Dan Mihai	5d59341f7f	Merge pull request #11780 from ryansavino/snp-guest-kernel-upgrade-issue packaging: add required modules for confidential guest kernel	2025-09-10 18:21:26 -07:00
Liang, Ma	a989686cf6	kata-agent: Rename misleading variable in config parsing The variable `addr` was used to store the log level string read from the `LOG_LEVEL_ENV_VAR` environment variable. This name is misleading as it implies a network address rather than a log level value. This commit renames the variable to `level` to more accurately reflect its purpose, improving the overall readability of the configuration code. A minor whitespace formatting fix in a macro is also included. Signed-off-by: Liang, Ma <liang3.ma@intel.com>	2025-09-11 07:54:48 +08:00
Steve Horsman	58259aa5f4	Merge pull request #11754 from stevenhorsman/go.mod-1.24.6-bump versions: Tidy up go.mod versions	2025-09-10 14:11:33 +01:00
Hyounggyu Choi	1737777d28	Merge pull request #11743 from BbolroC/enable-ci-qemu-se-runtime-rs runtime-rs: Enable s390x nightly test for IBM SEL	2025-09-10 15:00:16 +02:00
Alex Lyn	1d26d07110	Merge pull request #11781 from lifupan/fupan_main_qemu runtime-rs: log out the qemu console when debug enabled	2025-09-10 16:59:30 +08:00
Hyounggyu Choi	1060a94b08	GHA: Add s390x nightly test for runtime-rs on IBM SEL A new internal nightly test has been established for runtime-rs. This commit adds a new entry `cc-se-e2e-tests-rs` to the existing matrix and renames the existing entry `cc-se-e2e-tests` to `cc-se-e2e-tests-go`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-09-10 10:57:40 +02:00
Hyounggyu Choi	37764d18d4	tests: Skip k8s tests for qemu-se-runtime-rs Tests skipped because tests for `qemu-se` are skipped: - k8s-empty-dirs.bats - k8s-inotify.bats - k8s-shared-volume.bats Tests skipped because tests for `qemu-runtime-rs` are skipped: - k8s-block-volume.bats - k8s-cpu-ns.bats - k8s-number-cpus.bats Let's skip the tests above to run the nightly test for runtime-rs on IBM SEL. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-09-10 10:57:40 +02:00
Steve Horsman	e502fa2feb	Merge pull request #11731 from kata-containers/dependabot/go_modules/src/tools/csi-kata-directvolume/github.com/ulikunitz/xz-0.5.14 build(deps): bump github.com/ulikunitz/xz from 0.5.11 to 0.5.14 in /src/tools/csi-kata-directvolume	2025-09-10 09:47:28 +01:00
Steve Horsman	3f25b88f89	Merge pull request #11737 from kata-containers/dependabot/cargo/src/runtime-rs/tracing-subscriber-0.3.20 build(deps): bump tracing-subscriber from 0.3.17 to 0.3.20 in /src/runtime-rs	2025-09-10 09:47:07 +01:00
Steve Horsman	22bc29cb4a	Merge pull request #11746 from stevenhorsman/bump-tests-go-mod-yaml-3.0.1 versions: Bump gopkg.in/yaml.v3	2025-09-10 09:46:18 +01:00
RuoqingHe	106c6cea59	Merge pull request #11774 from RuoqingHe/2025-09-09-disable-make-test-libs-temporarily ci: gatekeeper: Mark `make test libs` not required	2025-09-10 14:52:33 +08:00
Fupan Li	16be168062	runtime-rs: log out the qemu console when debug enabled When hypervisor's debug enabled, log out the qemu's console messages for kernel boot debugging. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-10 14:19:15 +08:00
Fupan Li	5715408d61	runtime-rs: add the console device to kernel boot for qemu Add the console device to kernel boot, thus we can log out the kernel's boot message for debug. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-09-10 14:10:45 +08:00
Ruoqing He	6a2d813196	ci: gatekeeper: Mark `make test libs` not required There are still some issues to be addressed before we can mark `make test` for `libs` as required. Mark this case as not required temporarily. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-09-10 03:52:20 +00:00
Ryan Savino	85779a6f1a	packaging: add required modules for confidential guest kernel SNP launch was failing after the confidential guest kernel was upgraded to 6.16.1. Added required module CONFIG_MTRR enabled. Added required module CONFIG_X86_PAT enabled. Fixes: #11779 Signed-off-by: Ryan Savino <ryan.savino@amd.com>	2025-09-09 21:58:15 -05:00
Xuewei Niu	c1ee0985ed	Merge pull request #11770 from stevenhorsman/agent-ctl-bump-hypervisor agent-ctl: version: bump hypervisor	2025-09-09 11:59:25 +08:00
Aurélien Bombo	ceab55a871	Merge pull request #11772 from kata-containers/sprt/zizmor-hash ci: security: Fix "commit hash does not point to a Git tag"	2025-09-08 13:56:25 -05:00
Aurélien Bombo	b640fe5a6a	Merge pull request #11756 from kata-containers/sprt/curl-logging ci: cri-containerd-amd64: add logging for curl failures	2025-09-08 11:55:29 -05:00
Aurélien Bombo	c0030c271c	ci: security: Fix "commit hash does not point to a Git tag" This fixes all such issues, ie.: https://github.com/kata-containers/kata-containers/security/code-scanning/459 https://github.com/kata-containers/kata-containers/security/code-scanning/508 https://github.com/kata-containers/kata-containers/security/code-scanning/510 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-08 11:17:54 -05:00
Aurélien Bombo	cbcc7af6f3	Merge pull request #11615 from kata-containers/sprt/zizmor-pedantic security: gha: Run Zizmor in auditor mode	2025-09-08 10:28:19 -05:00
stevenhorsman	87356269d8	versions: Tidy up go.mod versions Update go 1.23 references to go 1.24.6 to match versions.yaml Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-08 14:03:47 +01:00
stevenhorsman	2d28f3d267	agent-ctl: version: bump hypervisor Bump the version of runtime-rs' hypervisor crate to upgrade (indirectly) protobuf and remediate vulnerability RUSTSEC-2024-0437 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-08 13:23:03 +01:00
dependabot[bot]	5ae34ab240	build(deps): bump github.com/ulikunitz/xz Bumps [github.com/ulikunitz/xz](https://github.com/ulikunitz/xz) from 0.5.11 to 0.5.14. - [Commits](https://github.com/ulikunitz/xz/compare/v0.5.11...v0.5.14) --- updated-dependencies: - dependency-name: github.com/ulikunitz/xz dependency-version: 0.5.14 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-09-08 11:30:49 +01:00
Alex Lyn	8eeea7d1fc	runtime-rs: Correct the default create_container_timeout with 30s The previous documentation stated that the default create_container_timeout is 30,000 milliseconds, which is not aligned with runtime-go. This commit changes it to 30 seconds. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-07 21:59:37 +08:00
Alex Lyn	3e53f2814a	kata-types: Support create_container_timeout set within configuration Since it aligns with the create_container_timeout definition in runtime-go, we need to set the value in configuration.toml in seconds, not milliseconds. We must also convert it to milliseconds when the configuration is loaded for request_timeout_ms. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-07 21:59:32 +08:00
Alex Lyn	4644a02871	Merge pull request #11752 from Apokleos/fix-hooks-devcgrp runtime-rs: Remove default value of Linux.Resources.Devices and correctly set Hooks in OCI Spec to meet with Agent Policy requirements	2025-09-07 18:01:02 +08:00
stevenhorsman	66dc24566f	versions: Bump gopkg.in/yaml.v3 Bump gopkg.in/yaml.v3 from 3.0.0 to 3.0.1 to remediate CVE-2022-28948 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-09-05 16:36:48 +01:00
Aurélien Bombo	c480737ebd	ci: cri-containerd-amd64: add logging for curl failures This is to investigate #11755. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-05 10:35:45 -05:00
Aurélien Bombo	efbc69a2ec	Merge pull request #11760 from kata-containers/sprt/oidc-fix ci: aks: Refresh OIDC token in case access token expired	2025-09-05 10:29:35 -05:00
Dan Mihai	1f68f15995	Merge pull request #11759 from microsoft/danmihai1/policy-storages genpolicy: print Input and Policy storages	2025-09-04 15:07:45 -07:00
Aurélien Bombo	f39517a18a	ci: aks: Refresh OIDC token in case access token expired It's possible that tests take a long time to run and hence that the access token expires before we delete the cluster. In this case `az cli` will try to refresh the access token using the OIDC token (which will have definitely also expired because its lifetime is ~5 minutes). To address this we refresh the OIDC token manually instead. Automatic refresh isn't supported per Azure/azure-cli#28708. Fixes: #11758 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-04 12:44:02 -05:00
Dan Mihai	9b0b7fc795	genpolicy: print Input and Policy storages Print the Storage data structures, to help with debugging. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-09-04 16:03:03 +00:00
Cameron Baird	bdd98ec623	ci: Add test case for iptables, exercised via istio init container Introduce new test case in k8s-iptables.bats which verifies that workloads can configure iptables in the UVM. Users discovered that they weren't able to do this for common usecases such as istio. Proper support for this should be built into UVM kernels. This test ensures that current and future kernel configurations don't regress this functionality. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-09-04 07:18:45 +02:00
Cameron Baird	d16026f7b9	kernel: add required configs for ip6tables support Currently, the UVM kernel fails for istio deployments (at least with the version we tested, 1.27.0). This is because the istio sidecar container uses ip6tables and the required kernel configs are not built-in: ``` iptables binary ip6tables has no loaded kernel support and cannot be used, err: exit status 3 out: ip6tables v1.8.10 (legacy): can't initialize ip6tables table `filter': Table does not exist (do you need to insmod?) Perhaps ip6tables or your kernel needs to be upgraded. ``` Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-09-04 07:18:45 +02:00
Aurélien Bombo	1dcc67c241	security: gha: Use Zizmor's auditor mode This is the strictest possible setting for Zizmor. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-03 12:30:09 -05:00
Hyounggyu Choi	49ca96561b	Merge pull request #11750 from BbolroC/use-pattern-working-for-both-runtimes tests: Use "Failed" consistently for both runtimes	2025-09-03 13:06:05 +02:00
Alex Lyn	e235fc1efb	runtime-rs: Remove default value of Linux.Resources.Devices in OCI Spec In certain scenarios, particularly under CoCo/Agent Policy enforcement, the default initial value of `Linux.Resources.Devices` is considered non-compliant, leading to container creation failures. To address this issue and ensure consistency with the behavior in `runtime-go`, this commit removes the default value of `Linux.Resources.Devices` from the OCI Spec. This cleanup ensures that the OCI Spec aligns with runtime expectations and prevents policy violations during container creation. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-03 18:42:34 +08:00
Alex Lyn	203f7090a6	runtime-rs: Ensure the setting of hooks when OCI Hooks is existing. Only the StartContainer hook needs to be reserved for execution in the guest, but we also make sure that the setting happens only when the OCI Hooks does exist, otherwise we do nothing. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-09-03 17:38:40 +08:00
Hyounggyu Choi	6d6202bbe3	tests: Use "Failed" consistently for both runtimes In k8s-guest-pull-image.bats, `failed to pull image` is not caught by assert_logs_contain() for runtime-rs. To ensure consistency, this commit changes `failed` to `Failed`, which works for both runtimes. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-09-03 09:09:13 +02:00
Hyounggyu Choi	150c90e32a	Merge pull request #11728 from BbolroC/fix-sealed-secret-volume runtime-rs: Adjust path for sealed secret mount check	2025-09-02 16:57:33 +02:00
Fupan Li	9cc1c76ade	Merge pull request #11729 from kata-containers/dependabot/go_modules/src/tools/log-parser/gopkg.in/yaml.v3-3.0.1 build(deps): bump gopkg.in/yaml.v3 from 3.0.0 to 3.0.1 in /src/tools/log-parser	2025-09-02 17:05:51 +08:00
dependabot[bot]	8330dd059f	build(deps): bump tracing-subscriber in /src/runtime-rs Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.3.17 to 0.3.20. - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.3.17...tracing-subscriber-0.3.20) --- updated-dependencies: - dependency-name: tracing-subscriber dependency-version: 0.3.20 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-08-29 20:44:35 +00:00
Xuewei Niu	f6ff9cf717	Merge pull request #11689 from Caspian443/fix-devmapper-selinux-mount-issue runtime-rs: Empty block-rootfs Storage.options and align with Go runtime	2025-08-29 15:29:46 +08:00
Aurélien Bombo	754f07cff2	Merge pull request #11614 from kata-containers/workflow-permissions-tightening Workflow permissions tightening	2025-08-28 10:56:03 -05:00
dependabot[bot]	3a0416c99f	build(deps): bump gopkg.in/yaml.v3 in /src/tools/log-parser Bumps gopkg.in/yaml.v3 from 3.0.0 to 3.0.1. --- updated-dependencies: - dependency-name: gopkg.in/yaml.v3 dependency-version: 3.0.1 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-08-28 14:03:22 +00:00
Hyounggyu Choi	65fdb18c96	runtime-rs: Adjust path for sealed secret mount check Mount validation for sealed secret requires the base path to start with `/run/kata-containers/shared/containers`. Previously, it used `/run/kata-containers/sandbox/passthrough`, which caused test failures where volume mounts are used. This commit renames the path to satisfy the validation check. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-08-28 15:38:07 +02:00
Fabiano Fidêncio	08d2ba1969	cgroups: Fix "." parent cgroup special case `ef642fe890` added a special case to avoid moving cgroups that are on the "default" slice in case of deletion. However, this special check should be done in the Parent() method instead, which ensures that the default resource controller ID is returned, instead of ".". Fixes: #11599 Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-08-27 08:15:15 +02:00
Caspian443	617af4cb3b	runtime-rs: Empty block-rootfs Storage.options and align with Go runtime - Set guest Storage.options for block rootfs to empty (do not propagate host mount options). - Align behavior with Go runtime: only add xfs nouuid when needed. Signed-off-by: Caspian443 <scrisis843@gmail.com>	2025-08-26 01:27:21 +00:00
Caspian443	9a7aadaaca	libs: Introduce rootfs fs types - Add new kata-types::fs module with: - VM_ROOTFS_FILESYSTEM_EXT4 - VM_ROOTFS_FILESYSTEM_XFS - VM_ROOTFS_FILESYSTEM_EROFS - Export fs module in src/libs/kata-types/src/lib.rs - Remove duplicated filesystem constants from src/runtime-rs/crates/hypervisor/src/lib.rs - Update src/runtime-rs/crates/hypervisor/src/kernel_param.rs (and tests) to import from kata_types::fs Signed-off-by: Caspian443 <scrisis843@gmail.com>	2025-08-26 01:26:53 +00:00
Fabiano Fidêncio	63f6dcdeb9	kata-manager: Support xz and zst suffixes for the kata tarball We moved to `.zst`, but users still use the upstream kata-manager to download older versions of the project, thus we need to support both suffixes. Fixes: #11714 Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-08-25 21:15:06 +02:00
Fupan Li	687d0bf94a	Merge pull request #11715 from fidencio/topic/backport-qemu-reclaim-guest-freed-memory runtime: qemu: Add reclaim_guest_freed_memory [BACKPORT]	2025-08-25 16:59:29 +08:00
Fabiano Fidêncio	fd1b8ceed1	runtime: qemu: Add reclaim_guest_freed_memory [BACKPORT] Similar to what we've done for Cloud Hypervisor in the commit `9f76467cb7`, we're backporting a runtime-rs feature that would be beneficial to have as part of the go runtime. This allows users to use virtio-balloon for the hypervisor to reclaim memory freed by the guest. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-08-22 23:56:47 +02:00
stevenhorsman	b4545da15d	workflows: Set top-level permissions to empty The default suggestion for top-level permissions was `contents: read`, but scorecard notes anything other than empty, so try updating it and see if there are any issues. I think it's only needed if we run workflows from other repos. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-08-22 14:13:21 +01:00
stevenhorsman	f79e453313	workflows: Tighten up workflow permissions Since the previous tightening a few workflow updates have gone in and the zizmor job isn't flagging them as issues, so address this to remove potential attack vectors Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-08-22 14:13:21 +01:00
Fabiano Fidêncio	e396a460bc	Revert "local-build: Enforce USE_CACHE=no" This reverts commit `cb5f143b1b`, as the cached packages have been regenerated after the switch to using zstd. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-08-22 14:03:36 +02:00
Steve Horsman	23d2dfaedc	Merge pull request #11707 from fidencio/topic/switch-to-use-zstd-when-possible kata-deploy: local-build: Use zstd instead of xz	2025-08-22 10:06:00 +01:00
stevenhorsman	8cbb1a4357	runtime: Fix non constant Errorf formatting As part of the go 1.24.6 bump there are errors about the incorrect use of Errorf, so switch to the non-formatting version, or add the format string as appropriate Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-08-22 10:44:15 +02:00
stevenhorsman	381da9e603	versions: Bump golang to 1.24.6 golang 1.25 has been released, so 1.23 is EoL, so we should update to ensure we don't end up with security issues Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-08-22 10:44:15 +02:00
stevenhorsman	0ccf429a3d	workflows: Switch workflows to use install_go.sh Update the two workflows that used setup-go to instead call `install_go.sh` script, which handles installing the correct version of golang Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-08-22 10:44:15 +02:00
stevenhorsman	5f7525f099	build: Add darwin support to arch_to_golang Avoid the error `ERROR: unsupported architecture: arm64` in install_go.sh on darwin Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-08-22 10:44:15 +02:00
stevenhorsman	3391c6f1c5	ci: Make install_go.sh more portable `${kernel_name,,}` is bash 4.0 and not posix compliant, so doesn't work on macos, so switch to `tr` which is more widely supported Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-08-22 10:44:15 +02:00
Alex Lyn	91913f9e82	Merge pull request #11711 from stevenhorsman/remote-allow-cc_init_data-annotation runtime: Enable init_data annotation	2025-08-22 14:41:53 +08:00
Fupan Li	1a0fbbfa32	Merge pull request #11699 from Apokleos/support-nonprotection runtime-rs: Support initdata within NonProtection scenarios	2025-08-22 10:24:47 +08:00
Hyounggyu Choi	41dcfb4a9f	Merge pull request #11321 from BbolroC/reconnect-timeout-qemu-se runtime-rs: Adjust VSOCK timeouts for IBM SEL	2025-08-22 00:34:05 +02:00
Fabiano Fidêncio	cb5f143b1b	local-build: Enforce USE_CACHE=no We need that to regenerate the tarballs that are already cached in the zstd format. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-08-21 21:00:20 +02:00
stevenhorsman	081823b388	runtime: Enable init_data annotation In #11693 the cc_init_data annotation was changed to be hypervisor scoped, so each hypervisor needs to explicitly allow it in order to use it now, so add this to both the go and rust runtime's remote configurations Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-08-21 19:26:10 +01:00
Fabiano Fidêncio	f8d7ff40b4	local-build: Fix shim-v2 no cache build with measured rootfs We need to get the root_hash.txt file from the image build, otherwise there's no way to build the shim using those values for the configuration files. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-08-21 19:56:01 +02:00
Fabiano Fidêncio	ad240a39e6	kata-deploy: tools: tests: Use zstd instead of xz Although the compress ratio is not as optimal as using xz, it's way faster to compress / uncompress, and it's "good enough". This change is not small, but it's still self-contained, and has to get in at once, in order to help bisects in the future. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-08-21 19:53:55 +02:00
Fabiano Fidêncio	9cc97ad35c	kata-deploy: Bump image to use alpine 3.22 As 3.18 is already EOL. We need to add `--break-system-packages` to force the installation of the yq version that we rely on. The tests have shown that no breakage actually happens, fortunately. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-08-21 19:53:55 +02:00
Fabiano Fidêncio	1329ce355e	versions: image / initrd: Bump to alpine 3.22 As the 3.18 is EOL'ed. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-08-21 19:53:55 +02:00
Fabiano Fidêncio	c32fc409ec	rootfs-builder: Bump alpine to 3.22 As we were using a very old non-supported version. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-08-21 19:53:55 +02:00
Zvonko Kaiser	60d87b7785	gpu: Add more debugging to CI/CD Capture NVRC logs via journalctl Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-08-21 18:09:20 +02:00
Alex Lyn	e430727cb6	runtime-rs: Change the initdata device driver with block_device_driver Currently, we change vm_rootfs_driver as the initdata device driver with block_device_driver. Fixes #11697 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-08-21 18:56:26 +08:00
Alex Lyn	5cc028a8b1	runtime-rs: Support initdata within NonProtection scenarios We also need to support initdata even when the platform is detected as NonProtection, usually called a non-TEE host. In these cases, there is no need to validate the `confidential_guest=true` item; we trust the result of the method `available_guest_protection()?`. Fixes #11697 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-08-21 18:56:23 +08:00
Hyounggyu Choi	faf5aed965	runtime-rs: Adjust VSOCK timeouts for IBM SEL The default `reconnect_timeout` (3 seconds) was found to be insufficient for IBM SEL when using VSOCK. This commit updates the timeouts as follows: - `dial_timeout_ms`: Set to 90ms to match the value used in go-runtime for IBM SEL - `reconnect_timeout_ms`: Increased to 5000ms based on empirical testing Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-08-21 12:35:44 +02:00
Hyounggyu Choi	b7d2973ce5	Merge pull request #11696 from BbolroC/enable-initdata-ibm-sel-runtime-rs runtime-rs Enable initdata IBM SEL	2025-08-21 09:23:46 +02:00
Hyounggyu Choi	c4b4a3d8bb	tests: Add hypervisor qemu-se-runtime-rs for initdata This commit adds a new hypervisor `qemu-se-runtime-rs` to test initdata for IBM SEL (s390x). Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-08-20 18:57:50 +02:00
Hyounggyu Choi	2ec70bc8e2	runtime-rs: Enable initdata spec for IBM SEL Add support for the `InitData` resource config on IBM SEL, so that a corresponding block device is created and the initdata is passed to the guest through this device. Note that we skip passing the initdata hash via QEMU’s object, since the hypervisor does not yet support this mechanism for IBM SEL. It will be introduced separately once QEMU adds the feature. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-08-20 18:57:50 +02:00
Zvonko Kaiser	c980b6e191	release: Bump version to 3.20.0 Bump VERSION and helm-chart versions Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-08-20 18:18:05 +02:00
Markus Rudy	30aff429df	Merge pull request #11647 from Park-Jiyeonn/opt/sealed-secret-prefix-check Optimize sealed secret scanning to avoid full file reads	2025-08-20 17:18:20 +02:00
Alex Lyn	014ab2fce6	Merge pull request #11693 from BbolroC/revert-initdata-annotation runtime-rs: Fix issues for initdata	2025-08-20 21:17:52 +08:00
Fabiano Fidêncio	dd1752ac1c	Merge pull request #11634 from mythi/coco-kernel-v6.16 versions: update kernel-confidential to Linux v6.16.1	2025-08-20 13:01:05 +02:00
Fupan Li	29ab8df881	Merge pull request #11514 from Apokleos/ci-for-libs CI: Introduce CI for libs to Improve code quality and reduce noises	2025-08-20 18:59:27 +08:00
Hyounggyu Choi	0ac8f1f70e	Merge pull request #11705 from Apokleos/remove-default-guesthookpath kata-types: remove default setting of guest_hook_path	2025-08-20 11:15:25 +02:00
Mikko Ylinen	a0ae1b6608	packaging: kernel: libdw-dev and python3 to builder image These new dependencies are needed by Linux 6.16+. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-08-20 11:34:09 +03:00
Mikko Ylinen	412a384aad	versions: update kernel-confidential to Linux v6.16.1 Linux v6.16 brings some useful features for the confidential guests. Most importantly, it adds an ABI to extend runtime measurement registers (RTMR) for the TEE platforms supporting it. This is currently enabled on Intel TDX only. The kernel version bump from v6.12.x to v6.16 forces some CONFIG_* changes too: MEMORY_HOTPLUG_DEFAULT_ONLINE was dropped in favor of more config choices. The equivalent option is MHP_DEFAULT_ONLINE_TYPE_ONLINE_AUTO. X86_5LEVEL was made unconditional. Since this was only a TDX configuration, dropping it completely as part of v6.16 is fine. CRYPTO_NULL2 was merged with CRYPTO_NULL. This was only added in confidential guest fragments (cryptsetup) so we can drop it in this update. CRYPTO_FIPS now depends on CRYPTO_SELFTESTS which further depends on EXPERT which we don't have. Enable both in a separate config fragment for confidential guests. This can be moved to a common setting once other targets bump to post v6.16. CRYPTO_SHA256_SSE3 arch optimizations were reworked and are now enabled by default. Instead of adding it to whitelist.conf, just drop it completely since it was only enabled as part of "measured boot" feature for confidential guests. CONFIG_CRYPTO_CRC32_S390 was reworked the same way. In this case, whitelist.conf is needed. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-08-20 11:32:48 +03:00
Hyounggyu Choi	0daafecef2	Revert "runtime-rs: Correct the coresponding initdata annotation const" This reverts commit `37685c41c7`. This renames the relevant constant for initdata. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-08-20 10:15:23 +02:00
Hyounggyu Choi	f0db4032f2	Revert "kata-types: Align the initdata annotation with kata-runtime's definition" This reverts commit `ede773db17`. `cc_init_data` should be under a hypervisor category because it is a hypervisor-specific feature. The annotation including `runtime` also breaks a logic for `is_annotation_enabled()`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-08-20 10:15:23 +02:00
Hyounggyu Choi	208cec429a	runtime-rs: Introduce CoCo-specific enable_annotations We need to include `cc_init_data` in the enable_annotations array to pass the data. Since initdata is a CoCo-specific feature, this commit introduces a new array, `DEFENABLEANNOTATIONS_COCO`, which contains the required string and applies it to the relevant CoCo configuration. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-08-20 10:15:23 +02:00
Hyounggyu Choi	1f978ecc31	runtime-rs: Fix issues for empty initdata annotation test Currently, there are 2 issues for the empty initdata annotation test: - Empty string handling - "\[CDH\] \[ERROR\]: Get Resource failed" not appearing `add_hypervisor_initdata_overrides()` does not handle an empty string, which might lead to panic like: ``` called `Result::unwrap()` on an `Err` value: gz decoder failed Caused by: failed to fill whole buffer ``` This commit makes the function return an empty string for a given empty input and updates the assertion string to one that appears in both go-runtime and runtime-rs. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com> Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-08-20 10:15:23 +02:00
alex.lyn	b23d094928	CI: Introduce CI for libs to Improve code quality and reduce noises Currently, runtime-rs related code within the libs directory lacks sufficient CI protection. We frequently observe the following issues: - Inconsistent Code Formatting: Code that has not been properly formatted is merged. - Failing Tests: Code with failing unit or integration tests is merged. To address these issues, we need to introduce stricter CI checks for the libs directory. This may specifically include: - Code Formatting Checks - Mandatory Test Runs Fixes #11512 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-08-20 15:36:09 +08:00
alex.lyn	0f19465b3a	shim-interface: Do cargo check and reduce warnings Reduce shim-interface's warnings caused by non-formatted or unchecked operations. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-08-20 15:36:09 +08:00
alex.lyn	e05197e81c	safe-path: Do cargo check and reduce warnings Reduce warnings caused by non-formatted or unchecked operations. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-08-20 15:36:09 +08:00
alex.lyn	683d673f4f	protocols: Do cargo format to make codes clean Fix protocols' warnings by running cargo check/format correctly. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-08-20 15:36:09 +08:00
alex.lyn	38242d3a61	kata-types: Do cargo check and reduce warnings Reduce noise caused by unformatted code. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-08-20 15:36:09 +08:00
alex.lyn	283fd45045	kata-sys-utils: fix warnings for s390x The warnings are reported as below: ``` --> kata-sys-util/src/protection.rs:145:9 \| 145 \| return Err(ProtectionError::NoPerms)?; \| ^^^^^^^ help: remove it \| ... error: `to_string` applied to a type that implements `Display` in `format!` args --> kata-sys-util/src/protection.rs:151:16 \| 151 \| err.to_string() \| ^^^^^^^^^^^^ help: remove this ``` Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-08-20 15:36:09 +08:00
alex.lyn	730b0f1769	kata-sys-utils: Do cargo check codes and reduce warnings Fix kata-sys-utils warnings by running cargo check correctly and testing it well. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-08-20 15:35:42 +08:00
Fabiano Fidêncio	585d0be342	Merge pull request #11691 from alextibbles/update-lts-kernel versions: update to latest LTS kernel 6.12.42	2025-08-20 08:55:06 +02:00
Fupan Li	b748688e69	Merge pull request #11698 from Apokleos/filter-arpneibhors runtime-rs: Add only static ARP entries with handle_neighours	2025-08-20 14:05:20 +08:00
Alex Lyn	c4af9be411	kata-types: remove default setting of guest_hook_path To align it with the setting in runtime-go, we should keep it empty unless users enable it and set a specific path. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-08-20 13:56:42 +08:00
Zvonko Kaiser	bce8efca67	gpu: Rebuild initrd and image for kernel bump We need to make sure that we use the latest kernel and rebuild the initrd and image for the nvidia-gpu use-cases; otherwise the tests will fail since the modules are not built against the new kernel and they simply fail to load. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-08-19 17:32:42 -04:00
Alex Tibbles	e20f6b2f9d	versions: update to latest LTS kernel 6.12.42 Fixes #11690 Signed-off-by: Alex Tibbles <alex@bleg.org>	2025-08-19 17:32:42 -04:00
Fabiano Fidêncio	3503bcdb50	Merge pull request #11701 from alextibbles/go-stdlib-#11700 versions: sync go.mod with versions.yaml for go 1.23.12	2025-08-19 22:14:57 +02:00
Alex Tibbles	a03dc3129d	versions: sync go.mod with versions.yaml for go 1.23.12 OSV-Scanner highlights go.mod references to go stdlib 1.23.0 contrary to intention in versions.yaml, so synchronize them. Make a converse comment for versions.yaml. Fixes: #11700 Signed-off-by: Alex Tibbles <alex@bleg.org>	2025-08-19 11:30:19 -04:00
Hyounggyu Choi	93ec470928	runtime/tests: Update annotation for initdata Let's rename the runtime-rs initdata annotation from `io.katacontainers.config.runtime.cc_init_data` to `io.katacontainers.config.hypervisor.cc_init_data`. Rationale: - initdata itself is a hypervisor-specific feature - the new name aligns with the annotation handling logic: `c92bb1aa88/src/libs/kata-types/src/annotations/mod.rs (L514-L968)` This commit updates the annotation for go-runtime and tests accordingly. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-08-19 15:17:01 +02:00
Alex Lyn	903e608c23	runtime-rs: Add only static ARP entries with handle_neighours To make it aligned with runtime-go, we need add only static ARP entries into the targets. Fixes #11697 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-08-19 20:09:20 +08:00
Steve Horsman	c92bb1aa88	Merge pull request #11684 from zvonkok/gpu-required gatekeeper: GPU test required	2025-08-15 10:30:19 +01:00
Hyounggyu Choi	28bd0cf405	Merge pull request #11640 from rafsal-rahim/bm-initdata-s390x Feat \| Implement initdata for bare-metal/qemu for s390x	2025-08-15 10:42:32 +02:00
Zvonko Kaiser	3a4e1917d2	gatekeeper: Make GPU test required We now run a simple RAG pipeline with each PR to make sure we do not break GPU support. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-08-14 18:35:39 -04:00
Aurélien Bombo	3a5e2060aa	Merge pull request #11683 from kata-containers/sprt/static-checks-default-branch ci: static-checks: Don't hardcode default repo branch	2025-08-14 17:01:18 -05:00
Zvonko Kaiser	55ee8abf0b	Merge pull request #11658 from kata-containers/amd64-nvidia-gpu-cicd-step2 gpu: AMD64 NVIDIA GPU CI/CD Part 2	2025-08-14 17:51:26 -04:00
Aurélien Bombo	0fa7d5b293	ci: static-checks: Don't hardcode default repo branch This would cause weird issues for downstreams whose default branch is not "main". Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-08-14 13:22:20 -05:00
Zvonko Kaiser	dcb62a7f91	Merge pull request #11525 from was-saw/qemu-seccomp runtime-rs: add seccomp support for qemu	2025-08-14 12:35:32 -04:00
Zvonko Kaiser	8be41a4e80	gpu: Add embedding service For a simple RAG pipeline, add an embedding service Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-08-14 16:34:21 +00:00
RuoqingHe	65a9fe0063	Merge pull request #11670 from kevinzs2048/add-aavmf CI: change the directory for Arm64 firmware	2025-08-14 21:30:21 +08:00
stevenhorsman	43cdde4c5d	test/k8s: Extend initdata tests to run on s390x Enable testing of initdata on the qemu-coco-dev and qemu-se runtime classes, so we can validate the function on s390x Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-08-14 17:10:58 +05:30
rafsalrahim	9891b111d1	runtime: Add initdata support to s390x - Added support for initdata device on s390x. - Generalized devno generation for QEMU CCW devices. Signed-off-by: rafsalrahim <rafsal.rahim@ibm.com>	2025-08-14 17:10:58 +05:30
wangxinge	d147e2491b	runtime-rs: add seccomp support for qemu This commit supports the seccomp_sandbox option from the configuration.toml file and adds the logic for appending command-line arguments based on this new configuration parameter. Fixes: #11524 Signed-off-by: wangxinge <wangxinge@bupt.edu.cn>	2025-08-14 18:45:03 +08:00
Xuewei Niu	479cce8406	Merge pull request #11536 from was-saw/clh/fc-seccomp runtime-rs: add seccomp support for cloud hypervisor and firecracker	2025-08-14 18:23:14 +08:00
Dan Mihai	ea74024b93	Merge pull request #11663 from burgerdev/arp genpolicy: support AddARPNeighbors	2025-08-13 14:54:36 -07:00
Kevin Zhao	aadad0c9b6	CI: change the directory for Arm64 firmware Previously it reused the ovmf directory, which caused issues with path checking, so move to aavmf as it should be. Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-08-13 23:39:44 +02:00
Fabiano Fidêncio	cfd0ebe85f	Merge pull request #11675 from katexochen/snp-guest-policy runtime: make SNP guest policy configurable	2025-08-13 22:20:51 +02:00
Steve Horsman	c7f4c9a3bb	Merge pull request #11676 from stevenhorsman/golang-1.23.12-bump versions: Bump golang to 1.23.12	2025-08-13 15:24:17 +01:00
Park.Jiyeon	2f50c85b12	agent: avoid full file reads when scanning sealed secrets. Read only the sealed secret prefix instead of the whole file. Improves performance and reduces memory usage in I/O-heavy environments. Fixes: #11643 Signed-off-by: Park.Jiyeon <jiyeonnn2@icloud.com>	2025-08-13 20:32:03 +08:00
Paul Meyer	5635410dd3	runtime: make SNP guest policy configurable Depending on the platform configuration, users might want to set a more secure policy than the QEMU default. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-08-13 09:06:36 +02:00
stevenhorsman	1a6f1fc3ac	versions: Bump golang to 1.23.12 Bump go version to remediate vuln GO-2025-3849 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-08-12 14:46:29 +01:00
Dan Mihai	9379a18c8a	Merge pull request #11565 from Sumynwa/sumsharma/agent_ctl_vm_boot_support agent-ctl: Add option "--vm" to boot pod VM for testing.	2025-08-11 09:36:23 -07:00
Sumedh Alok Sharma	c7c811071a	agent-ctl: Add option --vm to boot pod VM for testing. This change introduces a new command line option `--vm` to boot up a pod VM for testing. The tool connects with kata agent running inside the VM to send the test commands. The tool uses `hypervisor` crates from runtime-rs for VM lifecycle management. Current implementation supports Qemu & Cloud Hypervisor as VMMs. In summary: - tool parses the VMM specific runtime-rs kata config file in /opt/kata/share/defaults/kata-containers/runtime-rs/* - prepares and starts a VM using runtime-rs::hypervisor vm APIs - retrieves agent's server address to setup connection - tests the requested commands & shutdown the VM Fixes #11566 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2025-08-11 11:03:18 +00:00
wangxinge	f3a669ee2d	runtime-rs: add seccomp support for cloud hypervisor and firecracker The seccomp feature for Cloud Hypervisor and Firecracker is enabled by default. This commit introduces an option to disable seccomp for both and updates the built-in configuration.toml file accordingly. Fixes: #11535 Signed-off-by: wangxinge <wangxinge@bupt.edu.cn>	2025-08-11 17:59:30 +08:00
Hyounggyu Choi	407252a863	Merge pull request #11641 from Apokleos/kata-log runtime-rs: Label system journal log with kata	2025-08-11 08:44:31 +02:00
Alex Lyn	196d7d674d	runtime-rs: Label system journal log with kata Route kata-shim logs directly to systemd-journald under 'kata' identifier. This refactoring enables `kata-shim` logs to be properly attributed to 'kata' in systemd-journald, instead of inheriting the 'containerd' identifier. Previously, `kata-shim` logs were challenging to filter and debug as they appeared under the `containerd.service` unit. This commit resolves this by: 1. Introducing a `LogDestination` enum to explicitly define logging targets (File or Journal). 2. Modifying logger creation to set `SYSLOG_IDENTIFIER=kata` when logging to Journald. 3. Ensuring type safety and correct ownership handling for different logging backends. This significantly enhances the observability and debuggability of Kata Containers, making it easier to monitor and troubleshoot Kata-specific events. Fixes: #11590 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com> Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-08-10 16:00:36 +08:00
Aurélien Bombo	be148c7f72	Merge pull request #11666 from kata-containers/sprt/static-check-exclude-security-md ci: static-checks: add SECURITY.md to exclude list	2025-08-08 12:50:29 -05:00
Fabiano Fidêncio	dcbdf56281	Merge pull request #11660 from zvonkok/remove-stable ci: Remove stable	2025-08-08 14:18:25 +02:00
Xuewei Niu	1d2f2d6350	Merge pull request #11219 from fidencio/topic/version-qemu-bump-to-10.0.0 version: Bump QEMU to v10.0.0	2025-08-08 19:04:45 +08:00
RuoqingHe	aaf8de3dbf	Merge pull request #11669 from kevinzs2048/add-timeout ci: cri-containerd: add 5s timeout for creating sandbox with crictl	2025-08-08 18:25:58 +08:00
Alex Lyn	9816ffdac7	Merge pull request #11653 from Apokleos/align-initdata-annoation Align initdata annotation with kata-runtime	2025-08-08 16:24:09 +08:00
Kevin Zhao	1aa65167d7	CI: cri-containerd: add 5s timeout for creating sandbox with crictl After moving the Arm64 CI nodes to a new one, we faced an interesting timeout issue when executing the crictl runp command; the error is usually: code = DeadlineExceeded Fixes: #11662 Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-08-08 15:41:39 +08:00
Fupan Li	b50777a174	Merge pull request #10580 from pmores/make-vcpu-allocation-more-accurate runtime-rs: make vcpu allocation more accurate	2025-08-08 14:14:40 +08:00
Xuewei Niu	beea0c34c5	Merge pull request #11060 from kata-containers/sprt/vfsd-metadata runtime: virtio-fs: Support "metadata" cache mode	2025-08-08 11:13:57 +08:00
Fabiano Fidêncio	f9e16431c1	version: Bump QEMU to v10.0.3 As the new release of QEMU is out, let's switch to it and take advantage of bug fixes and improvements. QEMU changelog: https://wiki.qemu.org/ChangeLog/10.0 Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-08-07 22:31:30 +02:00
Greg Kurz	f9a6359674	Merge pull request #11667 from c3d/bug/11633-qmp qemu: Respect the JSON schema for hot plug	2025-08-07 16:04:12 +02:00
Aurélien Bombo	6d96875d04	runtime: virtio-fs: Support "metadata" cache mode The Rust virtiofsd supports a "metadata" cache mode [1] that wasn't present in the C version [2], so this PR adds support for that. [1] https://gitlab.com/virtio-fs/virtiofsd [2] https://qemu.weilnetz.de/doc/5.1/tools/virtiofsd.html#cmdoption-virtiofsd-cache Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-08-07 21:24:40 +08:00
Pavel Mores	69f21692ed	runtime-rs: enable vcpu allocation tests in CI This series should make runtime-rs's vcpu allocation behaviour match the behaviour of runtime-go so we can now enable pertinent tests which were skipped so far due to the difference between both shims. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-08-07 10:32:44 +02:00
Pavel Mores	00bfa3fa02	runtime-rs: re-adjust config after modifying it with annotations Configuration information is adjusted after loading from file but so far, there has been no similar check for configuration coming from annotations. This commit introduces re-adjusting config after annotations have been processed. A small refactor was necessary as a prerequisite which introduces function TomlConfig::adjust_config() to make it easier to invoke the adjustment for a whole TomlConfig instance. This function is analogous to the existing validate() function. The immediate motivation for this change is to make sure that 0 in "default_vcpus" annotation will be properly adjusted to 1 as is the case if 0 is loaded from a config file. This is required to match the golang runtime behaviour. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-08-07 10:32:44 +02:00
Pavel Mores	e2156721fd	runtime-rs: add tests to exercise floating-point 'default_vcpus' Also included (as commented out) is a test that does not pass although it should. See source code comment for explanation why fixing this seems beyond the scope of this PR. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-08-07 10:32:44 +02:00
Pavel Mores	1f95d9401b	runtime-rs: change representation of default_vcpus from i32 to f32 This commit focuses purely on the formal change of type. If any subsequent changes in semantics are needed they are purposely avoided here so that the commit can be reviewed as a 100% formal and 0% semantic change. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-08-07 10:32:44 +02:00
Pavel Mores	cdc0eab8e4	runtime-rs: make sandbox vcpu allocation more accurate This commit addresses a part of the same problem as PR #7623 did for the golang runtime. So far we've been rounding up individual containers' vCPU requests and then summing them up which can lead to allocation of excess vCPUs as described in the mentioned PR's cover letter. We address this by reversing the order of operations: we sum the (possibly fractional) container requests and only then round up the total. We also align runtime-rs's behaviour with runtime-go in that we now include the default vcpu request from the config file ('default_vcpu') in the total. We diverge from PR #7623 in that `default_vcpu` is still treated as an integer (this will be a topic of a separate commit), and that this implementation avoids relying on 32-bit floating point arithmetic as there are some potential problems with using f32. For instance, some numbers commonly used in decimal, notably all of single-decimal-digit numbers 0.1, 0.2 .. 0.9 except 0.5, are periodic in binary and thus fundamentally not representable exactly. Arithmetic performed on such numbers can lead to surprising results, e.g. adding 0.1 ten times gives 1.0000001, not 1, and taking a ceil() results in 2, clearly a wrong answer in vcpu allocation. So instead, we take advantage of the fact that container requests happen to be expressed as a quota/period fraction so we can sum up quotas, fundamentally integral numbers (possibly fractional only due to the need to rewrite them with a common denominator) with much less danger of precision loss. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-08-07 10:32:44 +02:00
Christophe de Dinechin	ec480dc438	qemu: Respect the JSON schema for hot plug When hot-plugging CPUs on QEMU, we send a QMP command with JSON arguments. QEMU 9.2 recently became more strict[1] enforcing the JSON schema for QMP parameters. As a result, running Kata Containers with QEMU 9.2 results in a message complaining that the core-id parameter is expected to be an integer: ``` qmp hotplug cpu, cpuID=cpu-0 socketID=1, error: QMP command failed: Invalid parameter type for 'core-id', expected: integer ``` Fix that by changing the core-id, socket-id and thread-id to be integer values. [1]: `be93fd5372` Fixes: #11633 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2025-08-07 09:13:57 +02:00
Alex Lyn	37685c41c7	runtime-rs: Correct the corresponding initdata annotation const As we have changed the initdata annotation definition, we also need to correct its const definition accordingly with KATA_ANNO_CFG_RUNTIME_INIT_DATA. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-08-07 10:45:28 +08:00
Alex Lyn	163f04a918	Merge pull request #11651 from microsoft/danmihai1/debug-kubectl-logs tests: k8s-sandbox-vcpus-allocation debug info	2025-08-07 10:27:29 +08:00
Aurélien Bombo	e3b4d87b6d	ci: static-checks: add SECURITY.md to exclude list This adds SECURITY.md to the list of GH-native files that should be excluded by the reference checker. Today this is useful for downstreams who already have a SECURITY.md file for compliance reasons. When Kata onboards that file, this commit will also be required. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-08-06 11:24:52 -05:00
Markus Rudy	3eb0641431	genpolicy: add rule for AddARPNeighbors When the network interface provisioned by the CNI has static ARP table entries, the runtime calls AddARPNeighbor to propagate these to the agent. As of today, these calls are simply rejected. In order to allow the calls, we do some sanity checks on the arguments: We must ensure that we don't unexpectedly route traffic to the host that was not intended to leave the VM. In a first approximation, this applies to loopback IPs and devices. However, there may be other sensitive ranges (for example, VPNs between VMs), so there should be some flexibility for users to restrict this further. This is why we introduce a setting, similar to UpdateRoutes, that allows restricting the neighbor IPs further. The only valid state of an ARP neighbor entry is NUD_PERMANENT, which has a value of 128 [1]. This is already enforced by the runtime. According to rtnetlink(7), valid flag values are 8 and 128, respectively [2], thus we allow any combination of these. [1]: https://github.com/torvalds/linux/blob/4790580/include/uapi/linux/neighbour.h#L72 [2]: https://github.com/torvalds/linux/blob/4790580/include/uapi/linux/neighbour.h#L49C20-L53 Fixes: #11664 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-08-06 17:24:36 +02:00
Zvonko Kaiser	1b1b3af9ab	ci: Remove trigger for stable branch We do not support stable branches anymore, remove the trigger for it. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-08-06 09:22:24 +08:00
Hyounggyu Choi	af01434226	Merge pull request #11646 from kata-containers/sprt/param-static-checks ci: static-checks: Auto-detect repo by default	2025-08-05 22:13:20 +02:00
Alex Lyn	ede773db17	kata-types: Align the initdata annotation with kata-runtime's definition To make it work within CI, we align it with kata-runtime's definition, "io.katacontainers.config.runtime.cc_init_data". Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-08-03 22:51:39 +08:00
Dan Mihai	05eca5ca25	tests: k8s-sandbox-vcpus-allocation debug info Print more details about the behavior of "kubectl logs", trying to understand errors like: https://github.com/kata-containers/kata-containers/actions/runs/16662887973/job/47164791712 not ok 1 Check the number vcpus are correctly allocated to the sandbox (in test file k8s-sandbox-vcpus-allocation.bats, line 37) `[ `kubectl logs ${pods[$i]}` -eq ${expected_vcpus[$i]} ]' failed with status 2 No resources found in kata-containers-k8s-tests namespace. ... k8s-sandbox-vcpus-allocation.bats: line 37: [: -eq: unary operator expected Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-08-01 20:09:17 +00:00
Aurélien Bombo	c47bff6d6a	Merge pull request #11637 from kata-containers/sprt/remove-install-az-cli gha: Remove unnecessary install-azure-cli step	2025-08-01 09:34:46 -05:00
Fabiano Fidêncio	82f141a02e	Merge pull request #11632 from burgerdev/codegen runtime: reproducible generation of Golang proto bindings	2025-07-31 23:49:18 +02:00
Fabiano Fidêncio	7198c8789e	Merge pull request #11639 from zvonkok/gpu_guest_components gpu: guest components	2025-07-31 21:42:31 +02:00
Aurélien Bombo	9585e608e5	ci: static-checks: Auto-detect repo by default This auto-detects the repo by default (instead of having to specify KATA_DEV_MODE=true) so that forked repos can leverage the static-checks.yaml CI check without modification. An alternative would have been to pass the repo in static-checks.yaml. However, because of the matrix, this would've changed the check name, which is a pain to handle in either the gatekeeper/GH UI. Example fork failure: https://github.com/microsoft/kata-containers/actions/runs/16656407213/job/47142421739#step:8:75 I've tested this change to work in a fork. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-07-31 14:33:24 -05:00
Zvonko Kaiser	8422411d91	gpu: Add coco guest components The second stage needs to consider the coco guest components Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-31 17:11:21 +00:00
Markus Rudy	3fd354b991	ci: add codegen to static-checks Signed-off-by: Markus Rudy <mr@edgeless.systems> Fixes: #11631 Co-authored-by: Steve Horsman <steven@uk.ibm.com>	2025-07-31 17:58:25 +01:00
Markus Rudy	9e38fd2562	tools: add image for Go proto bindings In order to have a reproducible code generation process, we need to pin the versions of the tools used. This is accomplished easiest by generating inside a container. This commit adds a container image definition with fixed dependencies for Golang proto/ttrpc code generation, and changes the agent Makefile to invoke the update-generated-proto.sh script from within that container. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-07-31 17:58:25 +01:00
Markus Rudy	f7a36df290	runtime: generate proto files The generated Go bindings for the agent are out of date. This commit was produced by running src/agent/src/libs/protocols/hack/update-generated-proto.sh with protobuf compiler versions matching those of the last run, according to the generated code comments. Since there are new RPC methods, those needed to be added to the HybridVSockTTRPCMockImp. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-07-31 17:58:25 +01:00
Fabiano Fidêncio	d077ed4c1e	Merge pull request #11645 from kata-containers/topic/fix-kbuild-sign-pin-issue build: nvidia: Fix KBUILD_SIGN_PIN breakage	2025-07-31 18:31:34 +02:00
Fabiano Fidêncio	8d30b84abd	build: nvidia: Fix KBUILD_SIGN_PIN breakage We only need KBUILD_SIGN_PIN exported when building nvidia related artefacts. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-31 16:39:20 +02:00
Fabiano Fidêncio	20bef41347	Merge pull request #11236 from kata-containers/amd64-nvidia-gpu-cicd gpu: AMD64 NVIDIA GPU CI/CD	2025-07-31 14:52:01 +02:00
Aurélien Bombo	96f1d95de5	gha: Remove unnecessary install-azure-cli step az cli is already installed by the azure/login action. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-07-30 10:42:56 -05:00
Zvonko Kaiser	fbb0e7f2f2	gpu: Add secrets passthrough to the workflow We need to pass-through the secrets in all the needed workflows ci, ci-on-push, ci-nightly, ci-devel Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:51:01 +00:00
Zvonko Kaiser	30778594d0	gpu: Add arm64-nvidia-a100 to actionlint.yaml Make zizmor happy about our custom runner label Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:59 +00:00
Zvonko Kaiser	8768e08258	gpu: Add embedding service For a simple RAG pipeline, add an embedding service Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:59 +00:00
Zvonko Kaiser	254dbd9b45	gpu: Add Pod spec for NIM llama Pod spec for the NIM inferencing service Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:59 +00:00
Zvonko Kaiser	568b13400a	gpu: Add NIM bats test We're running a simple NIM container to test if the GPUs are working properly Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:59 +00:00
Zvonko Kaiser	6188b7f79f	gpu: Add run_kubernetes_nv_tests.sh Replicate what we have for run_tests and run .bats files Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:59 +00:00
Zvonko Kaiser	9a829107ba	gpu: Add selector for k8s tests We want to reuse the current run_tests with GPUs, introduce a var that will define what to run. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:59 +00:00
Zvonko Kaiser	7669f1fbd1	gpu: Add NVIDIA GPU test block for amd64 Once we have the amd64 artifacts we can run some arm64 k8s tests. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:59 +00:00
Zvonko Kaiser	97d7575d41	gpu: Disable metrics tests We are not running the metrics tests anyway for now, so let's make room to run the GPU tests. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:58 +00:00
Anastassios Nanos	00e0db99a3	Merge pull request #11627 from itsmohitnarayan/FirecrackerVersionUpdate	2025-07-30 13:59:55 +03:00
Kumar Mohit	5cccbb9f41	versions: Upgrade Firecracker Version to 1.12.1 Updated versions.yaml to use Firecracker v1.12.1. Replaced firecracker and jailer binaries under /opt/kata/bin. Tested with kata-fc runtime on Kubernetes: - Deployed pods using gitpod/openvscode-server - Verified microVM startup, container access, and Firecracker usage - Confirmed Firecracker and jailer versions via CLI Signed-off-by: Kumar Mohit <68772712+itsmohitnarayan@users.noreply.github.com>	2025-07-30 12:51:08 +05:30
Saul Paredes	1aaaef2134	Merge pull request #11553 from microsoft/danmihai1/genpolicy-cleanup genpolicy: reduce complexity	2025-07-28 14:32:59 -07:00
Dan Mihai	c11c972465	genpolicy: config layer logging clean-up Use a simple debug!() for logging the config_layer string, instead of transcoding, etc. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-07-28 18:30:13 +00:00
Dan Mihai	30bfa2dfcc	genpolicy: use CoCo settings by default - "confidential_emptyDir" becomes "emptyDir" in the settings file. - "confidential_configMap" becomes "configMap" in settings. - "mount_source_cpath" becomes "cpath". - The new "root_path" gets used instead of the old "cpath" to point to the container root path. - "confidential_guest" is no longer used. By default it gets replaced by "enable_configmap_secret_storages"=false, because CoCo is using CopyFileRequest instead of the Storage data structures for ConfigMap and/or Secret volume mounts during CreateContainerRequest. - The value of "guest_pull" becomes true by default. - "image_layer_verification" is no longer used - just CoCo's guest pull is supported. - The Request input files from unit tests are changing to reflect the new default settings values described above. - tests/integration/kubernetes/tests_common.sh adjusts the settings for platforms that are not set-up for CoCo during CI (i.e., platforms other than SNP, TDX, and CoCo Dev). Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-07-28 18:30:13 +00:00
Dan Mihai	94995d7102	genpolicy: skip pulling layers for guest-pull Skip pulling container image layers when guest-pull=true. The contents of these layers were ignored due to: - #11162, and - tarfs snapshotter support having been removed from genpolicy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-07-28 18:30:13 +00:00
Dan Mihai	f6016f4f36	genpolicy: remove tarfs snapshotter support AKS Confidential Containers are using the tarfs snapshotter. CoCo upstream doesn't use this snapshotter, so remove this Policy complexity from upstream. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-07-28 18:30:10 +00:00
Steve Horsman	077c59dd1f	Merge pull request #11385 from wainersm/ci_make_coco_nontee_required ci/gatekeeper: make run-k8s-tests-coco-nontee job required	2025-07-28 14:16:23 +01:00
Steve Horsman	74fba9c736	Merge pull request #11619 from kata-containers/install-dependencies-gh-cli ci: Try passing api token into GitHub api call	2025-07-28 13:35:12 +01:00
Xuewei Niu	2a3c8b04df	Merge pull request #11613 from RuoqingHe/clippy-fix-for-libs-20250721 mem-agent: Ignore Cargo.lock	2025-07-28 17:45:29 +08:00
RuoqingHe	3f46347dc5	Merge pull request #11618 from RuoqingHe/fix-dragonball-default-build dragonball: Fix warnings in default build	2025-07-28 11:24:46 +08:00
Xuewei Niu	e5d5768c75	Merge pull request #11626 from RuoqingHe/bump-cloud-hypervisor-v47 versions: Upgrade to Cloud Hypervisor v47.0	2025-07-28 10:34:45 +08:00
Ruoqing He	4ca6c2d917	mem-agent: Ignore Cargo.lock `mem-agent` here is now a library and does not contain examples; ignore Cargo.lock to get rid of untracked file noise produced by `cargo run` or `cargo test`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-28 10:32:46 +08:00
Ruoqing He	3ec10b3721	runtime: clh: Re-generate client code against v47.0 Re-generates the client code against Cloud Hypervisor v47.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 20:44:14 +02:00
Ruoqing He	14e9d2c815	versions: Upgrade to Cloud Hypervisor v47.0 Details of v47.0 release can be found in our roadmap project as iteration v47.0: https://github.com/orgs/cloud-hypervisor/projects/6. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 20:42:24 +02:00
Xuewei Niu	6f6d64604f	Merge pull request #11598 from justxuewei/cgroups	2025-07-25 17:53:03 +08:00
Hyounggyu Choi	860779c4d9	Merge pull request #11621 from Apokleos/enhance-copyfile runtime-rs: Some extra work to enhance copyfile with sharedfs disabled	2025-07-25 11:27:03 +02:00
Ruoqing He	639273366a	dragonball: Gate `MmapRegion` behind `virtio-fs` `MmapRegion` is only used while `virtio-fs` is enabled during testing dragonball, gate the import behind `virtio-fs` feature. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 09:09:35 +00:00
Ruoqing He	2e81ac463a	dragonball: Allow unused to suppress warnings Some variables go unused if certain features are not enabled; use `#[allow(unused)]` to suppress those warnings for the time being. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 09:07:19 +00:00
Ruoqing He	5f7da1ccaa	dragonball: Silence never read fields Some fields in structures used for testing purpose are never read, rename to send out the message. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 09:07:19 +00:00
Ruoqing He	225e6fffbc	dragonball: Gate `VcpuManagerError` behind `host-device` `VcpuManagerError` is only needed when `host-device` feature is enabled, gate the import behind that feature. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 09:07:19 +00:00
Ruoqing He	0502b05718	dragonball: Remove `with-serde` feature assertion Code inside `test_mac_addr_serialization_and_deserialization` test does not actually require this `with-serde` feature to test, removing the assertion here to enable this test. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 09:05:55 +00:00
Xuewei Niu	60e3679eb7	runtime-rs: Add full cgroups support on host Add full cgroups support on host. Cgroups are managed by `FsManager` and `SystemdManager`. As the names imply, the `FsManager` manages cgroups through cgroupfs, while the `SystemdManager` manages cgroups through systemd. Both managers support cgroup v1 and cgroup v2. Two types of cgroups path are supported: 1. For colon paths, for example "foo.slice:bar:baz", the runtime manages cgroups by `SystemdManager`; 2. For relative/absolute paths, the runtime manages cgroups by `FsManager`. With cgroup v1 + cgroupfs, vCPU threads are added into the sandbox cgroups; with the other combinations (cgroup v1 + systemd, cgroup v2 + cgroupfs, cgroup v2 + systemd), the VMM process is added into the cgroups. Systemd doesn't provide a way to add a thread to a unit, so `add_thread()` in `SystemdManager` is equivalent to `add_process()`. Cgroup v2 supports threaded mode. However, we should enable threaded mode from leaf node to the root node (`/`) iteratively [1]. This means the runtime needs to modify the cgroups created by container runtime (e.g. containerd). Considering cgroupfs + cgroup v2 is not a common combination, its behavior is aligned with systemd + cgroup v2, which is not allowed to manage processes at the thread level. 1: https://www.kernel.org/doc/html/v4.18/admin-guide/cgroup-v2.html#threads Fixes: #11356 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-07-25 14:52:55 +08:00
alex.lyn	613dba6f1f	runtime-rs: Some extra work to enhance copyfile with sharedfs disabled For several reasons, copyfile should first be aligned with runtime-go; this commit does that work. Fixes #11543 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-25 11:39:20 +08:00
Xuewei Niu	6aa3517393	tests: Prevent the shim from being killed in k8s-oom test The actual memory usage on the host is equal to the hypervisor memory usage plus the user memory usage. An OOM killer might kill the shim when the memory limit on host is same with that of container and the container consumes all available memory. In this case, the containerd will never receive OOM event, but get "task exit" event. That makes the `k8s-oom.bats` test fail. The fix is to add a new container to increase the sandbox memory limit. When the container "oom-test" is killed by OOM killer, there is still available memory for the shim, so it will not be killed. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-07-24 23:44:21 +08:00
Steve Horsman	c762a3dd4f	Merge pull request #11372 from kata-containers/dependabot/cargo/src/dragonball/openssl-af8515b6e0 build(deps): bump the openssl group across 4 directories with 1 update	2025-07-24 13:27:24 +01:00
Fupan Li	fdbe549368	Merge pull request #11547 from Apokleos/virtio-scsi runtime-rs: support block device driver virtio-scsi within qemu-rs	2025-07-24 18:02:11 +08:00
Xuewei Niu	635272f3e8	runtime-rs: Ignore SIGTERM signal in shim When enabling systemd cgroup driver and sandbox cgroup only, the shim is under a systemd unit. When the unit is stopping, systemd sends SIGTERM to the shim. The shim can't exit immediately, as there are some cleanups to do. Therefore, ignoring SIGTERM is required here. The shim should complete the work within a period (Kata sets it to 300s by default). Once a timeout occurs, systemd will send SIGKILL. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-07-24 17:15:15 +08:00
Xuewei Niu	79f29bc523	runtime-rs: QEMU get_thread_ids() returns real vCPU's tids The information is obtained through QMP query_cpus_fast. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-07-24 17:15:15 +08:00
stevenhorsman	475baf95ad	ci: Try passing api token into github api call Our CI keeps on getting ``` jq: error (at <stdin>:1): Cannot index string with string "tag_name" ``` during the install dependencies phase, which I suspect might be due to github rate limits being reduced, so try to pass through the `GH_TOKEN` env and use it in the auth header. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-24 08:49:32 +01:00
alex.lyn	b40d65bc1b	runtime-rs: support block device driver virtio-scsi within qemu-rs It is important that we continue to support VirtIO-SCSI. While VirtIO-BLK is a common choice, virtio-scsi offers significant performance advantages in specific scenarios, particularly when utilizing iothreads and with NVMe Fabrics. By supporting both virtio-blk and virtio-scsi, we give users the flexibility to choose the optimal storage interface (virtio-blk or virtio-scsi) based on their specific workload requirements and hardware configurations. The virtio-scsi controller is already created when the qemu vm starts with the block device driver set to `virtio-scsi`. This commit adds the backend block device with blockdev_add and the frontend virtio-scsi device with device_add via qmp. Fixes #11516 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 14:00:02 +08:00
alex.lyn	e683a7fd37	runtime-rs: Change the device_id with block device index The block device index is a very important unique id of a block device and identifies it just as device_id does. Because the index is required to calculate the scsi LUN, and to drop useless arguments when reusing `hotplug_block_device`, we replace device_id with the block device index. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 11:57:00 +08:00
alex.lyn	4521cae0c0	runtime-rs: Support AIO for hotplugging block device within qemu In this commit, block device aio is introduced in hotplug_block_device within qemu via qmp, and "iouring" is set as the default. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 11:57:00 +08:00
alex.lyn	b4d276bc2b	runtime-rs: Handle virtio-scsi within device manager The device manager should handle it correctly in create_block_device when the driver_option is virtio-scsi. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 11:57:00 +08:00
alex.lyn	fbd84fd3f4	runtime-rs: Support virtio-scsi device within handle_block_volume It supports handling scsi device when block device driver is `scsi`. And it will ensure a correct storage source with LUN. Fixes #11516 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 11:57:00 +08:00
alex.lyn	57645c0786	runtime-rs: Add support for block device AIO In this commit, three block device aio modes are introduced and "iouring" is set as the default. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 11:57:00 +08:00
alex.lyn	40e6aacc34	runtime-rs: Introduce scsi_addr within BlockConfig for SCSI devices It's used to help discover scsi devices inside guest and also add a new const value `KATA_SCSI_DEV_TYPE` to help pass information. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 11:57:00 +08:00
alex.lyn	125383e53c	runtime-rs: Add support for configurable block device aio AIO is the I/O mechanism used by qemu with options: - threads Pthread based disk I/O. - native Native Linux I/O. - io_uring (default mode) Linux io_uring API. This provides the fastest I/O operations on Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 11:56:52 +08:00
dependabot[bot]	ef9d960763	build(deps): bump the openssl group across 4 directories with 1 update Bumps the openssl group with 1 update in the /src/dragonball directory: [openssl](https://github.com/sfackler/rust-openssl). Bumps the openssl group with 1 update in the /src/runtime-rs directory: [openssl](https://github.com/sfackler/rust-openssl). Bumps the openssl group with 1 update in the /src/tools/genpolicy directory: [openssl](https://github.com/sfackler/rust-openssl). Bumps the openssl group with 1 update in the /src/tools/kata-ctl directory: [openssl](https://github.com/sfackler/rust-openssl). Updates `openssl` from 0.10.72 to 0.10.73 - [Release notes](https://github.com/sfackler/rust-openssl/releases) - [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73) Updates `openssl` from 0.10.72 to 0.10.73 - [Release notes](https://github.com/sfackler/rust-openssl/releases) - [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73) Updates `openssl` from 0.10.72 to 0.10.73 - [Release notes](https://github.com/sfackler/rust-openssl/releases) - [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73) Updates `openssl` from 0.10.72 to 0.10.73 - [Release notes](https://github.com/sfackler/rust-openssl/releases) - [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73) --- updated-dependencies: - dependency-name: openssl dependency-version: 0.10.73 dependency-type: indirect update-type: version-update:semver-patch dependency-group: openssl - dependency-name: openssl dependency-version: 0.10.73 dependency-type: indirect update-type: version-update:semver-patch dependency-group: openssl - dependency-name: openssl dependency-version: 0.10.73 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: openssl - dependency-name: openssl dependency-version: 0.10.73 dependency-type: indirect update-type: version-update:semver-patch dependency-group: openssl ... Signed-off-by: dependabot[bot] <support@github.com>	2025-07-23 15:17:12 +00:00
Fabiano Fidêncio	58925714d2	Merge pull request #11579 from Apokleos/fix-hotplug-blk runtime-rs: Support hotplugging host block devices within qemu-rs	2025-07-23 11:10:04 +02:00
alex.lyn	a12ae58431	runtime-rs: Support hotplugging host block devices within qemu-rs Although the previous implementation of hotplugging block devices via QMP can successfully hot-plug a regular-file-based block device, it fails when the backend is /dev/xxx (e.g. /dev/loop0). Analysis shows that it lacks the ability to hotplug host block devices. This commit fills the gap and makes it work well for host block devices. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-22 15:40:03 +08:00
Fabiano Fidêncio	acae4480ac	Merge pull request #11604 from fidencio/release/3.19.1 release: Bump version to 3.19.1	2025-07-22 09:00:15 +02:00
Fabiano Fidêncio	0220b4d661	release: Bump version to 3.19.1 As there were a few moderate security vulnerability fixes missed as part of the 3.19.0 release. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-21 20:09:21 +02:00
Steve Horsman	09efcfbd86	Merge pull request #11606 from kata-containers/dependabot/cargo/src/tools/genpolicy/zerocopy-0.6.6 build(deps): bump zerocopy from 0.6.1 to 0.6.6 in /src/tools/genpolicy	2025-07-21 18:58:56 +01:00
Steve Horsman	9f04d8e121	Merge pull request #11605 from kata-containers/dependabot/cargo/src/tools/kata-ctl/unsafe-libyaml-0.2.11 build(deps): bump unsafe-libyaml from 0.2.9 to 0.2.11 in /src/tools/kata-ctl	2025-07-21 18:50:01 +01:00
dependabot[bot]	a9c8377073	build(deps): bump zerocopy from 0.6.1 to 0.6.6 in /src/tools/genpolicy --- updated-dependencies: - dependency-name: zerocopy dependency-version: 0.6.6 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-07-21 12:50:38 +00:00
dependabot[bot]	0b4c434ece	build(deps): bump unsafe-libyaml in /src/tools/kata-ctl Bumps [unsafe-libyaml](https://github.com/dtolnay/unsafe-libyaml) from 0.2.9 to 0.2.11. - [Release notes](https://github.com/dtolnay/unsafe-libyaml/releases) - [Commits](https://github.com/dtolnay/unsafe-libyaml/compare/0.2.9...0.2.11) --- updated-dependencies: - dependency-name: unsafe-libyaml dependency-version: 0.2.11 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-07-21 12:46:27 +00:00
Fabiano Fidêncio	35629d0690	Merge pull request #11603 from stevenhorsman/security-updates-21-jul dependencies: More crate bumps to resolve security issues	2025-07-21 14:33:07 +02:00
stevenhorsman	162ba19b85	agent-ctl: Bump rustls Bump rustls to >=0.23.18 to remediate RUSTSEC-2024-0399 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-21 10:41:59 +01:00
stevenhorsman	42339e9cdf	dragonball: Update url crate Update url to 2.5.4 to bump idna to 1.0.3 and remediate RUSTSEC-2024-0421 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-21 10:35:05 +01:00
stevenhorsman	1795361589	runk: Update rustjail Update the rustjail crate to pull in the latest security fixes Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-21 10:31:18 +01:00
stevenhorsman	28929f5b3e	runtime: Bump prometheus Bump this crate to remove the old version of protobuf and remediate RUSTSEC-2024-0437 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-21 10:29:57 +01:00
stevenhorsman	e66aa1ef8c	runtime: Bump prometheus and ttrpc-codegen Bump these crates to remove the old version of protobuf and remediate RUSTSEC-2024-0437 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-21 10:29:39 +01:00
Fabiano Fidêncio	d60513ece9	Merge pull request #11597 from kata-containers/topic/fix-release-static-tarball-content release: Copy the VERSION file to the tarball	2025-07-20 21:06:40 +02:00
Fabiano Fidêncio	55aae75ed7	shellcheck: Fix issues on kata-deploy-merge-builds.sh As we're already touching the file, let's get those fixed. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-20 09:33:50 +02:00
Fabiano Fidêncio	aaeb3b3221	release: Copy the VERSION file to the tarball For the release itself, let's simply copy the VERSION file to the tarball. To do so, we had to change the logic that merges the build, as at that point the tag is not yet pushed to the repo. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-20 00:06:14 +02:00
Fabiano Fidêncio	21ccaf4a80	Merge pull request #11596 from fidencio/release/v3.19.0 release: Bump version to 3.19.0	2025-07-19 18:27:36 +02:00
Fabiano Fidêncio	60f312b4ae	release: Bump version to 3.19.0 Bump VERSION and helm-chart versions Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-19 09:11:30 +02:00
Fabiano Fidêncio	1351ccb2de	Merge pull request #11576 from Tim-Zhang/update-protobuf-to-fix-CVE-2025-53605 chore: Update protobuf to fix CVE-2025-53605	2025-07-19 07:43:13 +02:00
Fabiano Fidêncio	7f5f032aca	runtime-rs: Update containerd-shim / containerd-shim-protos Let's bump those to their 0.10.0 releases, which contain fixes for the CVE-2025-53605. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-19 00:18:01 +02:00
Fabiano Fidêncio	6dc4c0faae	Merge pull request #11589 from fidencio/topic/fix-tdx-qemu-path-for-non-gpu qemu: tdx: Fix binary path for non-gpu TDX	2025-07-18 17:24:00 +02:00
Tim Zhang	2fe9df16cc	agent-ctl: update Cargo.lock to fix CVE-2025-53605 Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/392 Fixes: #11570 Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-18 16:13:25 +02:00
Tim Zhang	45b44742de	genpolicy: update Cargo.lock to fix CVE-2025-53605 Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/394 Fixes: #11570 Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-18 16:10:52 +02:00
Tim Zhang	fa9ff1b299	kata-ctl: update prometheus/protobuf to fix CVE-2025-53605 Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/395 Fixes: #11570 Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-18 16:05:13 +02:00
Tim Zhang	d0e7a51f7b	dragonball: update prometheus/protobuf to fix CVE-2025-53605 Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/396 Fixes: #11570 Signed-off-by: Tim Zhang <tim@hyper.sh>	2025-07-18 16:02:29 +02:00
Tim Zhang	222393375a	agent: update ttrpc-codegen to remove dependency on protobuf v2 To fix CVE-2025-53605. Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/397 Fixes: #11570 Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-18 16:02:07 +02:00
Fabiano Fidêncio	60c3d89767	Merge pull request #11558 from gmintoco/feature/helm-nodeSelector helm: add nodeSelector support to kata-deploy chart	2025-07-18 15:52:19 +02:00
Fabiano Fidêncio	3143787f69	qemu: tdx: Fix binary path for non-gpu TDX On commit `90bc749a19`, we've changed the QEMUTDXPATH in order to get it to work with GPUs, but the change broke the non-GPU TDX use-case, which depends on the distro binary. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-18 15:26:27 +02:00
Fabiano Fidêncio	497a3620c2	tests: Remove references to qemu-sev As it's been removed from our codebase. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-18 12:49:54 +02:00
Fabiano Fidêncio	17ce44083c	runtime: Remove reference to sev package Otherwise it'll just break static checks. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-18 12:49:54 +02:00
Gus Minto-Cowcher	3b5cd2aad6	helm: remove qemu-sev references qemu-sev support has been removed, but those bits were left behind by mistake. Signed-off-by: Gus Minto-Cowcher <gus@basecamp-research.com>	2025-07-18 12:49:54 +02:00
Gus Minto-Cowcher	41d41d51f7	helm: add nodeSelector support to kata-deploy chart - Add nodeSelector configuration to values.yaml with empty default - Update DaemonSet template to conditionally include nodeSelector - Add documentation and examples for nodeSelector usage in README - Allows users to restrict kata-containers deployment to specific nodes by labeling them Signed-off-by: Gus Minto-Cowcher <gus@basecamp-research.com>	2025-07-18 12:49:54 +02:00
Fabiano Fidêncio	7d709a0759	Merge pull request #11493 from stevenhorsman/agent-ctl-tag-cache ci: cache: Tag agent-ctl cache	2025-07-18 12:12:46 +02:00
Fabiano Fidêncio	4a6c718f23	Merge pull request #11584 from zvonkok/fix-kernel-debug-enabled kernel: fix enable kernel debug	2025-07-18 11:38:36 +02:00
Sumedh Alok Sharma	47184e82f5	Merge pull request #11313 from Ankita13-code/ankitapareek/exec-id-agent-fix agent: update the processes hashmap to use exec_id as primary key	2025-07-18 14:07:15 +05:30
Fabiano Fidêncio	d9daddce28	Merge pull request #11578 from justxuewei/vsock-async runtime-rs: Fix the issue of blocking socket with Tokio	2025-07-18 10:13:03 +02:00
Xuewei Niu	629c942d4b	runtime-rs: Fix the issue of blocking socket with Tokio According to the issue [1], Tokio will panic when we are giving a blocking socket to Tokio's `from_std()` method, the information is as follows: ``` A panic occurred at crates/agent/src/sock/vsock.rs:59: Registering a blocking socket with the tokio runtime is unsupported. If you wish to do anyways, please add `--cfg tokio_allow_from_blocking_fd` to your RUSTFLAGS. ``` A workaround is to set the socket to non-blocking. 1: https://github.com/tokio-rs/tokio/issues/7172 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-07-18 10:55:48 +08:00
Xuewei Niu	1508e6f0f5	agent: Bump Tokio to v1.46.1 Tokio now has a newer version, let us bump it. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-07-18 10:55:48 +08:00
Xuewei Niu	5a4050660a	runtime-rs: Bump Tokio to v1.46.1 Tokio now has a newer version, let us bump it. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-07-18 10:55:48 +08:00
Zvonko Kaiser	a786dc48b0	kernel: fix enable kernel debug The KERNEL_DEBUG_ENABLED was missing in the outer shell script so overrides via make were not possible. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-18 02:24:19 +00:00
Fabiano Fidêncio	eb2bfbf7ac	Merge pull request #11572 from stevenhorsman/RUSTSEC-2024-0384-remediate More crate bumps for security remediations	2025-07-17 22:35:05 +02:00
Zvonko Kaiser	cef9485634	Merge pull request #11450 from kata-containers/dependabot/cargo/src/agent/nix-0.27.1 build(deps): bump nix to 0.26.4 in agent, libs, runtime-rs	2025-07-17 14:22:40 -04:00
stevenhorsman	41a608e5ce	tools: Bump borsh, liboci-cli and oci-spec Bump these crates to remove the unmaintained dependency proc-macro-error and remediate RUSTSEC-2024-0370 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 18:23:19 +01:00
stevenhorsman	e56f493191	deps: Bump zbus, serial_test & async-std Bump these crates across various components to remove the dependency on unmaintained instant crate and remediate RUSTSEC-2024-0384 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 18:23:19 +01:00
stevenhorsman	bb820714cb	agent-ctl: Update borsh - Update borsh to remove the unmaintained dependency proc-macro-error and remediate RUSTSEC-2024-0370 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 18:23:19 +01:00
Steve Horsman	549fd2a196	Merge pull request #11581 from stevenhorsman/osv-scanner-action-permissions-fix workflow: Fix osv-scanner action	2025-07-17 18:18:16 +01:00
stevenhorsman	a7e27b9b68	workflow: Fix osv-scanner action - The github generated template had an old version which isn't valid for the pr-scan, so update to the latest - The action needs also `actions: read` and `contents:read` to run in kata-containers Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 17:29:35 +01:00
Steve Horsman	8741f2ab3d	Merge pull request #11580 from kata-containers/osv-scanner-action workflow: Add osv-scanner action	2025-07-17 17:00:34 +01:00
stevenhorsman	1a75c12651	workflow: Add osv-scanner action Add action to check for vulnerabilities in the project and on each PR Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 16:41:56 +01:00
stevenhorsman	4c776167e5	trace-forwarder: Add nix features Some of the nix apis we are using are now enabled by features, so add these to resolve the compilation issues Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 15:09:21 +01:00
dependabot[bot]	cd79108c77	build(deps): bump nix in /src/tools/trace-forwarder Bumps [nix](https://github.com/nix-rust/nix) from 0.23.1 to 0.30.1. - [Changelog](https://github.com/nix-rust/nix/blob/master/CHANGELOG.md) - [Commits](https://github.com/nix-rust/nix/compare/v0.23.1...v0.30.1) --- updated-dependencies: - dependency-name: nix dependency-version: 0.30.1 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2025-07-17 15:09:06 +01:00
stevenhorsman	9185ef1a67	runtime-rs: Bump nix to matching version runtime-rs needs the same version as libs, so sync this up as well. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 15:08:46 +01:00
dependabot[bot]	219ad505c2	build(deps): bump nix from 0.24.3 to 0.26.4 in /src/agent Nix needs to be in sync between libs and agent, so bump the agent to the libs version Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 15:01:06 +01:00
dependabot[bot]	a4d22fe330	build(deps): bump nix from 0.24.2 to 0.26.4 in /src/libs --- updated-dependencies: - dependency-name: nix dependency-version: 0.26.4 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2025-07-17 15:01:06 +01:00
Fabiano Fidêncio	6dabb3683f	Merge pull request #10961 from zvonkok/shellcheck-zero shellcheck: fix kernel/build.sh	2025-07-17 12:59:00 +02:00
Steve Horsman	405f5283f0	Merge pull request #11573 from arvindskumar99/versions_comment OVMF: Making comment in versions.yaml for SEV-SNP	2025-07-17 10:11:58 +01:00
Fabiano Fidêncio	32d40849fa	Merge pull request #11577 from Xynnn007/bump-gc deps(chore): bump guest-components to candidate v0.14.0	2025-07-17 11:08:36 +02:00
Zvonko Kaiser	ca4f96ed00	shellcheck: fix kernel/build.sh Refactor code to make shellcheck happy Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-17 10:15:41 +02:00
Xynnn007	82b890349d	deps(chore): bump guest-components to candidate v0.14.0 This new version of gc fixes s390x attestation, also introduces registry configuration setting directly via initdata. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-07-17 10:19:02 +08:00
stevenhorsman	51f41b1669	ci: cache: Tag agent-ctl cache The peer pods project is using the agent-ctl tool in some tests, so tagging our cache will let them more easily identify development versions of kata for testing between releases. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-16 11:32:33 +01:00
Fupan Li	75d23b8884	Merge pull request #11504 from lifupan/fix_fd_leak agent: fix the issue of parent writer pipe fd leak	2025-07-16 18:29:24 +08:00
Fupan Li	83f54eec52	agent: fix the issue of parent writer pipe fd leak Sometimes, containers or execs do not use stdin, so there is no chance to add the parent stdin to the process's writer hashmap, resulting in the parent stdin's fd not being closed when the process is cleaned up later. Therefore, when creating a process, first explicitly add the parent stdin to the writer hashmap. This makes sure that the parent stdin's fd can be closed when the process is cleaned up later. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-07-16 16:15:31 +08:00
Fupan Li	752c8b611e	Merge pull request #11575 from Tim-Zhang/fix-runk-build runk: Fix build errors	2025-07-16 15:23:58 +08:00
Arvind Kumar	2a52351822	OVMF: Making comment in versions.yaml for SEV-SNP Adding comment to versions.yaml to indicate that the ovmf-sev is also used by AMD SEV-SNP, as per the discussion in https://github.com/kata-containers/kata-containers/pull/11561. Signed-off-by: Arvind Kumar <arvinkum@amd.com>	2025-07-16 06:35:21 +02:00
Tim Zhang	c8183a2c14	runk: rename imported crate from users to uzers To adapt to the new crate name and fix build errors introduced in the commit `39f51b4c6d` Fixes: #11574 Signed-off-by: Tim Zhang <tim@hyper.sh>	2025-07-16 11:35:39 +08:00
Fabiano Fidêncio	9cebbab29d	Merge pull request #11335 from zvonkok/fix-kata-deploy.sh gpu: Fix kata deploy.sh	2025-07-15 19:50:44 +02:00
Fabiano Fidêncio	c8b7a51d72	Merge pull request #11082 from zvonkok/debug-kernel kernel: debug config	2025-07-15 19:04:15 +02:00
Zvonko Kaiser	c56c896fc6	qemu: remove the experimental suffix for qemu-snp We switched to vanilla QEMU for the CPU SNP use-case. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-15 16:49:58 +02:00
Zvonko Kaiser	a282fa6865	gpu: Add TDX related runtime adjustments We have the QEMU adjustments for SNP but missing those for TDX Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-15 16:49:56 +02:00
Zvonko Kaiser	0d2993dcfd	kernel: bump kernel version Obligatory kernel version bump Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-15 16:48:23 +02:00
Zvonko Kaiser	a4597672c0	kernel: Add KERNEL_DEBUG_ENABLED to build scripts We want to be able to build a debug version of the kernel for various use-cases like debugging, tracing and others. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-15 16:48:03 +02:00
Fabiano Fidêncio	b7af7f344b	Merge pull request #11569 from Xynnn007/bump-coco deps(chore): update guest-components and trustee	2025-07-15 16:34:23 +02:00
Fabiano Fidêncio	aac555eeff	Merge pull request #11571 from fidencio/topic/fix-nvidia-gpu-initrd-cache build: Fix cache for nvidia-gpu-initrd builds	2025-07-15 16:28:03 +02:00
Fabiano Fidêncio	4415a47fff	Merge pull request #11557 from Apokleos/fix-initdata runtime-rs: Fix initdata length field missing when create block	2025-07-15 16:22:45 +02:00
Fabiano Fidêncio	11c744c5c3	Merge pull request #11567 from zvonkok/remove-gpu-admin-tools Remove gpu admin tools	2025-07-15 15:11:56 +02:00
Fabiano Fidêncio	fa7598f6ec	Merge pull request #11568 from zvonkok/tdx-qemu-path gpu: Add proper TDX config path	2025-07-15 14:54:13 +02:00
Fabiano Fidêncio	3e86f3a95c	build: Rename rootfs-nvidia-* to fix cache issues The convention for rootfs-* names is: * rootfs-${image_type}-${special_build} If this is not followed, cache will never work as expected, leading to building the initrd / image on every single build, which is especially costly when building the nvidia specific targets. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-15 14:48:45 +02:00
alex.lyn	56c0c172fa	runtime-rs: Fix initdata length field missing when create block The init data could not be read properly within kata-agent because the data length field was omitted, a consequence of a mismatch in the data write format. Fixes #11556 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-15 19:22:17 +08:00
Fabiano Fidêncio	b76efa2a25	Merge pull request #11564 from BbolroC/make-qemu-coco-dev-s390x-required ci: Make qemu-coco-dev for s390x (zVSI) required again	2025-07-15 12:04:18 +02:00
Xynnn007	4da31bf2f9	agent: deliver initdata toml to attestation agent AA now supports receiving the initdata toml plaintext and delivering it in the attestation. This patch creates a file under '/run/confidential-containers/initdata' to store the initdata toml and gives it to the AA process. When we have a separate component to handle initdata, we will move the logic to that component. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-07-15 17:10:56 +08:00
Steve Horsman	d219fc20e1	Merge pull request #11555 from stevenhorsman/rust-advisory-fixes-pre-3.19.0 Rust advisory fixes pre 3.19.0	2025-07-15 09:11:33 +01:00
Hui Zhu	3577e4bb43	Merge pull request #11480 from teawater/update_ma mem-agent: Update to https://github.com/teawater/mem-agent/tree/kata-20250627	2025-07-15 15:22:10 +08:00
Xynnn007	19001af1e2	deps(chore): update guest-components and trustee to the version of pre v0.14.0 Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-07-15 09:12:47 +08:00
teawater	028f25ac84	mem-agent: Update to kata-20250627 Update to https://github.com/teawater/mem-agent/tree/kata-20250627. The commit list: 3854b3a Update nix version from 0.23.2 to 0.30.1 d9a4ced Update tokio version from 1.33 to 1.45.1 9115c4d run_eviction_single_config: Simplify check evicted pages after eviction 68b48d2 get_swappiness: Use a rounding method to obtain the swappiness value 14c4508 run_eviction_single_config: Add max_seq and min_seq check with each info 8a3a642 run_eviction_single_config: Move infov update to main loop b6d30cf memcg.rs: run_aging_single_config: Fix error of last_inc_time check 54fce7e memcg.rs: Update anon eviction code 41c31bf cgroup.rs: Fix build issue with musl 0d6aa77 Remove lazy_static from dependencies a66711d memcg.rs: update_and_add: Fix memcg not work after set memcg issue cb932b1 Add logs and change some level of some logs 93c7ad8 Add per-cgroup and per-numa config support 092a75b Remove all Cargo.lock to support different versions of rust 540bf04 Update mem-agent-srv, mem-agent-ctl and mem-agent-lib to v0.2.0 81f39b2 compact.rs: Change default value of compact_sec_max to 300 c455d47 compact.rs: Fix psi_path error with cgroup v2 issue 6016e86 misc.rs: Fix log error ded90e9 Set mem-agent-srv and mem-agent-ctl as bin Fixes: #11478 Signed-off-by: teawater <zhuhui@kylinos.cn>	2025-07-15 08:57:41 +08:00
Zvonko Kaiser	90bc749a19	gpu: Add proper TDX config path This was missed during the GPU TDX experimental enablement Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-14 23:26:28 +00:00
Zvonko Kaiser	da17b06d28	gpu: Pin toolkit version New versions have incompatibilities, pin toolkit to a working version Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-14 22:07:21 +00:00
Zvonko Kaiser	97a4a1574e	gpu: Remove gpu-admin-tools NVRC got a new feature reading the CC mode directly from register Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-14 21:59:31 +00:00
stevenhorsman	18597588c0	agent: Bump cdi version Bump cdi version to pick up fixes for: - RUSTSEC-2025-0024 - RUSTSEC-2025-0023 - RUSTSEC-2024-0370 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-14 16:54:30 +01:00
stevenhorsman	661d88b11f	versions: Bump oci-spec Try bumping oci-spec to 0.8.1 as it included fixes for vulnerabilities including RUSTSEC-2024-0370 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-14 16:54:30 +01:00
Fabiano Fidêncio	579d373623	Merge pull request #11521 from stevenhorsman/idna-1.0.4-bump versions: Bump idna crate to >= 1.0.3	2025-07-14 17:39:30 +02:00
Fabiano Fidêncio	f5decea13e	Merge pull request #11550 from stevenhorsman/runtime-rs-bump-chrono-0.4.41 runtime-rs \| trace-forwarder: Bump chrono crate version	2025-07-14 16:45:58 +02:00
Steve Horsman	0fa2cd8202	Merge pull request #11519 from wainersm/tests_teardown_common tests/k8s: instrument some tests for debugging	2025-07-14 13:20:01 +01:00
Hyounggyu Choi	a224b4f9e4	ci: Make qemu-coco-dev for s390x (zVSI) required again As the following job has passed 10 days in a row for the nightly test: ``` kata-containers-ci-on-push / run-k8s-tests-on-zvsi / run-k8s-tests (nydus, qemu-coco-dev, kubeadm) ``` this commit makes the job required again. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-07-14 11:03:54 +02:00
Wainer dos Santos Moschetta	f0f1974e14	tests/k8s: call teardown_common in k8s-parallel.bats The teardown_common will print the description of the running pods, kill them all and print the system's syslogs afterwards. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-07-12 10:13:51 +01:00
Wainer dos Santos Moschetta	8dfeed77cd	tests/k8s: add handler for Job in set_node() Set the node in the spec template of a Job manifest, allowing to use set_node() on tests like k8s-parallel.bats Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-07-12 10:13:51 +01:00
Wainer dos Santos Moschetta	806d63d1d8	tests/k8s: call teardown_common in k8s-credentials-secrets.bats The teardown_common will print the description of the running pods, kill them all and print the system's syslogs afterwards. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-07-12 10:13:51 +01:00
Wainer dos Santos Moschetta	c8f40fe12c	tests/k8s: call teardown_common in k8s-sandbox-vcpus-allocation.bats The teardown_common will print the description of the running pods, kill them all and print the system's syslogs afterwards. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-07-12 10:13:51 +01:00
Fabiano Fidêncio	4a79c2520d	Merge pull request #11491 from Apokleos/default-blk-driver runtime-rs: Change default block device driver from virtio-scsi to virtio-blk-*	2025-07-11 23:14:13 +02:00
alex.lyn	9cc14e4908	runtime-rs: Update block device driver docs within configuration The previous description for the `block_device_driver` was inaccurate or outdated. This commit updates the documentation to provide a more precise explanation of its function. Fixes #11488 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-11 17:40:58 +02:00
alex.lyn	92160c82ff	runtime-rs: Change block device driver default to virtio-blk-* When we run a kata pod with runtime-rs/qemu and with a default configuration toml, it will fail with error "unsupported driver type virtio-scsi". As virtio-scsi within runtime-rs is not so popular, we set the default block device driver to `virtio-blk-*`. Fixes #11488 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-11 17:40:58 +02:00
Ankita Pareek	5f08cc75b3	agent: update the processes hashmap to use exec_id as primary key This patch changes the container process HashMap to use exec_id as the primary key instead of PID, preventing exec_id collisions that could be exploited in Confidential Computing scenarios where the host is less trusted than the guest. Key changes: - Changed `processes: HashMap<pid_t, Process>` to `HashMap<String, Process>` - Added exec_id collision detection in `start()` method - Updated process lookup operations to use exec_id directly - Simplified `get_process()` with direct HashMap access This prevents multiple exec operations from reusing the same exec_id, which could be problematic in CoCo use cases where process isolation and unique identification are critical for security. Signed-off-by: Ankita Pareek <ankitapareek@microsoft.com>	2025-07-11 10:10:23 +00:00
Steve Horsman	878e50f978	Merge pull request #11554 from fidencio/topic/fix-version-file-on-release gh: Fix released VERSION file	2025-07-11 09:20:06 +01:00
Fabiano Fidêncio	fb22e873cd	gh: Fix released VERSION file The `/opt/kata/VERSION` file, which is created using `git describe --tags`, requires the newly released tag to be updated in order to be accurate. To do so, let's add a `fetch-tags: true` to the checkout action used during the `create-kata-tarball` job. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-11 09:47:11 +02:00
Alex Lyn	87e41e2a09	Merge pull request #11549 from stevenhorsman/bump-remove_dir_all runtime-rs: Switch tempdir to tempfile	2025-07-11 13:46:12 +08:00
Alex Lyn	f22272b8f7	Merge pull request #11540 from Apokleos/coldplug-vfio-clh runtime-rs: Add vfio support with coldplug for cloud-hypervisor	2025-07-11 10:33:59 +08:00
RuoqingHe	7cd4e3278a	Merge pull request #11545 from RuoqingHe/remove-lockfile-for-libs libs: Remove lockfile for libs	2025-07-10 21:56:10 +08:00
stevenhorsman	c740896b1c	trace-forwarder: Bump chrono crate version Bump chrono version to drop time@0.1.43 and remediate vulnerability CVE-2020-26235 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-10 14:55:20 +01:00
stevenhorsman	3916507553	runtime-rs: Bump chrono crate version Bump chrono version to drop time@0.1.45 and remediate vulnerability CVE-2020-26235 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-10 13:47:05 +01:00
Wainer dos Santos Moschetta	3ab6a8462d	ci/gatekeeper: make run-k8s-tests-coco-nontee job required The CoCo non-TEE job (run-k8s-tests-coco-nontee) used to be required but we had to withdraw it to fix a problem (#11156). Now the job is back running and stable, so time to make it required again. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-07-10 12:19:19 +01:00
stevenhorsman	c5ceae887b	runtime-rs: Switch tempdir to tempfile tempdir hasn't been updated for seven years and pulls in remove_dir_all@0.5.3 which has security advisory GHSA-mc8h-8q98-g5hr, so replace this with using tempfile, which the crate got merged into and we use elsewhere in the project Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-10 12:16:35 +01:00
Ruoqing He	4039506740	libs: Ignore Cargo.lock in libs workspace Ignore Cargo.lock in `libs` to prevent developers from accidentally tracking lock files in the `libs` workspace. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-10 09:31:45 +00:00
alex.lyn	3fbe493edc	runtime-rs: Convert host devices within VmConfig for cloud-hypervisor This PR adds support for adding a network device before starting the cloud-hypervisor VM. This commit will get the host devices from NamedHypervisorConfig and assign them to VmConfig's devices, which is for vfio devices when clh starts launching. With this, it successfully finishes the vfio devices conversion from a generic Hypervisor config to a clh specific VmConfig. Signed-off-by: alex.lyn <alex.lyn@antgroup.com> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-07-10 16:33:43 +08:00
alex.lyn	0b5b8f549d	runtime-rs: Introduce a field host_devices within NamedHypervisorConfig This commit introduces `host_devices` to help convert vfio devices from a generic hypervisor config to a cloud-hypervisor specific VmConfig. Signed-off-by: alex.lyn <alex.lyn@antgroup.com> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-07-10 16:33:41 +08:00
alex.lyn	d37183d754	runtime-rs: Add vfio support with coldplug for cloud-hypervisor This PR adds support for adding a vfio device before starting the cloud-hypervisor VM (or cold-plug vfio device). This commit changes "pending_devices" for clh implementation via adding DeviceType::Vfio() into pending_devices. And it will get shared host devices after correctly handling vfio devices (Specially for primary device). Signed-off-by: alex.lyn <alex.lyn@antgroup.com> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-07-10 16:32:21 +08:00
Ruoqing He	ffa3a5a15e	libs: Remove Cargo.lock crates in `libs` workspace do not ship binaries, they are just libraries for other workspace to reference, the `Cargo.lock` file hence would not take effect. Removing Cargo.lock for `libs` workspace. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-10 03:14:55 +00:00
Fabiano Fidêncio	c68eb58f3f	Merge pull request #11529 from fidencio/topic/only-use-fixed-version-of-k0s-for-crio tests: k0s: Always use latest version, apart from CRI-O tests	2025-07-09 18:47:18 +02:00
Hyounggyu Choi	09297b7955	Merge pull request #11537 from BbolroC/set-sharedfs-to-none-for-ibm-sel runtime/runtime-rs: Set shared_fs to none for IBM SEL in config file	2025-07-09 18:30:08 +02:00
Hyounggyu Choi	bca31d5a4d	runtime/runtime-rs: Set shared_fs to none for IBM SEL in config file In line with configuration for other TEEs, shared_fs should be set to none for IBM SEL. This commit updates the value for runtime/runtime-rs. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-07-09 14:22:28 +02:00
Fabiano Fidêncio	5f17e61d11	tests: kata-deploy: Remove --wait from helm uninstall As we're using a `kubectl wait --timeout ...` to check whether the kata-deploy pod's been deleted or not, let's remove the `--wait` from the `helm uninstall ...` call as k0s tests were failing because the `kubectl wait --timeout...` was starting after the pod was deleted, making the test fail. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-09 14:01:30 +02:00
Fabiano Fidêncio	842e17b756	tests: k0s: Always use latest version, apart from CRI-O tests We've been pinning a specific version of k0s for CRI-O tests, which may make sense for CRI-O, but doesn't make sense at all when it comes to testing that we can install kata-deploy on latest k0s (and currently our test for that is broken). Let's bump to the latest, and from this point we start debugging, instead of debugging on an ancient version of the project. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-09 13:27:18 +02:00
Steve Horsman	7bc25b0259	Merge pull request #11494 from katexochen/p/opa-1.6 versions: bump opa 1.5.1 -> 1.6.0	2025-07-09 11:45:54 +01:00
Steve Horsman	967f66f677	Merge pull request #11380 from arvindskumar99/sev-deprecation Sev deprecation	2025-07-09 11:38:13 +01:00
stevenhorsman	f96b8fb690	kata-ctl: Update expected test failure message Update expected error after url crate bump Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-09 11:34:27 +01:00
stevenhorsman	b7bf46fdfa	versions: Bump idna crate to >= 1.0.4 Bump url, reqwests and idna crates in order to move away from idna <1.0.3 and remediate CVE-2024-12224. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-09 11:34:27 +01:00
Xuewei Niu	b8838140d0	Merge pull request #11527 from StevenFryto/fix-runtime-rootless-bugs runtime: Fix rootlessDir not correctly set in rootless VMM mode	2025-07-09 16:40:11 +08:00
Steve Horsman	990c4e68ee	Merge pull request #11523 from wainersm/ci_setup_kubectl workflows: adopting azure/setup-kubectl	2025-07-09 09:09:38 +01:00
stevenfryto	3c7a670129	runtime: Fix rootlessDir not correctly set in rootless VMM mode Previously, the rootlessDir variable in `src/runtime/virtcontainers/pkg/rootless.go` was initialized at package load time using `os.Getenv("XDG_RUNTIME_DIR")`. However, in rootless VMM mode, the correct value of $XDG_RUNTIME_DIR is set later during runtime using os.Setenv(), so rootlessDir remained empty. This patch defers the initialization of rootlessDir until the first call to `GetRootlessDir()`, ensuring it always reflects the current environment value of $XDG_RUNTIME_DIR. Fixes: #11526 Signed-off-by: stevenfryto <sunzitai_1832@bupt.edu.cn>	2025-07-09 09:51:48 +08:00
Wainer dos Santos Moschetta	e4da3b84a3	workflows: adopting azure/setup-kubectl There are workflows that rely on `az aks install-cli` to get kubectl installed. There is a well-known problem with install-cli, related to API usage rate limits, that has recently caused the command to fail quite often. This replaces install-cli with the azure/setup-kubectl github action, which has no such rate limit problem. While here, removed the install_cli() function from gha-run-k8s-common.sh to avoid developers using it by mistake in the future. Fixes #11463 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-07-08 15:15:54 -03:00
Alex Lyn	294b2c1c10	Merge pull request #11528 from Apokleos/remote-initdata runtime-rs: add initdata annotation for remote hypervisor	2025-07-08 09:13:13 +08:00
Arvind Kumar	afedad0965	kernel: Removing SEV kernel packages Removing kernel config files relating to SEV as part of the SEV deprecation efforts. Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com> Signed-off-by: Arvind Kumar <arvinkum@amd.com>	2025-07-07 11:21:11 -05:00
Arvind Kumar	ecac3d2d28	runtime: Removing runtime logic for SEV Removing runtime SEV functionality, such as the kbs, ovmf, VMSA handling, and SEV configs as part of deprecating SEV from kata. Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com> Signed-off-by: Arvind Kumar <arvinkum@amd.com>	2025-07-07 11:17:32 -05:00
Arvind Kumar	8eebcef8fb	tests: Removing testing framework for SEV Removing files pertaining to SEV from the CI framework. Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com> Signed-off-by: Arvind Kumar <arvinkum@amd.com>	2025-07-07 11:17:32 -05:00
Arvind Kumar	675ea86aba	kata-deploy: Removing SEV from kata-deploy Removing files related to SEV, responsible for installing and configuring Kata containers. Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com> Signed-off-by: Arvind Kumar <arvinkum@amd.com>	2025-07-07 11:17:32 -05:00
Paul Meyer	ff7ac58579	versions: bump opa 1.5.1 -> 1.6.0 Bumping opa to latest release. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-07-07 14:19:08 +02:00
alex.lyn	fcaade24f4	runtime-rs: add initdata annotation for remote hypervisor Add init data annotation within preparing remote hypervisor annotations when prepare vm, so that it can be passed within CreateVMRequest. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-07 12:46:05 +01:00
Fabiano Fidêncio	110f68a0f1	Merge pull request #11530 from fidencio/topic/tests-fix-runtime-class-check tests: runtimeclasses: Adjust gpu runtimeclasses	2025-07-07 13:42:46 +02:00
Fabiano Fidêncio	2c2995b7b0	tests: runtimeclasses: Adjust gpu runtimeclasses `679cc9d47c` was merged and bumped the podoverhead for the gpu related runtimeclasses. However, the bump on the `kata-runtimeClasses.yaml` was overlooked, making our tests fail due to that discrepancy. Let's just adjust the values here and move on. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-07 11:43:40 +02:00
Fabiano Fidêncio	ef545eed86	Merge pull request #11513 from lifupan/dragonball_6.12.x tools: port the dragonball kernel patch to 6.12.x	2025-07-07 10:31:49 +02:00
Steve Horsman	d291e9bda0	Merge pull request #11336 from zvonkok/fix-podoverhead gpu: Update runtimeClasses for correct podoverhead	2025-07-07 09:20:07 +01:00
Fabiano Fidêncio	a2faf93211	kernel: Bump to v6.12.36 As that's the latest released LTS. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-06 23:48:20 +02:00
Fupan Li	fd21c9df59	tools: port the dragonball kernel patch to 6.12.x Backport the dragonball's kernel patches to 6.12.x kernel version. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-07-06 23:48:20 +02:00
Zvonko Kaiser	679cc9d47c	gpu: Update runtimeClasses for correct podoverhead We cannot rely only on default_cpu and default_memory in the config, default is 1 and 2Gi but we need some overhead for QEMU and the other related binaries running as the pod overhead. Especially when QEMU is hot-plugging GPUs, CPUs, and memory it can consume more memory. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-04 12:20:15 -04:00
Steve Horsman	1c718dbcdd	Merge pull request #11506 from stevenhorsman/remove-atty-dependency Remove atty dependency	2025-07-04 10:46:28 +01:00
Alex Lyn	362ea54763	Merge pull request #11517 from zvonkok/fix-nvrc-build gpu: NVRC static build	2025-07-04 13:51:03 +08:00
Alex Lyn	2e35a8067d	Merge pull request #11482 from Apokleos/fix-force-guestpull runtime-rs: refactor and fix the implementation of guest-pull	2025-07-04 11:29:33 +08:00
stevenhorsman	6f23608e96	ci: Remove atty group atty is unmaintained, with the last release almost 3 years ago, so we don't need to check for updates, but instead will remove it from our dependency tree. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-04 09:43:34 +08:00
stevenhorsman	7ffbdf7b3a	mem-agent: Remove structopts crate structopt features were integrated into clap v3 and so is not actively updated and pulls in the atty crate which has a security advisory, so update clap, remove structopts, update the code that used it to remove the outdated dependencies. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-04 09:43:34 +08:00
stevenhorsman	7845129bdc	versions: Bump slog-term to 2.9.1 slog-term 2.9.0 included atty, which is unmaintained and has a security advisory GHSA-g98v-hv3f-hcfr, so bump the version across our components to remove this dependency. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-04 09:43:34 +08:00
Aurélien Bombo	fe532f9d04	Merge pull request #11475 from kata-containers/sprt/zizmor-fixes security: ci: Fixes for Zizmor GHA security scanning	2025-07-03 13:29:47 -05:00
Zvonko Kaiser	c3b2d69452	gpu: NVRC static build We had the proper config.toml configuration for static builds but were building the glibc target and not the musl target. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-03 15:31:00 +00:00
Aurélien Bombo	8723eedad2	gha: Remove path restriction for Zizmor workflow The way GH works, we can only require Zizmor results on ALL PR runs, or none, so remove the path filter. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-07-03 08:18:34 -05:00
Alex Lyn	c857f59a1a	Merge pull request #11510 from lifupan/sync_resize_vcpu runtime-rs: make the resize_vcpu api support sync	2025-07-03 17:35:08 +08:00
alex.lyn	2b95facc6f	kata-type: Relax Mandatory source Field Check in Guest-Pull Mode Previously, the source field was subject to mandatory checks. However, in guest-pull mode, this field doesn't consistently provide useful information. Our practical experience has shown that relying on this field for critical data isn't always necessary. In addition, not all cases need a mandatory check for KataVirtualVolume. Based on this, it is better to make from_base64 do only one thing and remove the validate() call. We also keep the previous capability for cases that rely on it, under the clearer name from_base64_and_validate. This commit relaxes the mandatory checks on the KataVirtualVolume specifically for guest-pull mode, acknowledging its diminished utility in this context. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-03 17:07:20 +08:00
alex.lyn	8f8b196705	runtime-rs: refactor merging metadata within image_pull refactor implementation for merging metadata. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-03 17:07:08 +08:00
Fupan Li	fb1c35335a	runtime-rs: make the resize_vcpu sync When hot plugging vcpu in dragonball hypervisor, use the synchronization interface and wait until the hot plug cpu is executed in the guest before returning. This ensures that the subsequent device hot plug will not conflict with the previous call. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-07-03 15:11:36 +08:00
Fupan Li	72a38457f0	dragonball: make the resize_vcpu api support sync Let dragonball's resize_vcpu api support synchronization, and only return after the hot-plug of the CPU is successfully executed in the guest kernel. This ensures that the subsequent device hot-plug operation can also proceed smoothly. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-07-03 15:11:36 +08:00
Alex Lyn	210844ce6b	Merge pull request #11509 from teawater/agent_test kata-agent: mount.rs: Fix warning of test	2025-07-03 15:05:04 +08:00
Alex Lyn	95d513b379	Merge pull request #11423 from zhaodiaoer/test test: fix broken testing code in libs	2025-07-03 11:15:39 +08:00
teawater	0347698c59	kata-agent: mount.rs: Fix warning of test Got the following warning with make test of kata-agent: Compiling rustjail v0.1.0 (/data/teawater/kata-containers/src/agent/rustjail) Compiling kata-agent v0.1.0 (/data/teawater/kata-containers/src/agent) warning: unused import: `std::os::unix::fs` --> rustjail/src/mount.rs:1147:9 \| 1147 \| use std::os::unix::fs; \| ^^^^^^^^^^^^^^^^^ \| = note: `#[warn(unused_imports)]` on by default This commit fixes it. Fixes: #11508 Signed-off-by: teawater <zhuhui@kylinos.cn>	2025-07-03 10:01:19 +08:00
alex.lyn	7a59d7f937	runtime-rs: Import the public const value from libs Introduce a const value `KATA_VIRTUAL_VOLUME_PREFIX` defined in libs/kata-types; it is better to import this const value from there. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-03 09:42:17 +08:00
Aurélien Bombo	8d86bcea4b	Merge pull request #11499 from kata-containers/sprt/fix-commit-check gha: Eliminate use of force-skip-ci label	2025-07-02 10:53:55 -05:00
Aurélien Bombo	8d7d859e30	gha: Eliminate use of force-skip-ci label This was originally implemented as a Jenkins skip and is only used in a few workflows. Nowadays this would be better implemented via the gatekeeper. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-07-02 10:29:50 -05:00
Saul Paredes	e7b9eddced	Merge pull request #11248 from microsoft/archana1/storages genpolicy: add validation for storages	2025-07-01 10:02:10 -07:00
Fabiano Fidêncio	07b41c88de	Merge pull request #11490 from Apokleos/fix-noise runtime-rs: Fix noise with frequently appearing in unstaged changes	2025-07-01 17:43:41 +02:00
Archana Choudhary	6932beb01f	policy: fix parse errors in rules.rego This patch fixes the rules.rego file to ensure that the policy is correctly parsed and applied by opa. Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 12:43:41 +00:00
Archana Choudhary	abbe1be69f	tests: enable confidential_guest setting for coco This commit updates the `tests_common.sh` script to enable the `confidential_guest` setting for the coco tests in the Kubernetes integration tests. Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Archana Choudhary	9dd365fdb5	genpolicy: fix mount source check in rules.rego This commit fixes the mount source check in rules.rego. Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Archana Choudhary	1cbea890f1	genpolicy: tests: update testcases for execprocess This patch removes storages from the testcases.json file for execprocess. This is because input storage objects are invalid for two reasons: 1. "io.katacontainers.fs-opt.layer=" is missing option in annotations. 2. by default, we don't have host-tarfs-dm-verity enabled, so the storage objects are not created in policy. Signed-off-by: Archana Choudhary <archana1@microsoft.com> ---	2025-07-01 10:35:20 +00:00
Archana Choudhary	6adec0737c	genpolicy: add rules for image_guest_pull storage This patch introduces some basic checks for the `image_guest_pull` storage type in the genpolicy tool. Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Archana Choudhary	bd2dc1422e	genpolicy: add test for container images having volumes This patch adds a test case to genpolicy for container images that have volumes. Examples of such container images include: - quay.io/opstree/redis - https://github.com/kubernetes/examples/blob/master/cassandra/image/Dockerfile Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Archana Choudhary	d7f998fbd5	genpolicy: tests: update test for emptydir volumes This patch - updates testcases.json for emptydir volumes/storages Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Archana Choudhary	68c8c31718	genpolicy: tests: add test for config_map volumes This patch adds test for config_map volumes. Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Archana Choudhary	9ebbc08d70	genpolicy: enable storage checks This patch - adds condition to add container image layers as storages - enable storage checks - fix CI policy test cases - update genpolicy-settings.json to enable storage checks - remove storage object addition in container image parsing Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Archana Choudhary	5b1459e623	genpolicy: test framework: enable config map usage This patch improves the test framework for the genpolicy tool by enabling the use of config maps. Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Alex Lyn	8784cebb84	Merge pull request #10693 from Apokleos/guest-pullimage-timeout runtime-rs: support setting create_container timeout with request_timeout_ms for image pulling in guest	2025-07-01 11:40:19 +08:00
alex.lyn	b7c1d04a47	runtime-rs: Fix noise with frequently appearing in unstaged changes Fixes #11489 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-01 10:19:02 +08:00
alex.lyn	9839c17cad	build: add Makefile variable for create_container_timeout Add the definition of variable DEFCREATECONTAINERTIMEOUT into Makefile target with default timeout 30s. Fixes: #485 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-30 20:04:56 +08:00
alex.lyn	1a06bd1f08	kata-types: Introduce annotation *_RUNTIME_CREATE_CONTAINTER_TIMEOUT It's used to indicate timeout value set for image pulling in guest during creating container. This allows users to set this timeout with annotation according to the size of image to be pulled. Fixes #10692 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-30 20:04:56 +08:00
alex.lyn	f886e82f03	runtime-rs: support setting create_container_timeout It allows users to set this create container timeout within configuration.toml according to the size of image to be pulled inside guest. Fixes #10692 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-30 20:04:56 +08:00
alex.lyn	ce524a3958	kata-types: Give a more comprehensive definition of request_timeout_ms To better understand the impact of different timeout values on system behavior, this section provides a more comprehensive explanation of the request_timeout_ms: This timeout value is used to set the maximum duration for the agent to process a CreateContainerRequest. It's also used to ensure that workloads, especially those involving large image pulls within the guest, have sufficient time to complete. Based on the explanation above, it's renamed to `create_container_timeout` and, specifically, exposed in 'configuration.toml'. Fixes #10692 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-30 20:04:56 +08:00
Steve Horsman	f04bb3f34c	Merge pull request #11479 from stevenhorsman/skip-weekly-coco-stability-tests workflows: Skip weekly coco stability tests	2025-06-30 09:05:14 +01:00
Fabiano Fidêncio	b024d8737c	Merge pull request #11481 from fidencio/topic/fix-passing-image-size-alignment build: Allow passing IMAGE_SIZE_ALIGNMENT_MB as an env var	2025-06-30 09:04:39 +02:00
Alex Lyn	69d2c078d1	Merge pull request #11484 from stevenhorsman/bump-nydus-snapshotter-0.15.2 version: Bump nydus-snapshotter	2025-06-30 14:44:01 +08:00
Alex Lyn	e66baf503b	Merge pull request #11474 from Apokleos/remote-annotation runtime-rs: Add GPU annotations for remote hypervisor	2025-06-30 14:05:15 +08:00
Fabiano Fidêncio	8d4e3b47b1	Merge pull request #11470 from fidencio/topic/runtime-rs-fix-odd-memory-size-calculation runtime-rs: Fix calculation of odd memory sizes	2025-06-30 07:26:30 +02:00
Champ-Goblem	91cadb7bfe	runtime-rs: Fix calculation of odd memory sizes An odd memory size leads to the runtime breaking during its startup, as shown below: ``` Warning FailedCreatePodSandBox 34s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox "708c81910f4e67e53b4170b6615083339b220154cb9a0c521b3232cdb40d50f9": failed to create containerd task: failed to create shim task: Others("failed to handle message start sandbox in task handler\n\nCaused by:\n 0: start vm\n 1: set vm base config\n 2: set vm configuration\n 3: Failed to set vm configuration VmConfigInfo { vcpu_count: 2, max_vcpu_count: 16, cpu_pm: \"on\", cpu_topology: CpuTopology { threads_per_core: 1, cores_per_die: 1, dies_per_socket: 1, sockets: 1 }, vpmu_feature: 0, mem_type: \"shmem\", mem_file_path: \"\", mem_size_mib: 4513, serial_path: Some(\"/run/kata/708c81910f4e67e53b4170b6615083339b220154cb9a0c521b3232cdb40d50f9/console.sock\"), pci_hotplug_enabled: true }\n 4: vmm action error: MachineConfig(InvalidMemorySize(4513))\n\nStack backtrace:\n 0: anyhow::error::<impl anyhow::Error>::msg\n 1: hypervisor::dragonball::vmm_instance::VmmInstance::handle_request\n 2: hypervisor::dragonball::vmm_instance::VmmInstance::set_vm_configuration\n 3: hypervisor::dragonball::inner::DragonballInner::set_vm_base_config\n 4: <hypervisor::dragonball::Dragonball as hypervisor::Hypervisor>::start_vm::{{closure}}::{{closure}}\n 5: <hypervisor::dragonball::Dragonball as hypervisor::Hypervisor>::start_vm::{{closure}}\n 6: <virt_container::sandbox::VirtSandbox as common::sandbox::Sandbox>::start::{{closure}}::{{closure}}\n 7: <virt_container::sandbox::VirtSandbox as common::sandbox::Sandbox>::start::{{closure}}\n 8: runtimes::manager::RuntimeHandlerManager::handler_task_message::{{closure}}::{{closure}}\n 9: runtimes::manager::RuntimeHandlerManager::handler_task_message::{{closure}}\n 10: <service::task_service::TaskService as 
containerd_shim_protos::shim::shim_ttrpc_async::Task>::create::{{closure}}\n 11: <containerd_shim_protos::shim::shim_ttrpc_async::CreateMethod as ttrpc::asynchronous::utils::MethodHandler>::handler::{{closure}}\n 12: <tokio::time::timeout::Timeout<T> as core::future::future::Future>::poll\n 13: ttrpc::asynchronous::server::HandlerContext::handle_msg::{{closure}}\n 14: <core::future::poll_fn::PollFn<F> as core::future::future::Future>::poll\n 15: <ttrpc::asynchronous::server::ServerReader as ttrpc::asynchronous::connection::ReaderDelegate>::handle_msg::{{closure}}::{{closure}}\n 16: tokio::runtime::task::core::Core<T,S>::poll\n 17: tokio::runtime::task::harness::Harness<T,S>::poll\n 18: tokio::runtime::scheduler::multi_thread::worker::Context::run_task\n 19: tokio::runtime::scheduler::multi_thread::worker::Context::run\n 20: tokio::runtime::context::runtime::enter_runtime\n 21: tokio::runtime::scheduler::multi_thread::worker::run\n 22: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll\n 23: tokio::runtime::task::core::Core<T,S>::poll\n 24: tokio::runtime::task::harness::Harness<T,S>::poll\n 25: tokio::runtime::blocking::pool::Inner::run\n 26: std::sys::backtrace::__rust_begin_short_backtrace\n 27: core::ops::function::FnOnce::call_once{{vtable.shim}}\n 28: std::sys::pal::unix::thread::Thread::new::thread_start") ``` As we cannot control what the users will set, let's just round it up to the next acceptable value. Signed-off-by: Champ-Goblem <cameron@northflank.com> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-06-28 14:29:18 +02:00
Fabiano Fidêncio	e2b93fff3f	build: Allow passing IMAGE_SIZE_ALIGNMENT_MB as an env var This helps considerably to avoid patching the code, and just adjusting the build environment to use a smaller alignment than the default one. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-06-28 00:05:20 +02:00
stevenhorsman	fe5d43b4bd	workflows: Skip weekly coco stability tests These tests are not passing, or being maintained, so as discussed on the AC meeting, we will skip them from automatically running until they can be reviewed and re-worked, to avoid wasting CI cycles. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-27 16:51:53 +01:00
stevenhorsman	61b12d4e1b	version: Bump nydus-snapshotter Bump to version v0.15.2 to pick up fix to mount source in https://github.com/containerd/nydus-snapshotter/pull/636 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-27 14:04:00 +01:00
RuoqingHe	a43e06e0eb	Merge pull request #11461 from stevenhorsman/bump-guest-components-4cd62c3 versions: Bump guest-components	2025-06-27 10:45:06 +08:00
Aurélien Bombo	d94085916e	ci: set Zizmor as required test This adds Zizmor GHA security scanning as a PR gate. Note that this does NOT require that Zizmor returns 0 alerts, but rather that Zizmor's invocation completes successfully (regardless of how many alerts it raises). I will set up the former after this commit is merged (through the GH UI). Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-26 12:36:41 -05:00
Aurélien Bombo	820c1389db	security: ci: remove overly broad permission This removes the permission from the workflow since it's already present at the job level. https://github.com/kata-containers/kata-containers/security/code-scanning/111 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-26 12:29:23 -05:00
Aurélien Bombo	bb2a427a8a	security: ci: fix template injection This fixes a Zizmor error where some variables are vulnerable to template injection. https://github.com/kata-containers/kata-containers/security/code-scanning/67 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-26 12:29:11 -05:00
Saul Paredes	8c57beb943	Merge pull request #11471 from microsoft/saulparedes/fix_kata_monitor_dockerfile tools: kata-monitor: update go version used to build in Dockerfile	2025-06-26 08:37:08 -07:00
Chao Wu	ac928218f3	Merge pull request #11434 from hsiangkao/erofs runtime: improve EROFS snapshotter support	2025-06-26 22:40:48 +08:00
Cameron McDermott	b6cd6e6914	Merge pull request #11469 from fidencio/topic/dragonball-set-default_maxvcpus-to-zero runtime-rs: Set default_maxvcpus to 0	2025-06-26 15:20:21 +01:00
Aurélien Bombo	a1aa3e79d4	Merge pull request #11392 from kata-containers/sprt/zizmor ci: Run zizmor for GHA security analysis	2025-06-26 08:55:22 -05:00
Fupan Li	1ff54a95d2	Merge pull request #11422 from lifupan/memory_hotplug runtime-rs: Add the memory and vcpu hotplug for cloud-hypervisor	2025-06-26 17:56:49 +08:00
Aurélien Bombo	34c8cd810d	ci: Run zizmor for GHA security analysis This runs the zizmor security lint [1] on our GH Actions. The initial workflow uses [2] as a base. [1] https://docs.zizmor.sh/ [2] https://docs.zizmor.sh/usage/#use-in-github-actions Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-26 10:52:28 +01:00
alex.lyn	e6e4cd91b8	runtime-rs: Enable GPU annotations in remote hypervisor configuration Enable GPU annotations by adding `default_gpus` and `default_gpu_model` into the list of valid annotations `enable_annotations`. Fixes #10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-26 17:29:36 +08:00
alex.lyn	e5f44fae30	runtime-rs: Add GPU annotations during remote hypervisor preparation Add GPU specific annotations used by remote hypervisor for instance selection during `prepare_vm`. Fixes #10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-26 17:27:41 +08:00
alex.lyn	866d3facba	kata-types: Introduce two GPU annotations for remote hypervisor Two GPU annotations, `default_gpus` and `default_gpu_model`, are introduced for Kata VM configurations to improve instance selection on remote hypervisors. By adding these annotations: (1) `default_gpus`: Allows users to specify the minimum number of GPUs a VM requires. This ensures that the remote hypervisor selects an instance with at least that many GPUs, preventing resource under-provisioning. (2) `default_gpu_model`: Lets users define the specific GPU model needed for the VM. This is crucial for workloads that depend on particular GPU archs or features, ensuring compatibility and optimal performance. Fixes #10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-26 17:27:41 +08:00
alex.lyn	ed0c0b2367	kata-types: Introduce GPU related fields in RemoteInfo To provide the remote hypervisor with the necessary intelligence to select the most appropriate instance for a given GPU instance, leading to better resource allocation, two fields `default_gpus` and `default_gpu_model` are introduced in `RemoteInfo`. Fixes #10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-26 17:27:28 +08:00
Alex Lyn	9a1d4fc5d6	Merge pull request #11468 from Apokleos/fix-sharefs-none runtime-rs: Support shared fs with "none" on non-tee platforms	2025-06-26 15:37:44 +08:00
Gao Xiang	9079c8e598	runtime: improve EROFS snapshotter support To better support containerd 2.1 and later versions, remove the hardcoded `layer.erofs` and instead parse `/proc/mounts` to obtain the real mount source (and `/sys/block/loopX/loop/backing_file` if needed). If the mount source doesn't end with `layer.erofs`, it should be marked as unsupported, as it may be a filesystem meta file generated by later containerd versions for the EROFS flattened filesystem feature. Also check whether the filesystem type is `overlay` or not, since the containerd mount manager [1] may change it after being introduced. [1] https://github.com/containerd/containerd/issues/11303 Fixes: `f63ec50ba3` ("runtime: Add EROFS snapshotter with block device support") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2025-06-26 10:12:12 +08:00
Saul Paredes	d53c720ac1	tools: kata-monitor: update go version used to build in Dockerfile Current Dockerfile fails when trying to build from the root of the repo docker build -t kata-monitor -f tools/packaging/kata-monitor/Dockerfile . with "invalid go version '1.23.0': must match format 1.23" Using go 1.23 in the Dockerfile fixes the build error Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-06-25 15:32:41 -07:00
stevenhorsman	290fda9b97	agent-ctl: Bump image-rs version I noticed that agent-ctl is including a 9 month old version of image-rs and the libs crates haven't been updated for potentially many years, so bump all of these. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-25 16:30:58 +01:00
stevenhorsman	c7da62dd1e	versions: Bump guest-components Bump to pick up the new guest-components and matching trustee which use rust 1.85.1 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-25 15:05:07 +01:00
Fabiano Fidêncio	bebe377f0d	runtime-rs: Set default_maxvcpus to 0 Otherwise we just cannot start a container that requests more than 1 vcpu. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-06-25 14:36:46 +02:00
Steve Horsman	9ff30c6aeb	Merge pull request #11462 from kata-containers/add-scorecard-action ci: Add scorecard action	2025-06-25 12:48:11 +01:00
Fabiano Fidêncio	69c706b570	Merge pull request #11441 from stevenhorsman/protobuf-3.7.2-bump versions: Bump protobuf to 3.7.2	2025-06-25 13:47:28 +02:00
alex.lyn	eae62ca9ac	runtime-rs: Support shared fs with "none" on non-tee platforms This commit introduces the ability to run Pods without shared fs mechanism in Kata. The default shared fs can lead to unnecessary resource consumption and security risks for certain use cases. Specifically, scenarios where files only need to be copied into the VM once at Pod creation (e.g., non-tee envs) and don't require dynamic updates make the shared fs redundant and inefficient. By explicitly disabling shared fs functionality, we reduce resource overhead and shrink the attack surface. Users will need to employ alternative methods (e.g. guest-pull) to ensure container images are shared into the guest VM for these specific scenarios. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-25 17:36:57 +08:00
Fabiano Fidêncio	4719c08184	Merge pull request #11467 from lifupan/fixblockfile runtime-rs: fix the issue return the wrong volume	2025-06-25 09:56:28 +02:00
Fupan Li	48c8e0f296	runtime-rs: fix the issue return the wrong volume The previous commit, 74eccc54e7b31cc4c9abd8b6e4007c3a4c1d4dd4, failed to return the right rootfs volume. In the is_block_rootfs fn, if the rootfs is based on a block device such as devicemapper, it should clear the volume's source and let the device_manager use the dev_id to get the device's host path. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-25 10:02:52 +08:00
Alex Lyn	648fef4f52	Merge pull request #11466 from lifupan/blockfile runtime-rs: add the blockfile based rootfs support	2025-06-25 09:46:54 +08:00
Dan Mihai	2d43b3f9fc	Merge pull request #11424 from katexochen/p/regorus-oras-cache ci/static-checks: use oras cache for regorus	2025-06-24 14:49:00 -07:00
Fupan Li	74eccc54e7	runtime-rs: add the blockfile based rootfs support containerd's Blockfile Snapshotter passes a rootfs mount with a raw file as the mount source and mount options with "loop" embedded. To support this type of rootfs, it is necessary to identify it as a blockfile rootfs through the "loop" flag, and then use the volume source of the rootfs as the source of the block device to hot-insert it into the guest. Fixes:#11464 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 22:31:54 +08:00
Paul Meyer	43739cefdf	ci/static-checks: use oras cache for regorus Instead of building it every time, we can store the regorus binary in OCI registry using oras and download it from there. This reduces the install time from ~1m40s to ~15s. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-24 13:14:18 +02:00
Fupan Li	9bdbd82690	Merge pull request #11181 from Apokleos/initdata-runtime-rs runtime-rs: Implement Initdata Spec Support in runtime-rs for CoCo	2025-06-24 18:59:34 +08:00
Fupan Li	1c59516d72	runtime-rs: add support resize_vcpu for cloud-hypervisor This commit adds support for resize_vcpu for cloud-hypervisor using its vm resize api. It supports both vcpu hotplug and hot unplug. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
Fupan Li	a3671b7a5c	runtime-rs: Add the memory hotplug for cloud-hypervisor For cloud-hypervisor, currently only hot plugging of memory is supported, but hot unplugging of memory is not supported. In addition, by default, cloud-hypervisor uses ACPI-based memory hot-plugging instead of virtio-mem based memory hot-plugging. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
Fupan Li	7df29605a4	runtime-rs: add the vm resize and get vminfo api for clh Add API interfaces for get vminfo and resize. get vminfo can obtain the memory size and number of vCPUs from the cloud hypervisor vmm in real time. This interface provides information for the subsequent resize memory and vCPU. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
Fupan Li	9a51ade4e2	runtime-rs: impl the Deserialize trait for MacAddr The system's own Deserialize cannot implement parsing from string to MacAddr, so we need to implement this trait ourselves. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
Fupan Li	ceaae3049c	runtime-rs: move the bytes_to_megs and megs_to_bytes to utils Since those two functions would be used by other hypervisors, move them into the utils crate. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
alex.lyn	871465f5d3	kata-agent: Allow unrecognized fields in InitData For flexibility and extensibility, this change modifies the Kata Agent's handling of `InitData` to allow for unrecognized key-value pairs. The `InitData` field now directly utilizes `HashMap<String, String>`, enabling it to carry arbitrary metadata and information that may be consumed by other components. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	afcb042c28	runtime-rs: Specify the initdata to mrconfigid correctly During sandbox preparation, initdata should be specified to TdxConfig, specially mrconfigid, which is used to pass to tdx guest report for measurement. Fixes #11180 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	d6d8497b56	runtime-rs: Add host-data property to sev-snp-guest object SEV-SNP guest configuration utilizes a different set of properties compared to the existing 'sev-guest' object. This change introduces the `host-data` property within the sev-snp-guest object. This property allows for configuring an SEV-SNP guest with host-provided data, which is crucial for data integrity verification during attestation. The `host-data` property is specifically valid for SEV-SNP guests running on a capable platform. It is configured as a base64-encoded string when using the sev-snp-guest object. the example cmdline looks like: ```shell -object sev-snp-guest,id=sev-snp0,host-data=CGNkCHoBC5CcdGXir... ``` Fixes #11180 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	4a4361393c	runtime-rs: Introduce host-data in SevSnpConfig for validation To facilitate the transfer of initdata generated during `prepare_initdata_device_config`, a new parameter has been introduced into the `prepare_protection_device_config` function. Furthermore, to specifically pass initdata to SEV-SNP Guests, a `host_data` field has been added to the `SevSnpConfig` structure. However, this field is exclusively applicable to the SEV-SNP platform. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	5c8170dbb9	runtime-rs: Handle initdata block device config during sandbox start Retrieve the Initdata string content from the security_info of the Configuration. Based on the Protection Platform type, calculate the digest of the Initdata. Write the Initdata content to the block device. Subsequently, construct the BlockConfig based on this block device information. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	6ea1494701	runtime-rs: Add InitData Resource type for block device management To correctly manage initdata as a block device, a new InitData Resource type, inherently a block device, has been introduced within the ResourceManager. As a component of the Sandbox's resources, this InitData Resource needs to be appropriately handled by the Device Manager's handler. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	8c1482a221	runtime-rs: Introduce coco_data dir and initdata block Implement resource storage infrastructure with initial initdata support: 1. Create dedicated `coco_data` directory for: - Centralized management of CoCo resources; - Future expansion of CoCo artifacts; 2. Atomic initdata block as foundational component in `coco_data`, it will implement creation of compressed initdata blocks with: - Gzip compression with level customization (0-9) - Sector-aligned (512B) image format with magic header - Adaptive buffering (4KB-128KB) based on payload size - Temp-file atomic writes with 0o600 permissions Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	9b21d062c9	kata-types: Implement InitData retrieval from Pod annotation This commit implements the retrieval and processing of InitData provided via a Pod annotation. Specifically, it enables runtime-rs to: (1) Parse the "io.katacontainers.config.hypervisor.cc_init_data" annotation from the Pod YAML. (2) Perform reverse operations on the annotation value: base64 decoding followed by gzip decompression. (3) Deserialize the decompressed data into the internal InitData structure. (4) Serialize the resulting InitData into a string and store it in the Configuration. This allows users to inject configuration data into the TEE Guest by encoding and compressing it and passing it as an annotation in the Pod configuration. This mechanism supports scenarios where dynamic config is required for Confidential Containers. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	4ca394f4fc	kata-types: Implement Initdata Spec and Digest Calculation Logic This commit introduces the Initdata Spec and the logic for calculating its digest. It includes: (1) Define a `ProtectedPlatform` enum to represent major TEE platform types. (2) Create an `InitData` struct to support building and serializing initialization data in TOML format. (3) Implement adaptation for SHA-256, SHA-384, and SHA-512 digest algorithms. (4) Provide a platform-specific mechanism for adjusting digest lengths (zero-padding). (5) Supporting the decoding and verification of base64+gzip encoded Initdata. The core functionality ensures the integrity of data injected by the host through trusted algorithms, while also accommodating the measurement requirements of different TEE platforms. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	2603ee66b8	kata-types: Introduce initdata to SecurityInfo for data injection This commit introduces a new `initdata` field of type String to hypervisor `SecurityInfo`. In accordance with the Initdata Specification, this field will facilitate the injection of well-defined data from an untrusted host into the TEE. To ensure the integrity of this injected data, the TEE evidence's hostdata capability or the (v)TPM dynamic measurement capability will be leveraged, as outlined in the specification. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
Dan Mihai	89dcc8fb27	Merge pull request #11444 from microsoft/danmihai1/k8s-policy-rc tests: k8s-policy-rc: print pod descriptions	2025-06-23 16:14:56 -07:00
Dan Mihai	0a57e09259	Merge pull request #11426 from charludo/fix/genpolicy-corruption-of-layer-cache-file genpolicy: prevent corruption of the layer cache file	2025-06-23 14:00:45 -07:00
Dan Mihai	8aecf14b34	Merge pull request #11405 from kata-containers/dependabot/cargo/src/agent/clap-77d1155c52 build(deps): bump the clap group across 6 directories with 1 update	2025-06-23 13:05:59 -07:00
Dan Mihai	62c9845623	tests: k8s-policy-rc: print pod descriptions Don't use local launched_pods variable in test_rc_policy(), because teardown() needs to use this variable to print a description of the pods, for debugging purposes. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-06-23 16:23:26 +00:00
stevenhorsman	649e31340b	doc: Add scorecard badge Add our scorecard badge to our readme for transparency and to help motivate us to update our score Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-23 16:22:59 +01:00
stevenhorsman	6dd025d0ed	workflows: Add scorecard workflow Add a workflow to update our scorecard score on each change Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-23 16:09:14 +01:00
Steve Horsman	4f245df4a0	Merge pull request #11420 from kata-containers/pin-gha-actions workflows: Pin action hashes	2025-06-23 15:26:03 +01:00
charludo	4e57cc0ed2	genpolicy: keep layers cache in-memory to prevent corruption The locking mechanism around the layers cache file was insufficient to prevent corruption of the file. This commit moves the layers cache's management in-memory, only reading the cache file once at the beginning of `genpolicy`, and only writing to it once, at the end of `genpolicy`. In the case that obtaining a lock on the cache file fails, reading/writing to it is skipped, and the cache is not used/persisted. Signed-off-by: charludo <git@charlotteharludo.com>	2025-06-23 16:16:42 +02:00
RuoqingHe	8c1f6e827d	Merge pull request #11448 from RuoqingHe/remove-dup-ignore ci: Remove duplicated `rust-vmm` dependencies	2025-06-23 10:34:30 +08:00
Ruoqing He	1d2d2cc3d5	ci: Remove duplicated `rust-vmm` dependencies `vmm-sys-util` was duplicated while updating the `ignore` list of `rust-vmm` crates in #11431, remove duplicated one and sort the list. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-21 21:02:59 +00:00
stevenhorsman	9685e2aeca	trace-forwarder: Replace removed clap functions When moving from clap v2 to v4 a bunch of functions have been removed, so update the code to handle these replacements Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-21 17:15:12 +01:00
stevenhorsman	e204847df5	agent-ctl: Replace removed clap functions When moving from clap v2 to v4 a bunch of functions have been removed, so update the code to handle these replacements Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-21 17:15:12 +01:00
stevenhorsman	e11fc3334e	agent: Clap v4 updates AppSettings was removed, so refactor based on new documentation Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-21 17:15:12 +01:00
dependabot[bot]	0aa80313eb	build(deps): bump the clap group across 6 directories with 1 update Bumps the clap group with 1 update in the /src/agent directory: [clap](https://github.com/clap-rs/clap). Bumps the clap group with 1 update in the /src/tools/agent-ctl directory: [clap](https://github.com/clap-rs/clap). Bumps the clap group with 1 update in the /src/tools/genpolicy directory: [clap](https://github.com/clap-rs/clap). Bumps the clap group with 1 update in the /src/tools/kata-ctl directory: [clap](https://github.com/clap-rs/clap). Bumps the clap group with 1 update in the /src/tools/runk directory: [clap](https://github.com/clap-rs/clap). Bumps the clap group with 1 update in the /src/tools/trace-forwarder directory: [clap](https://github.com/clap-rs/clap). Updates `clap` from 3.2.25 to 4.5.37 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 2.34.0 to 4.5.40 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 4.1.8 to 4.5.40 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 4.4.10 to 4.5.13 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 3.2.25 to 4.5.40 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) --- updated-dependencies: - dependency-name: clap dependency-version: 4.5.37 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: clap - dependency-name: clap dependency-version: 4.5.13 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: clap ... Signed-off-by: dependabot[bot] <support@github.com>	2025-06-21 17:15:12 +01:00
RuoqingHe	b22135f4e5	Merge pull request #11431 from RuoqingHe/udpate-rust-vmm-ignore-list ci: Update dependabot ignore list	2025-06-21 18:20:41 +08:00
Ruoqing He	6628ba3208	ci: Update dependabot ignore list Update dependabot ignore list in cargo ecosystem to ignore upgrades from rust-vmm crates, since those crates need to be managed carefully and manually. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-21 08:18:20 +01:00
stevenhorsman	9d3b9fb438	workflows: Pin action hashes Pin GitHub-owned actions to specific hashes, as recommended, because tags are mutable; see https://pin-gh-actions.kammel.dev/. This is one of the recommendations that scorecard gives us. Note this was generated with `frizbee actions`. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-21 08:14:13 +01:00
Steve Horsman	4bfa74c2a5	Merge pull request #11331 from stevenhorsman/helm-ghcr-login-update workflow: Remove code injection in helm login	2025-06-21 08:13:40 +01:00
Steve Horsman	353b4bc853	Merge pull request #11440 from stevenhorsman/osbuilder-fedora-42-update osbuilder: Update image-builder base to f42	2025-06-21 08:11:12 +01:00
Steve Horsman	cac1cb75ce	Merge pull request #11378 from kata-containers/dependabot/cargo/src/tools/agent-ctl/rustix-0.37.28 build(deps): bump rustix in various components	2025-06-21 08:05:21 +01:00
stevenhorsman	900d9be55e	build(deps): bump rustix in various components Bumps of rustix 0.36, 0.37 and 0.38 to resolve CVE-2024-43806 Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-20 14:52:43 -05:00
stevenhorsman	d9defd5102	osbuilder: Update image-builder base to f42 Fedora 40 is EoL, and I've seen the registry pull fail a few times recently, so let's bump to fedora 42 which has 10 months of support left. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-20 20:52:30 +01:00
stevenhorsman	0f1c326ca0	versions: Bump protobuf to 3.7.2 Now we are decoupled from the image-rs crate, we can bump the protobuf version across our project to resolve the GHSA-2gh3-rmm4-6rq5 advisory Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-20 20:52:04 +01:00
Saul Paredes	cc27966aa1	Merge pull request #11443 from microsoft/saulparedes/update_image tests: update container image for ci and unit test	2025-06-20 12:50:42 -07:00
Archana Choudhary	e093919b42	tests: update container image for ci and unit test This patch updates the container image for the CI test workloads: - `k8s-layered-sc-deployment.yaml` - `k8s-pod-sc-deployment.yaml` - `k8s-pod-sc-nobodyupdate-deployment.yaml` - `k8s-pod-sc-supplementalgroups-deployment.yaml` - `k8s-policy-deployment.yaml` Also updates unit tests: - `test_create_container_security_context` - `test_create_container_security_context_supplemental_groups` This fixes tests failing due to an image pull error as the previous image is no longer available in the container registry. Signed-off-by: Archana Choudhary <archana1@microsoft.com> Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-06-20 10:46:56 -07:00
stevenhorsman	776c89453c	workflow: Remove code injection in helm login In theory `github.actor` could be used for code injection, so swap it out. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-20 16:27:52 +01:00
Fabiano Fidêncio	6722ea2fd9	Merge pull request #11439 from stevenhorsman/multi-arch-manifest-permissions-fix release: Add more permissions	2025-06-19 12:45:37 +02:00
stevenhorsman	8da75bf55d	release: Add more permissions Add package: write to the multi-arch manifest upload to ghcr.io Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-19 11:04:29 +01:00
Fabiano Fidêncio	d0c1ce1367	Merge pull request #11438 from stevenhorsman/helm-upload-fix release: Fix helm push typo	2025-06-19 12:01:04 +02:00
stevenhorsman	eaf42b3e0f	release: Fix helm push typo Switch the hyphen for an underscore, so the ghcr helm publish can work properly. Co-authored-by: Fabiano Fidêncio <fidencio@northflank.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-19 10:56:50 +01:00
Fabiano Fidêncio	f7d3ea0c55	Merge pull request #11437 from kata-containers/release-flow-permissions-fixes-iii workflows: Release permissions	2025-06-19 11:23:46 +02:00
stevenhorsman	19597b8950	workflows: Release permissions Add more permissions to the release workflow in order to enable `gh release` commands to run Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-19 10:05:23 +01:00
Fabiano Fidêncio	254ada2f6a	Merge pull request #11436 from kata-containers/release-flow-permission-fix-ii workflows: Add extra permissions	2025-06-19 10:45:26 +02:00
stevenhorsman	7c6c6f3c15	workflows: Add extra permissions Add permissions to the ppc release Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-19 09:39:01 +01:00
Steve Horsman	00c9e61b60	Merge pull request #11435 from kata-containers/release-flow-permissions-fix(es) workflows: Fix permissions	2025-06-19 09:35:23 +01:00
stevenhorsman	9adf989555	workflows: Fix permissions Add extra permissions for reusable workflow calls that need them later on Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-19 08:44:18 +01:00
Fabiano Fidêncio	e82de65d5d	Merge pull request #11425 from stevenhorsman/release-3.18.0-bump release: Bump version to 3.18.0	2025-06-18 21:39:51 +02:00
stevenhorsman	6fc622ef0f	release: Bump version to 3.18.0 Bump VERSION and helm-chart versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-18 19:09:42 +01:00
Steve Horsman	060faa3d1a	Merge pull request #11433 from kata-containers/cri-containerd-test-fast-fail-false workflows: Add fail-fast: false to cri-containerd tests	2025-06-18 19:08:59 +01:00
Steve Horsman	e0084a958c	Merge pull request #11432 from stevenhorsman/golang-1.23.10 versions: Bump golang to 1.23.10	2025-06-18 17:25:07 +01:00
Steve Horsman	4e3238b9dc	Merge pull request #11337 from zvonkok/fix-module-signing gpu: Fix module signing	2025-06-18 17:23:51 +01:00
Steve Horsman	547b6c5781	Merge pull request #11429 from stevenhorsman/cri-containerd-required-test-rename Cri containerd required test rename	2025-06-18 15:45:14 +01:00
Zvonko Kaiser	e2f18057a4	kernel: Add config option for signing Only sign the kernel if the user has provided KBUILD_SIGN_PIN; otherwise skip signing. While here, let's move the functionality to the common fragments, as it is not GPU-specific. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-06-18 15:32:26 +02:00
stevenhorsman	73d7b4f258	workflows: Add fail-fast: false to cri-containerd tests At the moment, if any of the tests in the matrix fails, the rest of the jobs are cancelled, so we have to re-run everything. Add `fail-fast: false` to stop this behaviour. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-18 14:20:16 +01:00
stevenhorsman	aedbaa1545	versions: Bump golang to 1.23.10 Bump golang to fix CVEs GO-2025-3751 and GO-2025-3563 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-18 11:11:32 +01:00
stevenhorsman	b20f89b775	ci: required-tests: Remove test skip Remove the rule that causes gatekeeper to skip tests if we've only updated the required-tests.yaml list. Although an update to just required-tests.yaml doesn't change the outcome of any of the CI tests, it does change whether gatekeeper will still pass with the new rules. Running the CI is a bit of a hit, but it's probably worth it to keep gatekeeper validated. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-18 10:52:03 +01:00
stevenhorsman	d68b09a4f0	ci: required-tests: cri-containerd rename Update the names of the required jobs based on the changes done in #11019 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-18 10:52:03 +01:00
Steve Horsman	0aca20986b	Merge pull request #11400 from miz060/mitchzhu/add-govulncheck ci: Add optional govulncheck security scanning to static checks	2025-06-18 10:34:56 +01:00
Steve Horsman	d754e3939b	Merge pull request #11427 from BbolroC/bump-rootfs-confidential-s390x rootfs: Bump rootfs-{image,initrd} to 24.04	2025-06-18 09:06:58 +01:00
Mitch Zhu	292c27130d	ci: Add optional govulncheck security scanning to static checks This adds govulncheck vulnerability scanning as a non-blocking check in the static checks workflow. The check scans Go runtime binaries for known vulnerabilities while filtering out verified false positives. Signed-off-by: Mitch Zhu <mitchzhu@microsoft.com>	2025-06-17 20:43:00 -07:00
Alex Lyn	b61b20eef3	Merge pull request #11394 from mythi/tdx-kata-deploy-bump kata-deploy: accept 25.04 as supported distro for TDX	2025-06-18 08:52:46 +08:00
Hyounggyu Choi	4be261f248	rootfs: Bump rootfs-{image,initrd} to 24.04 Since #11197 was merged, all confidential k8s e2e tests for s390x have been failing with the following errors: ``` attestation-agent: error while loading shared libraries: libcurl.so.4: cannot open shared object file libnghttp2.so.14: cannot open shared object file ``` In line with the update on x86_64, we need to upgrade the OS used in rootfs-{image,initrd} on s390x. This commit also bumps all 22.04 to 24.04 for all architectures. For s390x, this ensures the missing packages listed above are installed. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-06-17 22:03:26 +02:00
Steve Horsman	fd93e83a4f	Merge pull request #11019 from seungukshin/cri-containerd-tests-for-arm64 Enable cri-containerd-tests for arm64	2025-06-17 11:53:49 +01:00
Fupan Li	15b24b5be1	Merge pull request #10698 from Apokleos/kata-volume-rs runtime-rs: Support Pull Image in Guest with Kata Volume for CoCo	2025-06-17 15:00:02 +08:00
Lei Liu	71d1cdf40a	test: fix broken testing code in libs After commit `a3f973db3b` was merged, protection::GuestProtection::[Snp,Sev] became tuple variants and can no longer be used in the assert_eq macro without tuple values; otherwise errors like the following are raised: ``` assert_eq!(actual.unwrap(), GuestProtection::Snp); \| ^^^^^^^^^^^^^^^^^^^^ expected \ `GuestProtection`, found enum constructor ``` Signed-off-by: Lei Liu <liulei.pt@bytedance.com>	2025-06-17 12:38:39 +08:00
Steve Horsman	a00f39e272	Merge pull request #11419 from katexochen/p/gitignore-direnv gitignore: ignore direnv	2025-06-16 17:26:10 +01:00
Seunguk Shin	4f9b7e4d4f	ci: Enable cri-containerd-tests for arm64 This change enables cri-containerd-test for arm64. Signed-off-by: Seunguk Shin <seunguk.shin@arm.com> Reviewed-by: Nick Connolly <nick.connolly@arm.com>	2025-06-16 15:12:17 +01:00
Paul Meyer	822f54c800	ci/static-checks: add dispatch trigger This simplifies executing the workflow on a fork during testing. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-16 16:12:10 +02:00
Seunguk Shin	203e3af94b	ci: Disable run-containerd-sandboxapi containerd-sandboxapi fails with `containerd v2.0.x` and passes with `containerd v1.7.x`, regardless of kata-containers. It was also not tested with `containerd v2.0.x`, because `containerd v2.0.x` could not recognize `[plugins.cri.containerd]` in `config.toml`. Signed-off-by: Seunguk Shin <seunguk.shin@arm.com>	2025-06-16 15:02:07 +01:00
Mikko Ylinen	825b1cd233	kata-deploy: accept 25.04 as supported distro for TDX The latest Canonical TDX release supports 25.04 / Plucky as well. Without this change, users experimenting with the latest goodies in the 25.04 TDX enablement won't get Kata deployed properly. This change accepts 25.04 as a supported distro for TDX. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-06-16 13:42:08 +01:00
Xuewei Niu	9b4518f742	Merge pull request #11359 from pawelbeza/fix-logs-on-virtiofs-shutdown Fix logging on virtiofs shutdown	2025-06-16 17:06:29 +08:00
Paul Meyer	b629b11ba0	gitignore: ignore direnv This allows contributors to set up direnv without having it detected by git. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-16 11:02:00 +02:00
Steve Horsman	64c95cb996	Merge pull request #11389 from kata-containers/checkout-persist-credentials-false workflows: Set persist-credentials: false on checkout	2025-06-16 09:58:22 +01:00
alex.lyn	cebb259e51	runtime-rs: Introduce force guest pulling image Container image integrity protection is a critical practice involving a multi-layered defense mechanism. While container images inherently offer basic integrity verification through Content-Addressable Storage (CAS) (ensuring pulled content matches stored hashes), a combination of other measures is crucial for production environments. These layers include: Encrypted Transport (HTTPS/TLS) to prevent tampering during transfer; Image Signing to confirm the image originates from a trusted source; Vulnerability Scanning to ensure the image content is "healthy"; and Trusted Registries with stringent access controls. In certain scenarios, such as when container image confidentiality requirements are not stringent, and integrity is already ensured via the aforementioned mechanisms (especially CAS and HTTPS/TLS), adopting "force guest pull" can be a viable option. This implies that even when pulling images from a container registry, their integrity remains guaranteed through content hashes and other built-in mechanisms, without relying on additional host-side verification or specialized transfer methods. Since this feature is already available in runtime-go and offers synergistic benefits with guest pull, we have chosen to support force guest pull. Fixes #10690 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-16 16:49:17 +08:00
alex.lyn	2157075140	kata-types: Introduce a helper method to adjust rootfs mounts This commit introduces the `adjust_rootfs_mounts` function to manage root filesystem mounts for guest-pull scenarios. When the force guest-pull mechanism is active, this function ensures that the rootfs is exclusively configured via a dedicated `KataVirtualVolume`. It disregards any provided input mounts, instead generating a single, default `KataVirtualVolume`. This volume is then base64-encoded and set as the sole mount option for a new, singular `Mount` entry, which is returned as the only item in the `Vec<Mount>`. This change guarantees consistent and exclusive rootfs configuration when utilizing guest-pull for container images. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-16 16:49:17 +08:00
alex.lyn	c9ffbaf30d	runtime-rs: Support handling Kata Virtual Volume in handle_rootfs In CoCo scenarios, images are not pulled on the host side and such operations are disabled; that is, no files are shared between host and guest, especially for the container rootfs. We introduce Kata Virtual Volume to help handle such cases: (1) Introduce is_kata_virtual_volume to check whether the volume is a Kata virtual volume. (2) Introduce VirtualVolume handling logic in handle_rootfs for when the mount is a Kata virtual volume. Fixes #10690 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-16 16:49:17 +08:00
alex.lyn	2600fc6f43	runtime-rs: Add Spec annotation to help pass image information We need to get the relevant image ref from the OCI runtime Spec, especially from its annotations. Fixes #10690 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-16 16:49:17 +08:00
alex.lyn	d4e9369d3d	runtime-rs: Implement guest-pull rootfs via virtual volumes This commit introduces comprehensive support for rootfs mount management through Kata Virtual Volumes, specifically enabling the guest-pull mechanism. It enhances the runtime's ability to: (1) Extract image references from container annotations (CRI/CRI-O). (2) Process `KataVirtualVolume` objects, configuring them for guest-pull operations. (3) Set up the agent's storage for guest-pulled images. This functionality streamlines the process of pulling container images directly within the guest for rootfs, aligning with guest-side image management strategies. Fixes #10690 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-16 16:49:17 +08:00
Alex Lyn	a966d1be50	Merge pull request #11197 from Xynnn007/move-image-pull Move image pull abilities to CDH	2025-06-16 16:43:59 +08:00
Xynnn007	e0b4cd2dba	initrd/image: update x86_64 base to ubuntu 24.04 The Multistrap issue has been fixed in noble thus we can use the LTS. Also, this will fix the error reported by CDH ``` /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found ``` Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 13:54:15 +08:00
Xynnn007	0b3a8c0355	initdata: delete coco_as token section in initdata The new version of AA allows the config to omit the coco_as token section. If it is not provided, it is treated as None. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 13:54:15 +08:00
Xynnn007	5bab460224	chore(deps): update guest-components This patch updates guest-components to a new version with better error logging for CDH. It also allows the AA config to omit the coco_as token section. In addition, the new version of CDH requires building aws-lc-sys and therefore needs cmake installed for the build. See https://github.com/kata-containers/kata-containers/actions/runs/15327923347/job/43127108813?pr=11197#step:6:1609 for details. Besides, the new version of guest-components has some fixes for the SNP stack, which require updates on the trustee side. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 13:54:15 +08:00
Xynnn007	aae64fa3d6	agent: add agent.image_pull_timeout parameter This new parameter for kata-agent is used to control the timeout for a guest pull request. Note that sometimes an image can be really big, so we set default timeout to 1200 seconds (20 minutes). Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 13:54:15 +08:00
Xynnn007	93826ff90c	tests: update negative test log assertions After moving image pulling from kata-agent to CDH, the error messages for failed image pulls have changed slightly. This commit adapts the assertions to that change. Note that in both the original and the current image-rs implementation, a missing key and a wrong key result in the same error message. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 13:54:15 +08:00
Xynnn007	7420194ea8	build: abandon PULL_TYPE build env Now kata-agent by default supports both guest pull and host pull abilities, thus we do not need to specify the PULL_TYPE env when building kata-agent. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 13:53:55 +08:00
Xynnn007	44a6d1a6f7	docs: update guest pull document After moving guest pull abilities to CDH, the guest pull document should be updated to reflect the new workflow. Also, replace the PNG diagram with a mermaid one for better maintainability. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	105cb47991	agent: always try to override oci process spec In previous versions, the agent tried to override the OCI process spec only when the `guest-pull` feature was enabled at build time, the storage had a guest pull volume, and it was a sandbox. Now that the feature is gone, whether guest pull is used is determined at runtime, so we can always attempt the override by checking whether the storages contain a Kata guest pull volume and it is a sandbox. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	6b1249186f	agent: embed ocicrypt config in rootfs by default The ocicrypt configuration used by CDH is now always the same, and it is not good practice for kata-agent to write it into the rootfs at runtime. Thus we move it to the coco-guest-components build script. The config will be embedded into the guest image/initrd together with the CDH binary. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	22e65024ce	agent: get rid of pull-type option The features `guest-pull` and `default-pull` are both removed, because guest pull and host pull are now both supported at build time without pulling in extra dependencies such as image-rs, as was needed before. Guest pull now depends on the CDH process, not on a build-time feature. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	0e15b49369	agent: get rid of init_image_service We do not need to initialize the image service in kata-agent now, as it is initialized in CDH. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	22c50cae7b	agent: let image_pull_handler call cdh to pull image This is the higher-level call that pulls an image inside the guest. It should now call confidential_data_hub's API. The previous pull_image API did two things inside its own logic: 1. check whether it is a sandbox, and 2. generate the bundle_path. The new API does neither, to keep the API semantics clean, so we do both explicitly before calling it. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	39cd430994	agent: add ocicrypt_config envs for CDH process Now that the image pull ability has moved to CDH, the CDH process needs the ocicrypt environment variables to help find the keyprovider (cdh) used to decrypt images. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	f67f5c2b69	agent: remove image pull configs As the image pull ability has moved to CDH, kata-agent does not need the image pull configurations anymore. Reading all of these configurations from the kernel cmdline is now implemented by CDH. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	4436fe6d99	agent: move guest pull abilities to Confidential Data Hub Image pull abilities are all moved to the separate component Confidential Data Hub (CDH) and we only left the auxiliary functions except pull_image in confidential_data_hub/image.rs Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:10:09 +08:00
Xynnn007	5067aafd56	agent: move cdh.rs and image.rs to a separate module confidential_data_hub This is a small refactoring commit that moves the mods `cdh.rs` and `image.rs` to a directory module `confidential_data_hub`. The image pull ability will be moved into confidential data hub, so it is better to handle image pulling in the confidential data hub submodule. This commit also makes some changes to the original code. It gets rid of a static variable for the CDH timeout config and directly uses the global config variable's member. It also changes the `is_cdh_client_initialized` function to a sync version, as it does not need to be async. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:10:09 +08:00
Xynnn007	997a1f35ab	agent: add PullImage to CDH proto file CDH provides the image pull api. This commit adds the declaration of the API in the CDH proto file. This will be used in following commits. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:10:09 +08:00
Xuewei Niu	c27116fa8e	Merge pull request #11416 from lifupan/prealloc runtime-rs: add the memory prealloc support for qemu/ch	2025-06-15 11:01:05 +08:00
Xuewei Niu	b43a61e2c8	Merge pull request #11418 from microsoft/saulparedes/flag_secure_mount agent: add feature flag to secure_mount method	2025-06-15 10:59:20 +08:00
Saul Paredes	cdfc9fd2d9	agent: add feature flag to secure_mount method This method is not used when guest-pull is not used. Add a flag that prevents a compile error when building with rust version > 1.84.0 and not using guest-pull Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-06-13 11:25:58 -07:00
Fabiano Fidêncio	6f0ea595b7	Merge pull request #11402 from microsoft/danmihai1/disable-nvdimm runtime: build variable for disable_image_nvdimm=true	2025-06-13 16:35:57 +02:00
Dan Mihai	0f8e453518	Merge pull request #11412 from katexochen/rego-v1 genpolicy: fix rules syntax issues, rego v1 compatibility; ci: checks for rego parsing	2025-06-13 07:30:34 -07:00
Paweł Bęza	91db41227f	runtime: Fix logging on virtiofs shutdown Fixes a confusing log message shown when Virtio-FS is disabled. Previously we logged “The virtiofsd had stopped” regardless of whether Virtio-FS was actually enabled or not. Signed-off-by: Paweł Bęza <pawel.beza99@gmail.com>	2025-06-13 15:59:52 +02:00
Fupan Li	5163156676	runtime-rs: add the memory prealloc support for cloud-hypervisor Add the memory prealloc support for cloud hypervisor too. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-13 16:26:11 +08:00
Fupan Li	fb7cfcd2fb	runtime-rs: add the memory prealloc support for qemu Add the memory prealloc support for the qemu hypervisor. When it is enabled, all of the memory will be allocated and locked. This is useful when you want to reserve all the memory upfront or in cases where you want memory latencies to be very predictable. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-13 16:26:03 +08:00
Steve Horsman	707b8b8a98	Merge pull request #11374 from kata-containers/dependabot/cargo/src/dragonball/tracing-1900da1d01 build(deps): bump the tracing group across 7 directories with 1 update	2025-06-13 08:30:37 +01:00
dependabot[bot]	1e6962e4a8	build(deps): bump the tracing group across 7 directories with 1 update Bumps the tracing group with 1 update in the /src/dragonball directory: [tracing](https://github.com/tokio-rs/tracing). Bumps the tracing group with 1 update in the /src/libs directory: [tracing](https://github.com/tokio-rs/tracing). Bumps the tracing group with 1 update in the /src/tools/agent-ctl directory: [tracing](https://github.com/tokio-rs/tracing). Bumps the tracing group with 1 update in the /src/tools/genpolicy directory: [tracing](https://github.com/tokio-rs/tracing). Bumps the tracing group with 1 update in the /src/tools/kata-ctl directory: [tracing](https://github.com/tokio-rs/tracing). Bumps the tracing group with 1 update in the /src/tools/runk directory: [tracing](https://github.com/tokio-rs/tracing). Bumps the tracing group with 1 update in the /src/tools/trace-forwarder directory: [tracing](https://github.com/tokio-rs/tracing). Updates `tracing` from 0.1.37 to 0.1.41 - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.37...tracing-0.1.41) Updates `tracing` from 0.1.34 to 0.1.41 - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.37...tracing-0.1.41) Updates `tracing` from 0.1.37 to 0.1.41 - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.37...tracing-0.1.41) Updates `tracing` from 0.1.37 to 0.1.41 - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.37...tracing-0.1.41) Updates `tracing` from 0.1.40 to 0.1.41 - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.37...tracing-0.1.41) Updates `tracing` from 0.1.40 to 0.1.41 - [Release 
notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.37...tracing-0.1.41) Updates `tracing` from 0.1.29 to 0.1.41 - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.37...tracing-0.1.41) --- updated-dependencies: - dependency-name: tracing dependency-version: 0.1.41 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: tracing - dependency-name: tracing dependency-version: 0.1.41 dependency-type: indirect update-type: version-update:semver-patch dependency-group: tracing - dependency-name: tracing dependency-version: 0.1.41 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: tracing - dependency-name: tracing dependency-version: 0.1.41 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: tracing - dependency-name: tracing dependency-version: 0.1.41 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: tracing - dependency-name: tracing dependency-version: 0.1.41 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: tracing - dependency-name: tracing dependency-version: 0.1.41 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: tracing ... Signed-off-by: dependabot[bot] <support@github.com>	2025-06-12 15:45:35 +00:00
Steve Horsman	6bdc0cf495	Merge pull request #11417 from kata-containers/sprt/revert-validate-ok-to-test Revert "ci: gha: Remove ok-to-test label on every push"	2025-06-12 15:04:44 +01:00
Aurélien Bombo	5200034642	Revert "ci: gha: Remove ok-to-test label on every push" This reverts commit `2ee3470627`. This is mostly redundant given we already have workflow approval for external contributors. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-12 08:40:06 -05:00
Paul Meyer	64906e6973	tests/static-checks: parse rego with opa and regorus Ensure rego policies in tree can be parsed using opa and regorus. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-12 14:59:39 +02:00
Paul Meyer	107e7dfdf6	ci/static-checks: install regorus Make regorus available for static checks as prerequisite for rego checks. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-12 14:59:39 +02:00
Steve Horsman	843655c352	Merge pull request #11411 from stevenhorsman/runk-users-crate-switch runk: Switch users crate	2025-06-12 10:35:31 +01:00
Paul Meyer	71796f7b12	ci/static-checks: install opa Make open-policy-agent available for static checks as prerequisite for rego checks. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-12 10:46:43 +02:00
Paul Meyer	5baea34fff	genpolicy/rules: rego v1 compatibility Migrate policy to rego v1. See https://www.openpolicyagent.org/docs/v0-upgrade#changes-to-rego-in-opa-v10 Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-12 10:46:43 +02:00
Fupan Li	7c1f8c9009	Merge pull request #10697 from Apokleos/no-sharefs runtime-rs: Support shared_fs = "none" for CoCo	2025-06-12 11:48:00 +08:00
Fupan Li	a495dec9f4	Merge pull request #11305 from RuoqingHe/bump-rust-1.85.1 versions: Bump Rust from 1.80.0 to 1.85.1	2025-06-12 10:21:38 +08:00
Ruoqing He	26c7f941aa	versions: Bump rust to 1.85.1 As discussed in 2025-05-22's AC call, bump rust toolchain to 1.85.1. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	5011253818	agent-ctl: Bump `ttrpc-codegen` related dependencies Bump `ttrpc-codegen` related dependencies in response to `ttrpc-codegen` bump in `libs/protocol`. Relates: #11376 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	ba75b3299f	dragonball: Fix clippy `elided_named_lifetimes` Manually fix `elided_named_lifetimes` clippy warning reported by rust 1.85.1. ```console error: elided lifetime has a name --> src/vm/aarch64.rs:113:10 \| 107 \| fn get_fdt_vm_info<'a>( \| -- lifetime `'a` declared here ... 113 \| ) -> FdtVmInfo { \| ^^^^^^^^^ this elided lifetime gets resolved as `'a` \| = note: `-D elided-named-lifetimes` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(elided_named_lifetimes)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	1bbedb8def	dragonball: Fix clippy `repr_packed_without_abi` Fix `repr_packed_without_abi` clippy warning as suggested by rust 1.85.1. ```console error: item uses `packed` representation without ABI-qualification --> dbs_pci/src/msi.rs:468:1 \| 466 \| #[repr(packed)] \| ------ `packed` representation set here 467 \| #[derive(Clone, Copy, Default, PartialEq)] 468 \| / pub struct MsiState { 469 \| \| msg_ctl: u16, 470 \| \| msg_addr_lo: u32, 471 \| \| msg_addr_hi: u32, 472 \| \| msg_data: u16, 473 \| \| mask_bits: u32, 474 \| \| } \| \|_^ \| = warning: unqualified `#[repr(packed)]` defaults to `#[repr(Rust, packed)]`, which has no stable ABI = help: qualify the desired ABI explicity via `#[repr(C, packed)]` or `#[repr(Rust, packed)]` = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#repr_packed_without_abi = note: `-D clippy::repr-packed-without-abi` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::repr_packed_without_abi)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	e8be3c13fb	dragonball: Fix clippy `missing_docs` Fix `missing_docs` clippy warning as suggested by rust 1.85.1. ```console error: missing documentation for an associated function --> src/device_manager/mod.rs:1299:9 \| 1299 \| pub fn new_test_mgr() -> Self { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ \| = note: `-D missing-docs` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(missing_docs)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	ceff1ed98d	dragonball: Fix clippy `needless_lifetimes` Fix `needless_lifetimes` clippy warning as suggested by rust 1.85.1. ```console error: the following explicit lifetimes could be elided: 'a --> dbs_virtio_devices/src/vhost/vhost_user/connection.rs:137:6 \| 137 \| impl<'a, AS: GuestAddressSpace, Q: QueueT, R: GuestMemoryRegion> EndpointParam<'a, AS, Q, R> { \| ^^ ^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_lifetimes = note: `-D clippy::needless-lifetimes` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::needless_lifetimes)]` help: elide the lifetimes \| 137 - impl<'a, AS: GuestAddressSpace, Q: QueueT, R: GuestMemoryRegion> EndpointParam<'a, AS, Q, R> { 137 + impl<AS: GuestAddressSpace, Q: QueueT, R: GuestMemoryRegion> EndpointParam<'_, AS, Q, R> { \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	c04f1048d5	dragonball: Fix clippy `unnecessary_lazy_evaluations` Fix `unnecessary_lazy_evaluations` clippy warning as suggested by rust 1.85.1. ```console error: unnecessary closure used to substitute value for `Option::None` --> dbs_virtio_devices/src/vhost/vhost_user/block.rs:225:28 \| 225 \| let vhost_socket = config_path \| ____________________________^ 226 \| \| .strip_prefix("spdk://") 227 \| \| .ok_or_else(\|\| VirtIoError::InvalidInput)? \| \|_____________________________________________________^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_lazy_evaluations = note: `-D clippy::unnecessary-lazy-evaluations` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::unnecessary_lazy_evaluations)]` help: use `ok_or` instead \| 227 \| .ok_or(VirtIoError::InvalidInput)? \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
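The `ok_or_else` to `ok_or` change reported above can be sketched in isolation. This is a minimal illustration, not the dragonball code: the `VirtIoError` enum below is a hypothetical one-variant stand-in, and `vhost_socket` is an assumed helper name.

```rust
/// Hypothetical stand-in for the `VirtIoError` type named in the lint output.
#[derive(Debug, PartialEq)]
pub enum VirtIoError {
    InvalidInput,
}

/// Strip the `spdk://` scheme from a config path (assumed helper, for illustration).
pub fn vhost_socket(config_path: &str) -> Result<&str, VirtIoError> {
    // `ok_or` takes the error value directly. A closure (`ok_or_else`) only
    // pays off when building the error is expensive, which a unit variant is not.
    config_path
        .strip_prefix("spdk://")
        .ok_or(VirtIoError::InvalidInput)
}
```

A path without the `spdk://` prefix yields `Err(VirtIoError::InvalidInput)`; otherwise the remainder of the path is returned.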
Ruoqing He	16b45462a1	dragonball: Fix clippy `manual_inspect` Manually fix `manual_inspect` clippy warning reported by rust 1.85.1. ```console error: using `map_err` over `inspect_err` --> dbs_virtio_devices/src/net.rs:753:52 \| 753 \| self.device_info.read_config(offset, data).map_err(\|e\| { \| ^^^^^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_inspect = note: `-D clippy::manual-inspect` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::manual_inspect)]` help: try \| 753 ~ self.device_info.read_config(offset, data).inspect_err(\|e\| { 754 ~ self.metrics.cfg_fails.inc(); \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
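A small sketch of the `map_err` to `inspect_err` pattern from the row above, assuming a hypothetical `read_config` function and a plain `Cell<u32>` counter in place of the real metrics type. `Result::inspect_err` lets the closure observe the error by reference without having to return it.

```rust
use std::cell::Cell;

/// Hypothetical config read that bumps a failure counter on error.
/// `cfg_fails` stands in for a metrics counter; the real type is not shown in the log.
pub fn read_config(fail: bool, cfg_fails: &Cell<u32>) -> Result<u8, String> {
    let raw: Result<u8, String> = if fail {
        Err("bad offset".to_string())
    } else {
        Ok(7)
    };
    // With `map_err` the closure would have to end with `e` to pass the error on;
    // `inspect_err` only borrows the error and forwards the `Result` unchanged.
    raw.inspect_err(|_e| cfg_fails.set(cfg_fails.get() + 1))
}
```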
Ruoqing He	5e80293bfc	dragonball: Fix clippy `empty_line_after_doc_comments` Fix `empty_line_after_doc_comments` clippy warning as suggested by rust 1.85.1. ```console error: empty line after doc comment --> dbs_boot/src/x86_64/layout.rs:11:1 \| 11 \| / /// Magic addresses externally used to lay out x86_64 VMs. 12 \| \| \| \|_^ 13 \| /// Global Descriptor Table Offset 14 \| pub const BOOT_GDT_OFFSET: u64 = 0x500; \| ------------------------------ the comment documents this constant \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#empty_line_after_doc_comments = note: `-D clippy::empty-line-after-doc-comments` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::empty_line_after_doc_comments)]` = help: if the empty line is unintentional remove it help: if the documentation should include the empty line include it in the comment \| 12 \| /// \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	bb13b6696e	dragonball: Fix clippy `manual_div_ceil` Fix `manual_div_ceil` clippy warning as suggested by rust 1.85.1. ```console error: manually reimplementing `div_ceil` --> dbs_interrupt/src/kvm/mod.rs:202:24 \| 202 \| let elem_cnt = (total_sz + elem_sz - 1) / elem_sz; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider using `.div_ceil()`: `total_sz.div_ceil(elem_sz)` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_div_ceil = note: `-D clippy::manual-div-ceil` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::manual_div_ceil)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	e58bd52dd8	dragonball: Fix clippy `precedence` Fix `precedence` clippy warning as suggested by rust 1.85.1. ```console error: operator precedence can trip the unwary --> dbs_interrupt/src/kvm/mod.rs:169:6 \| 169 \| (u64::from(type1) << 48 \| u64::from(entry.type_) << 32) \| u64::from(entry.gsi) \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider parenthesizing your expression: `(u64::from(type1) << 48) \| (u64::from(entry.type_) << 32)` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#precedence = note: `-D clippy::precedence` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::precedence)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
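As a rough illustration of the `precedence` fix above: in Rust, `<<` already binds tighter than `|`, so the added parentheses do not change the result, they only make the grouping explicit. The function name and the field layout below are assumptions made for the sketch.

```rust
/// Pack three fields into one `u64`. Parameter names echo the lint output;
/// the function itself is a hypothetical example, not the dragonball code.
pub fn routing_key(type1: u32, entry_type: u32, gsi: u32) -> u64 {
    // Explicit parentheses around each shift, as clippy suggests.
    (u64::from(type1) << 48) | (u64::from(entry_type) << 32) | u64::from(gsi)
}
```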
Ruoqing He	44142b13d3	genpolicy: Fix clippy `unstable_name_collisions` Manually fix `unstable_name_collisions` clippy warning reported by rust 1.85.1. ```console error: a method with this name may be added to the standard library in the future --> src/registry.rs:646:10 \| 646 \| file.unlock()?; \| ^^^^^^ \| = warning: once this associated item is added to the standard library, the ambiguity may cause an error or change in behavior! = note: for more information, see issue #48919 <https://github.com/rust-lang/rust/issues/48919> = help: call with fully qualified syntax `fs2::FileExt::unlock(...)` to keep using the current method = note: `-D unstable-name-collisions` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(unstable_name_collisions)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	366d293141	genpolicy: Fix clippy `manual_unwrap_or_default` Manually fix `manual_unwrap_or_default` clippy warning reported by rust 1.85.1. ```console error: if let can be simplified with `.unwrap_or_default()` --> src/registry.rs:619:37 \| 619 \| let mut data: Vec<ImageLayer> = if let Ok(vec) = serde_json::from_reader(read_file) { \| _____________________________________^ 620 \| \| vec 621 \| \| } else { ... \| 624 \| \| }; \| \|_____^ help: replace it with: `serde_json::from_reader(read_file).unwrap_or_default()` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_unwrap_or_default = note: `-D clippy::manual-unwrap-or-default` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::manual_unwrap_or_default)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	a71a77bfa3	genpolicy: Fix clippy `manual_div_ceil` Manually fix `manual_div_ceil` clippy warning reported by rust 1.85.1. ```console error: manually reimplementing `div_ceil` --> src/verity.rs:73:25 \| 73 \| let count = (data_size + entry_size - 1) / entry_size; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider using `.div_ceil()`: `data_size.div_ceil(entry_size)` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_div_ceil = note: `-D clippy::manual-div-ceil` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::manual_div_ceil)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
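The `manual_div_ceil` rewrite can be sketched side by side. The function names below are hypothetical; only the two expressions come from the lint output. One practical difference worth noting is that the manual form can overflow in `data_size + entry_size - 1`, while `div_ceil` does not compute that intermediate sum.

```rust
/// Manual ceiling division, as flagged by clippy.
/// Overflows (panics in debug builds) if `data_size + entry_size` exceeds `u64::MAX`.
pub fn entry_count_manual(data_size: u64, entry_size: u64) -> u64 {
    (data_size + entry_size - 1) / entry_size
}

/// The suggested replacement using the standard library's `div_ceil`.
pub fn entry_count(data_size: u64, entry_size: u64) -> u64 {
    data_size.div_ceil(entry_size)
}
```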
Ruoqing He	5d491bd4f4	genpolicy: Bump `ttrpc-codegen` related dependencies Bump `ttrpc-codegen` related dependencies in response to `ttrpc-codegen` bump in `libs/protocol`. Relates: #11376 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	965f1d799c	kata-ctl: Fix clippy `empty_line_after_outer_attr` Manually fix `empty_line_after_outer_attr` clippy warning reported by rust 1.85.1. ```console error: empty line after outer attribute --> src/check.rs:515:9 \| 515 \| / #[allow(dead_code)] 516 \| \| \| \|_^ 517 \| struct TestData<'a> { \| ------------------- the attribute applies to this struct \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#empty_line_after_outer_attr = note: `-D clippy::empty-line-after-outer-attr` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::empty_line_after_outer_attr)]` = help: if the empty line is unintentional remove it ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	3d64b11454	kata-ctl: Fix clippy `question_mark` Manually fix `question_mark` clippy warning reported by rust 1.85.1. ```console error: this `match` expression can be replaced with `?` --> src/ops/check_ops.rs:49:13 \| 49 \| let f = match get_builtin_check_func(check) { \| _____________^ 50 \| \| Ok(fp) => fp, 51 \| \| Err(e) => return Err(e), 52 \| \| }; \| \|_____^ help: try instead: `get_builtin_check_func(check)?` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#question_mark = note: `-D clippy::question-mark` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::question_mark)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	702ba4033e	kata-ctl: Bump `ttrpc-codegen` related dependencies Bump `ttrpc-codegen` related dependencies in response to `ttrpc-codegen` bump in `libs/protocol`. Relates: #11376 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	f70c17660a	runtime-rs: Fix clippy `unnecessary_map_or` Fix `unnecessary_map_or` clippy warning as suggested by rust 1.85.1. error: this `map_or` can be simplified --> crates/hypervisor/src/ch/inner_hypervisor.rs:1054:24 \| 1054 \| let have_tdx = fs::read(TDX_KVM_PARAMETER_PATH) \| ________________________^ 1055 \| \| .map_or(false, \|content\| !content.is_empty() && content[0] == b'Y'); \| \|_______________________________________________________________________________^ help: use is_ok_and instead: `fs::read(TDX_KVM_PARAMETER_PATH).is_ok_and(\|content\| !content.is_empty() && content[0] == b'Y')` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_map_or = note: `-D clippy::unnecessary-map-or` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::unnecessary_map_or)]` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
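A minimal sketch of the `is_ok_and` form suggested above, written against an arbitrary path rather than `TDX_KVM_PARAMETER_PATH` so it can be exercised without TDX hardware. The function name `param_is_enabled` is an assumption.

```rust
use std::fs;
use std::path::Path;

/// Returns true when the file at `path` is readable and its first byte is `Y`.
/// Hypothetical helper; the predicate mirrors the one in the lint output.
pub fn param_is_enabled(path: &Path) -> bool {
    // `is_ok_and` replaces `map_or(false, ...)`: an unreadable file yields false.
    fs::read(path).is_ok_and(|content| !content.is_empty() && content[0] == b'Y')
}
```

A missing or unreadable file, an empty file, and a file starting with any other byte all report `false`.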
Ruoqing He	d7dfab92be	runtime-rs: Fix clippy `manual_inspect` Manually fix `manual_inspect` clippy warning reported by rust 1.85.1. ```console error: using `map` over `inspect` --> crates/resource/src/cdi_devices/container_device.rs:50:10 \| 50 \| .map(\|device\| { \| ^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_inspect = note: `-D clippy::manual-inspect` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::manual_inspect)]` help: try \| 50 ~ .inspect(\|device\| { 51 \| // push every device's Device to agent_devices 52 ~ devices_agent.push(device.device.clone()); \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	4c467f57de	runtime-rs: Fix clippy `needless_return` Fix `needless_return` clippy warning as suggested by rust 1.85.1. ```console error: unneeded `return` statement --> crates/resource/src/rootfs/nydus_rootfs.rs:199:5 \| 199 \| return Some(prefetch_list_path.display().to_string()); \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_return = note: `-D clippy::needless-return` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::needless_return)]` help: remove `return` \| 199 - return Some(prefetch_list_path.display().to_string()); 199 + Some(prefetch_list_path.display().to_string()) \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	23365fc7e2	runtime-rs: Bump `ttrpc-codegen` related dependencies Bump `ttrpc-codegen` related dependencies in response to `ttrpc-codegen` bump in `libs/protocol`. Relates: #11376 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	bd4d9cf67c	agent: Fix clippy `empty_line_after_doc_comments` Manually fix `empty_line_after_doc_comments` clippy warning reported by rust 1.85.1. ```console error: empty line after doc comment --> src/linux_abi.rs:8:1 \| 8 \| / /// Linux ABI related constants. 9 \| \| \| \|_^ 10 \| #[cfg(target_arch = "aarch64")] 11 \| use std::fs; \| ------- the comment documents this import \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#empty_line_after_doc_comments = note: `-D clippy::empty-line-after-doc-comments` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::empty_line_after_doc_comments)]` = help: if the empty line is unintentional remove it ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Paul Meyer	d488c998c7	genpolicy/rules: fix syntax issue Policy wasn't parsable with OPA due to surplus whitespace. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-11 14:48:36 +02:00
Steve Horsman	c8fcda0d73	Merge pull request #11407 from Champ-Goblem/fix/nvidia-rootfs-only-copy-opa-when-agent-policy-enabled nvidia-rootfs: only copy `kata-opa` if `AGENT_POLICY` is enabled	2025-06-11 13:39:07 +01:00
stevenhorsman	39f51b4c6d	runk: Switch users crate The users@0.11.0 has a high severity CVE-2025-5791 and doesn't seem to be maintained, so switch to uzers which forked it. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-11 12:03:28 +01:00
Champ-Goblem	d6c45027f5	nvidia-rootfs: only copy `kata-opa` if `AGENT_POLICY` is enabled In the nvidia rootfs build, only copy in `kata-opa` if `AGENT_POLICY` is enabled. This fixes builds when `AGENT_POLICY` is disabled and opa is not built. Signed-off-by: Champ-Goblem <cameron@northflank.com>	2025-06-11 11:25:10 +02:00
Ruoqing He	2ccb306c0b	agent: Fix clippy `precedence` Fix `precedence` clippy warning as suggested by rust 1.85.1. ```console warning: operator precedence can trip the unwary --> src/pci.rs:54:19 \| 54 \| Ok(SlotFn(ss8 << FUNCTION_BITS \| f8)) \| ^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider parenthesizing your expression: `(ss8 << FUNCTION_BITS) \| f8` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#precedence ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 07:18:09 +00:00
Ruoqing He	048178bc5e	agent: Fix clippy `unnecessary_get_then_check` Manually fix `unnecessary_get_then_check` clippy warning as suggested by rust 1.85.1. ```console warning: unnecessary use of `get(&shared_mount.src_ctr).is_none()` --> src/sandbox.rs:431:25 \| 431 \| if src_ctrs.get(&shared_mount.src_ctr).is_none() { \| ---------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ \| \| \| help: replace it with: `!src_ctrs.contains_key(&shared_mount.src_ctr)` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_get_then_check ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 07:18:09 +00:00
Ruoqing He	54ec432178	agent: Fix clippy `partialeq_to_none` Fix `partialeq_to_none` clippy warning as suggested by rust 1.85.1. ```console warning: binary comparison to literal `Option::None` --> src/sandbox.rs:431:16 \| 431 \| if src_ctrs.get(&shared_mount.src_ctr) == None { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: use `Option::is_none()` instead: `src_ctrs.get(&shared_mount.src_ctr).is_none()` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#partialeq_to_none ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 07:18:09 +00:00
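The two agent commits above move the same check from `== None` through `.is_none()` to `!contains_key(...)`. A sketch of the final form, assuming a `HashMap<String, u32>` as a stand-in for the real container map and a hypothetical function name:

```rust
use std::collections::HashMap;

/// True when `src_ctr` is not a known source container.
/// The map's value type is a placeholder chosen for this sketch.
pub fn is_unknown_source(src_ctrs: &HashMap<String, u32>, src_ctr: &str) -> bool {
    // Equivalent to `src_ctrs.get(src_ctr).is_none()`, without building an Option
    // just to discard it.
    !src_ctrs.contains_key(src_ctr)
}
```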
Ruoqing He	95dca31ecc	agent: Fix clippy `question_mark` Fix `question_mark` clippy warning as suggested by rust 1.85.1. ```console warning: this `match` expression can be replaced with `?` --> rustjail/src/cgroups/fs/mod.rs:1327:20 \| 1327 \| let dev_type = match DeviceType::from_char(d.typ().as_str().chars().next()) { \| ____________________^ 1328 \| \| Some(t) => t, 1329 \| \| None => return None, 1330 \| \| }; \| \|_____^ help: try instead: `DeviceType::from_char(d.typ().as_str().chars().next())?` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#question_mark ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 07:18:09 +00:00
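The `question_mark` lint applies to `Option` as well as `Result`. The sketch below assumes a simplified, hypothetical `DeviceType` with two variants and an illustrative `parse_rule` function; only the shape of the `match` to `?` rewrite is taken from the lint output.

```rust
/// Hypothetical, simplified device type for illustration only.
#[derive(Debug, PartialEq)]
pub enum DeviceType {
    Char,
    Block,
}

impl DeviceType {
    fn from_char(c: Option<char>) -> Option<DeviceType> {
        match c? {
            'c' => Some(DeviceType::Char),
            'b' => Some(DeviceType::Block),
            _ => None,
        }
    }
}

/// Parse a (type, major) pair, returning `None` on an unknown type.
pub fn parse_rule(typ: &str, major: i64) -> Option<(DeviceType, i64)> {
    // Before: `match ... { Some(t) => t, None => return None }`.
    let dev_type = DeviceType::from_char(typ.chars().next())?;
    Some((dev_type, major))
}
```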
Ruoqing He	5a95a65604	agent: Fix clippy `unnecessary_map_or` Fix `unnecessary_map_or` clippy warning as suggested by rust 1.85.1. ```console warning: this `map_or` can be simplified --> rustjail/src/container.rs:1424:20 \| 1424 \| if namespace \| ____________________^ 1425 \| \| .path() 1426 \| \| .as_ref() 1427 \| \| .map_or(true, \|p\| p.as_os_str().is_empty()) \| \|_______________________________________________________________^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_map_or help: use is_none_or instead \| 1424 ~ if namespace 1425 + .path() 1426 + .as_ref().is_none_or(\|p\| p.as_os_str().is_empty()) \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 07:18:09 +00:00
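A sketch of the `map_or(true, ...)` to `is_none_or(...)` rewrite shown above. `Option::is_none_or` requires a recent toolchain (it is available with the rust 1.85.1 mentioned in the log). The function name and the `Option<&Path>` parameter are assumptions for illustration.

```rust
use std::path::Path;

/// True when no namespace path is given, or the given path is empty.
/// Hypothetical helper; the predicate mirrors the one in the lint output.
pub fn path_missing_or_empty(path: Option<&Path>) -> bool {
    // Reads as "none, or the predicate holds", replacing `map_or(true, ...)`.
    path.is_none_or(|p| p.as_os_str().is_empty())
}
```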
Ruoqing He	f9c76edd23	agent: Fix clippy `manual_inspect` Manually fix `manual_inspect` clippy warning reported by rust 1.85.1. ```console warning: using `map_err` over `inspect_err` --> rustjail/src/mount.rs:881:6 \| 881 \| .map_err(\|e\| { \| ^^^^^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_inspect help: try \| 881 ~ .inspect_err(\|&e\| { 882 ~ log_child!(cfd_log, "mount error: {:?}", e); \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 07:18:09 +00:00
Ruoqing He	7ff34f00c2	agent: Fix clippy `single_match` Fix `single_match` clippy warning as suggested by rust 1.85.1. ```console warning: you seem to be trying to use `match` for destructuring a single pattern. Consider using `if let` --> src/image.rs:241:9 \| 241 \| / match oci.annotations() { 242 \| \| Some(a) => { 243 \| \| if ImageService::is_sandbox(a) { 244 \| \| return ImageService::get_pause_image_process(); ... \| 247 \| \| None => {} 248 \| \| } \| \|_________^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#single_match help: try \| 241 ~ if let Some(a) = oci.annotations() { 242 + if ImageService::is_sandbox(a) { 243 + return ImageService::get_pause_image_process(); 244 + } 245 + } \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 07:18:09 +00:00
Alex Lyn	e99070afb4	Merge pull request #11343 from Apokleos/cc-blk-sharefs Enables block device and disable virtio-fs	2025-06-11 11:52:52 +08:00
Alex Lyn	2d570db08b	Merge pull request #11179 from Apokleos/tdx-qemu-rs runtime-rs: Add TDX Support to runtime-rs for Confidential Containers (CoCo)	2025-06-11 10:27:36 +08:00
alex.lyn	2e9d27c500	runtime-rs: Enables block device and disable virtio-fs via capabilities Kata runtime employs a CapabilityBits mechanism for VMM capability governance. Fundamentally, this mechanism utilizes predefined feature flags to manage the VMM's operational boundaries. To meet demands for storage performance and security, it's necessary to explicitly enable capability flags such as `BlockDeviceSupport` (basic block device support) and `BlockDeviceHotplugSupport` (block device hotplug) which ensures the VMM provides the expected caps. In CoCo scenarios, due to the potential risks of sensitive data leaks or side-channel attacks introduced by virtio-fs through shared file systems, the `FsSharingSupport` flag must be forcibly disabled. This disables the virtio-fs feature at the capability set level, blocking insecure data channels. Fixes #11341 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-11 10:19:13 +08:00
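The capability-flag approach described in the commit above can be sketched with plain bit flags. This is an assumed, simplified model, not the kata-types `CapabilityBits` implementation: the constant values, the `Capabilities` struct and `confidential_caps` are all hypothetical; only the three capability names come from the commit message.

```rust
// Hypothetical bit assignments for the capabilities named in the commit message.
pub const BLOCK_DEVICE_SUPPORT: u8 = 1 << 0;
pub const BLOCK_DEVICE_HOTPLUG_SUPPORT: u8 = 1 << 1;
pub const FS_SHARING_SUPPORT: u8 = 1 << 2;

/// Minimal capability set backed by a bitmask (illustrative only).
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct Capabilities {
    flags: u8,
}

impl Capabilities {
    /// Enable every capability in `flags`.
    pub fn set(&mut self, flags: u8) {
        self.flags |= flags;
    }

    /// Disable every capability in `flags`.
    pub fn clear(&mut self, flags: u8) {
        self.flags &= !flags;
    }

    /// Check whether a single capability is enabled.
    pub fn has(&self, flag: u8) -> bool {
        (self.flags & flag) != 0
    }
}

/// Capability set for a confidential guest: block devices on, fs sharing off.
pub fn confidential_caps() -> Capabilities {
    let mut caps = Capabilities::default();
    caps.set(BLOCK_DEVICE_SUPPORT | BLOCK_DEVICE_HOTPLUG_SUPPORT);
    caps.clear(FS_SHARING_SUPPORT);
    caps
}
```

Clearing the sharing flag explicitly, even on a fresh set, keeps the intent visible if the defaults ever change.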
alex.lyn	23340b6b5f	runtime-rs: Support cold plug of block devices via virtio-blk for Qemu Two key scenarios: (1) Support `virtio-blk-pci` cold plug capability for confidential guests instead of the nvdimm device in the CVM, due to security constraints in CoCo cases. (2) Push the initdata payload into a compressed raw block device and insert it into the CVM through the `virtio-blk-pci` cold plug mechanism. Fixes #11341 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-11 10:19:13 +08:00
RuoqingHe	7916db9613	Merge pull request #11345 from Apokleos/fix-noise protocols: Fix the noise caused by non-formatted codes in protocols	2025-06-11 09:50:02 +08:00
Aurélien Bombo	66ae9473cb	Merge pull request #11397 from kata-containers/sprt/validate-ok-to-test ci: gha: Remove ok-to-test label on every push	2025-06-10 16:42:54 -05:00
Aurélien Bombo	31288ea7fc	Merge pull request #11398 from kata-containers/sprt/undo-mariner-hotfix Revert "ci: Fix Mariner rootfs build failure"	2025-06-10 16:09:08 -05:00
Aurélien Bombo	f34010cc94	Merge pull request #11388 from kata-containers/sprt/azure-oidc ci: Use OIDC to log into Azure	2025-06-10 13:08:44 -05:00
Steve Horsman	6424055eeb	Merge pull request #11393 from stevenhorsman/bump-chrono-0.4.41 libs: Bump chrono package	2025-06-10 16:47:18 +01:00
stevenhorsman	99e70100c7	workflows: Set persist-credentials: false on checkout By default the checkout action leaves the credentials in the checked-out repo's `.git/config`, which means they could get exposed. Use persist-credentials: false to prevent this from happening. Note: static-checks.yaml does use git diff after the checkout, but the git docs state that git diff is just local, so it doesn't need authentication. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-10 10:33:41 +01:00
RuoqingHe	5b8f7b2e3c	Merge pull request #11391 from RuoqingHe/disable-runtime-rs-test-on-riscv runtime-rs: Skip test on RISC-V architecture	2025-06-10 17:28:12 +08:00
Xuewei Niu	ac6779428f	Merge pull request #11377 from justxuewei/hvsock-logging	2025-06-10 16:45:59 +08:00
alex.lyn	c8433c6b70	kata-sys-util: Update TDX platform detection for newer TDX platforms On newer TDX platforms, checking `/sys/firmware/tdx` for `major_version` and `minor_version` is no longer necessary. Instead, we only need to verify that `/sys/module/kvm_intel/parameters/tdx` is set to `'Y'`. This commit addresses the following: (1) Removes the outdated check and corrects related code, primarily impacting `cloud-hypervisor`. (2) Refines the TDX platform detection logic within `arch_guest_protection`. Fixes #11177 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:31:25 +08:00
alex.lyn	8652aa7417	kata-types: Enable QGS port via configuration Currently, the TDX Quote Generation Service (QGS) connection in QEMU uses the default vsock port 4050 for TD attestation. To let users change the QGS port, this commit builds on the newly introduced qgs_port and makes the QGS port configurable via configuration. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:31:25 +08:00
alex.lyn	f8d1ee8b1c	kata-types: Introduce QGS port for TD attestation in Hypervisor config Currently, the TDX Quote Generation Service (QGS) connection in QEMU is hardcoded to vsock port 4050, which limits flexibility for TD attestation. To address this and let users modify the QGS port, this commit introduces a new qgs_port field within the security info, defaulting to 4050. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:31:25 +08:00
alex.lyn	49ced4d43c	runtime-rs: Prepare Tdx protection device in start sandbox During the prepare for `start sandbox` phase, this commit ensures the correct `ProtectionDeviceConfig` is prepared based on the `GuestProtection` type in a TEE platform. Specifically, for the TDX platform, this commit sets the essential parameters within the ProtectionDeviceConfig, including the TDX ID, firmware path, and the default QGS port (4050). This information is then passed to the underlying VMM for further processing using the existing ResourceManager and DeviceManager infrastructure. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:31:25 +08:00
alex.lyn	bab77e2d65	runtime-rs: Introduce Tdx Protection Device and add it into cmdline This patch introduces TdxConfig with key fields such as firmware, qgs_port, and mrconfigid. With this config, a new ProtectionDeviceConfig type `Tdx(TdxConfig)` is added. With this new type supported, we finally add the tdx protection device into the cmdline to launch a TDX-based CVM. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:31:25 +08:00
alex.lyn	09fddac2c4	runtime-rs: Introduce 'tdx-guest' object and its builder for TDX CVMs This commit introduces the `tdx-guest` object, designed to facilitate the launch of CVMs leveraging Intel's TDX. Launching a TDX-based CVM requires various properties, including `quote-generation-socket`, `mrconfigid`, and `sept-ve-disable`. (1) The `quote-generation-socket` property, of type `SocketAddress`, is added to the `tdx-guest` object and specifies the address of the Quote Generation Service (QGS). (2) The `mrconfigid` property, representing the SHA384 hash for non-owner-defined configurations of the guest TD, is introduced as a runtime or OS configuration parameter. (3) The `sept-ve-disable` property controls whether EPT violation conversions to #VE exceptions are disabled when the guest TD accesses PENDING pages. With the introduction of the `tdx-guest` object and its associated properties, launching TDX-based CVMs is now supported. For example, a TDX guest can be configured via the command line as follows: ```shell -object {"qom-type":"tdx-guest", "id":"tdx", "sept-ve-disable":true,\ "mrconfigid":"vHswGkzG4B3Kikg96sLQ5vPCYx4AtuB4Ubfzz9UOXvZtCGat8b8ok7Ubz4AxDDHh",\ "quote-generation-socket":{"type":"vsock","cid":"2","port":"4050"}} \ -machine q35,accel=kvm,confidential-guest-support=tdx ``` Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:31:25 +08:00
alex.lyn	1d4ffe6af3	runtime-rs: Implement serializable SocketAddress with Serde This enables consistent JSON representation of socket addresses across system components: (1) Add serde serialization/deserialization with standardized field naming convention. (2) Enforce string-based port/cid and unix/path representation for protocol compatibility. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:31:25 +08:00
alex.lyn	65931fb75f	protocols: Fix the noise caused by non-formatted codes in protocols ``` - decoded.strip_prefix("CAP_").unwrap_or(decoded) + decoded + .strip_prefix("CAP_") + .unwrap_or(decoded) .parse::<oci::Capability>() .unwrap_or_else(\|_\| panic!("Failed to parse {:?} to Enum Capability", cap)) }) @@ -1318,8 +1320,6 @@ mod tests { #[test] #[should_panic] fn test_cap_vec2hashset_bad() { - cap_vec2hashset(vec![ - "CAP_DOES_NOT_EXIST".to_string(), - ]); + cap_vec2hashset(vec!["CAP_DOES_NOT_EXIST".to_string()]); ``` Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:30:33 +08:00
alex.lyn	f3c8ef9200	kata-types: Support disabled sharefs with config of shared_fs = "none" For CoCo, shared_fs is prohibited as we cannot guarantee the security of guest/host sharing. Therefore, this PR enables administrators to configure shared_fs = "none" via the configuration.toml file, thereby enforcing the disablement of sharing. Fixes #10677 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:30:01 +08:00
Dan Mihai	d37feac679	tests: test mariner with disable_image_nvdimm=true Run the k8s tests on mariner with annotation disable_image_nvdimm=true, to use virtio-blk instead of nvdimm for the guest rootfs block device. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-06-10 02:03:31 +00:00
Dan Mihai	1aeef52bae	clh: runtime: add disable_image_nvdimm support Allow users to build using DEFDISABLEIMAGENVDIMM=true if they want to set disable_image_nvdimm=true in configuration-clh.toml. disable_image_nvdimm=false is the default config value. Also, use virtio-blk instead of nvdimm if disable_image_nvdimm=true in configuration-clh.toml. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-06-10 02:00:52 +00:00
Dan Mihai	0dd9325264	qemu: runtime: build variable for disable_image_nvdimm=true Allow users to build using DEFDISABLEIMAGENVDIMM=true if they want to set disable_image_nvdimm=true in configuration-qemu*.toml. disable_image_nvdimm=false is the default configuration value. Note that the value of disable_image_nvdimm gets ignored for platforms using "confidential_guest = true". Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-06-10 01:57:42 +00:00
Dan Mihai	d51e0c9875	snp: gpu: comment out disable_image_nvdimm config Comment out "disable_image_nvdimm = true" in: - configuration-qemu-snp.toml - configuration-qemu-nvidia-gpu-snp.toml for consistency with the other configuration-qemu*.toml files. Those two platforms are using "confidential_guest = true", and therefore the value of disable_image_nvdimm gets ignored. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-06-10 01:44:51 +00:00
stevenhorsman	ac9d3eb7be	libs: Bump chrono package Bump chrono package to 0.4.41 and thereby remove the time 0.1.43 dependency and remediate CVE-2020-26235 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-09 21:01:27 +01:00
Aurélien Bombo	004c1a4595	Revert "ci: Fix Mariner rootfs build failure" This reverts commit `dfa25a42ff`. The original issue was fixed: https://github.com/microsoft/azurelinux/issues/13971#issuecomment-2956384627	2025-06-09 14:06:07 -05:00
Aurélien Bombo	2ee3470627	ci: gha: Remove ok-to-test label on every push This removes the ok-to-test label on every push, except if the PR author has write access to the repo (ie. permission to modify labels). This protects against attackers who would initially open a genuine PR, then push malicious code after the initial review. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-09 12:37:06 -05:00
Aurélien Bombo	9488ce822d	Merge pull request #11396 from kata-containers/sprt/fix-mariner-image ci: Fix Mariner rootfs build failure	2025-06-09 12:32:14 -05:00
Aurélien Bombo	dfa25a42ff	ci: Fix Mariner rootfs build failure This implements a workaround for microsoft/azurelinux#13971 to unblock the CI. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-09 10:56:10 -05:00
Alex Lyn	2979312f7b	Merge pull request #11381 from RuoqingHe/log-instead-of-format runtime-rs: Log error instead of format	2025-06-09 11:54:13 +08:00
Ruoqing He	e290587f9c	runtime-rs: Skip test on RISC-V architecture The full test suite is not yet supported on the RISC-V architecture, so skip it for now. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-09 01:49:47 +00:00
Ruoqing He	781510202a	runtime-rs: Log error instead of format Log the error when the `umount` operation fails instead of building an error message with `format!`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-08 08:28:22 +00:00
Xuewei Niu	17b2daf0a7	Merge pull request #11357 from justxuewei/nxw/remove-dcode dragonball: Remove a useless dead_code attribute	2025-06-08 16:07:03 +08:00
Dan Mihai	e067a1be64	Merge pull request #11358 from burgerdev/gid-warning genpolicy: improvements to /etc/passwd checks	2025-06-06 17:04:27 -07:00
Aurélien Bombo	9dd3807467	ci: Use OIDC to log into Azure This completely eliminates the Azure secret from the repo, following the below guidance: https://docs.github.com/en/actions/security-for-github-actions/security-hardening-your-deployments/configuring-openid-connect-in-azure The federated identity is scoped to the `ci` environment, meaning: * I had to specify this environment in some YAMLs. I don't believe there's any downside to this. * As previously, the CI works seamlessly both from PRs and in the manual workflow. I also deleted the tools/packaging/kata-deploy/action folder as it doesn't seem to be used anymore, and it contains a reference to the secret. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-06 15:26:10 -05:00
Steve Horsman	31a8944da1	Merge pull request #11334 from kata-containers/remove-inherit-secrets workflows: Replace secrets: inherit	2025-06-06 16:41:13 +01:00
Steve Horsman	9555f2ce08	Merge pull request #11387 from burgerdev/riscv-artifact-name ci: fix artifact name of RISC-V tarball	2025-06-06 15:50:21 +01:00
stevenhorsman	66ef1c1198	workflows: Replace secrets: inherit Having secrets unconditionally being inherited is bad practice, so update the workflows to only pass through the minimal secrets that are needed Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-06 09:56:46 +01:00
stevenhorsman	89d038d2b4	workflows: Switch QUAY_DEPLOYER_USERNAME to var QUAY_DEPLOYER_USERNAME isn't sensitive, so update the secret for a var to simplify the workflows Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-06 09:49:14 +01:00
stevenhorsman	2eda21180a	workflows: Switch AUTHENTICATED_IMAGE_USER to var AUTHENTICATED_IMAGE_USER isn't sensitive, so update the secret for a var to simplify the workflows Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-06 09:49:14 +01:00
Markus Rudy	9ffed463a1	ci: fix artifact name of RISC-V tarball The artifact name accidentally referred to ARM64, which caused a clash in CI runs. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-06-06 08:29:48 +02:00
RuoqingHe	567296119d	Merge pull request #11317 from kimullaa/remove-obsolete-parameters runtime: remove hotplug_vfio_on_root_bus from config.toml	2025-06-06 04:03:03 +02:00
Steve Horsman	9ff650b641	Merge pull request #11383 from stevenhorsman/remove-docker-hub-publish Switch docker hub mirroring to ghcr.io	2025-06-05 17:16:18 +01:00
Shunsuke Kimura	5193cfedca	runtime: remove hotplug_vfio_on_root_bus from toml In this commit, hotplug_vfio_on_root_bus parameter is removed. <`dd422ccb69`> pcie_root_port parameter description (`This value is valid when hotplug_vfio_on_root_bus is true and machine_type is "q35"`) will have no value, and is not completely valid, since virt or DB also support root-ports, and CLH as well, so it is removed. Fixes: #11316 Co-authored-by: Zvonko Kaiser <zkaiser@nvidia.com> Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-06-05 21:53:06 +09:00
Steve Horsman	0f8104a2df	Merge pull request #11376 from RuoqingHe/upgrade-ttrpc-0.5.0 Upgrade `ttrpc-codegen` and `protobuf` to kill `#![allow(box_pointers)]`	2025-06-05 13:02:13 +01:00
stevenhorsman	6c6e16eef3	workflows: Remove docker hub registry publishing As docker hub has rate limiting issues, mirror quay.io to ghcr.io instead Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-05 11:46:51 +01:00
Markus Rudy	1c240de58d	genpolicy: don't parse /etc/passwd in a loop Instead of looping over the users per group and parsing passwd for each user, we can do the reverse lookup uid->user up front and then compare the names directly. This has the nice side-effect of silencing warnings about non-existent users mentioned in /etc/group, which is not relevant for policy decisions. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-06-04 17:54:57 +02:00
Markus Rudy	a1baaf6fe2	genpolicy: ignore groups with same name as user containerd does not automatically add groups to the list of additional GIDs when the groups have the same name as the user: https://github.com/containerd/containerd/blob/f482992/pkg/oci/spec_opts.go#L852-L854 This is a bug and should be corrected, but it has been present since at least 1.6.0 and thus affects almost all containerd deployments in existence. Thus, we adopt the same behavior and ignore groups with the same name as the user when calculating additional GIDs. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-06-04 10:29:49 +02:00
Xuewei Niu	77ca2fe88b	runtime-rs: Reduce the number of duplicate log entries being printed When connecting to guest through vsock, a log is printed for each failure. The failure comes from two main reasons: (1) the guest is not ready or (2) some real errors happen. Printing logs for the first case leads to log clutter, and your logs will look like this: ``` Feb 07 02:47:24 ubuntu containerd[520]: {"msg":"connect uds \"/run/kata/... Feb 07 02:47:24 ubuntu containerd[520]: {"msg":"connect uds \"/run/kata/... Feb 07 02:47:24 ubuntu containerd[520]: {"msg":"connect uds \"/run/kata/... Feb 07 02:47:24 ubuntu containerd[520]: {"msg":"connect uds \"/run/kata/... Feb 07 02:47:24 ubuntu containerd[520]: {"msg":"connect uds \"/run/kata/... ``` To avoid this, the sock implementations save the last error and return it after all retries are exhausted. Users are able to check all errors by setting the log level to trace. Reorganize the log format to "{sock type}: {message}" to make it clearer. Apart from that, errors returned by the socks use `self`, instead of `ConnectConfig`, since the `ConnectConfig` doesn't provide any useful information. Disable infinite loop for the log forwarder. There is retry logic in the sock implementations. We can consider the agent-log unavailable if `sock.connect()` encounters an error. Fixes: #10847 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-06-04 12:25:32 +08:00
Xuewei Niu	3f8dd821e6	dragonball: Remove a useless dead_code attribute The vhost-user-fs has been added to Dragonball, so we can remove `update_memory`'s dead_code attribute. Fixes: #8691 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-06-04 11:34:16 +08:00
Ruoqing He	77e68b164e	agent: Upgrade `ttrpc-codegen` to 0.5.0 Propagate `ttrpc-codegen` upgrade from `libs/protocols` to `agent`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-04 01:16:46 +00:00
Ryan Savino	1e686dbca7	agent: Remove casting and fix Arc declaration Removed unnecessary dynamic dispatch for services. Properly dereferenced service Box values and stored in Arc. Co-authored-by: Ruoqing He <heruoqing@iscas.ac.cn> Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn> Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-06-04 01:16:46 +00:00
Ruoqing He	0471f01074	libs: Bump `ttrpc-codegen` and `protobuf` Previous version of `ttrpc-codegen` is generating outdated `#![allow(box_pointers)]` which was deprecated. Bump `ttrpc-codegen` from v0.4.2 to v0.5.0 and `protobuf` from vx to v3.7.1 to get rid of this. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-04 01:16:18 +00:00
Markus Rudy	eeb3d1384b	genpolicy: compare additionalGIDs as sets The additional GIDs are handled by genpolicy as a BTreeSet. This set is then serialized to an ordered JSON array. On the containerd side, the GIDs are added to a list in the order they are discovered in /etc/group, and the main GID of the user is prepended to that list. This means that we don't have any guarantees that the input GIDs will be sorted. Since the order does not matter here, comparing the list of GIDs as sets is close enough. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-06-03 20:18:35 +02:00
Aurélien Bombo	8c3f8f8e21	Merge pull request #11339 from kata-containers/sprt/require-agent-ctl ci: Require agent-ctl tests	2025-06-03 11:58:33 -04:00
Steve Horsman	74e47382f8	Merge pull request #11016 from stevenhorsman/dependabot-configuration workflows: Add dependabot config	2025-06-03 15:12:32 +01:00
Steve Horsman	8176eefdac	Merge pull request #10748 from zvonkok/helm-doc doc: Add Helm Chart entry	2025-06-03 14:48:19 +01:00
Markus Rudy	02ad39ddf1	genpolicy: push down warning about missing passwd file The warning used to trigger even if the passwd file was not needed. This commit moves it down to where it actually matters. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-06-03 11:19:29 +02:00
Markus Rudy	ec969e4dcd	genpolicy: remove redundant group check https://github.com/kata-containers/kata-containers/pull/11077 established that the GID from the image config is never used for deriving the primary group of the container process. This commit removes the associated logic that derived a GID from a named group. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-06-03 10:59:10 +02:00
Zvonko Kaiser	985e965adb	doc: Added Helm Chart README.md We need more and accurate documentation. Let's start by providing a Helm Chart install doc and as a second step remove the kustomize steps. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Co-authored-by: Steve Horsman <steven@uk.ibm.com>	2025-06-02 23:26:16 +00:00
Dan Mihai	dc0da567cd	Merge pull request #11340 from microsoft/danmihai1/image-size-alignment image: custom guest rootfs image file size alignment	2025-06-02 14:33:21 -07:00
Dan Mihai	c2c194d860	kata-deploy: smaller guest image file for mariner Align up the mariner Guest image file size to 2M instead of the default 128M alignment. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-06-02 16:15:17 +00:00
Dan Mihai	65385a5bf9	image: custom guest rootfs image file size alignment The Guest rootfs image file size is aligned up to 128M boundary, since commit `2b0d5b2`. This change allows users to use a custom alignment value - e.g., to align up to 2M, users will be able to specify IMAGE_SIZE_ALIGNMENT_MB=2 for image_builder.sh. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-06-02 16:15:17 +00:00
Steve Horsman	c575048aa7	Merge pull request #11329 from Xynnn007/fix-initdata-snp Fix | Support initdata for SNP	2025-06-02 15:24:12 +01:00
stevenhorsman	ae352e7e34	ci: Add dependabot groups - Create groups for commonly seen cargo packages so that rather than getting up to 9 PRs for each rust components, bumps to the same package are grouped together. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-02 14:45:31 +01:00
stevenhorsman	a94388cf61	ci: Add dependabot config - Create a dependabot configuration to check for updates to our rust and golang packages each day and our github actions each month Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-02 14:45:31 +01:00
Xynnn007	8750eadff2	test: turn SNP on for initdata tests After the last commit, the initdata test on SNP should be ok. Thus we turn on this flag for CI. Fixes #11300 Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-02 20:33:19 +08:00
Xynnn007	39aa481da1	runtime: fix initdata support for SNP The qemu command line for SNP should start with `sev-snp-guest`, followed by the other parameters separated by ','. This patch fixes the parameter order. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-02 20:33:19 +08:00
Fabiano Fidêncio	57f3cb8b3b	Merge pull request #11344 from fidencio/topic/kernel-add-tuntap-move-memagent-stuff kernel: Add CONFIG_TUN (needed for VPNs) and move mem-agent related configs to common	2025-06-01 21:32:07 +02:00
RuoqingHe	51cc960cdd	Merge pull request #11346 from fidencio/topic/bump-cgroups-rs rust: Update cgroups-rs to its v0.3.5 release	2025-05-31 04:13:05 +02:00
Fabiano Fidêncio	48f8496209	Merge pull request #11327 from Champ-Goblem/agent/increase-limit-nofile agent: increase LimitNOFILE in the systemd service	2025-05-30 21:56:01 +02:00
Fabiano Fidêncio	02c46471fd	rust: Update cgroups-rs to its v0.3.5 release We're switching to using a rev as it may take some time for the package to be updated on crates.io. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-05-30 21:49:50 +02:00
Fabiano Fidêncio	dadbfd42c8	kernel: Move mem-agent configs to the common kernel build There's no benefit on keeping those restricted to the dragonball build, when they can be used with other VMMs as well (as long as they support the mem-agent). Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-05-30 21:49:22 +02:00
Champ-Goblem	a37080917d	kernel: Add CONFIG_TUN for VPN services TUN/TAP is a must for VPN related services. Signed-off-by: Champ-Goblem <cameron@northflank.com> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-05-30 21:49:22 +02:00
Fabiano Fidêncio	b8a7350a3d	Merge pull request #11324 from Champ-Goblem/runtime/fix-cgroup-deletion runtime: fix cgroupv2 deletion when sandbox_cgroup_only=false	2025-05-30 21:23:07 +02:00
Champ-Goblem	ef642fe890	runtime: fix cgroupv2 deletion when sandbox_cgroup_only=false Currently, when a new sandbox resource controller is created with cgroupsv2 and sandbox_cgroup_only is disabled, the cgroup management falls back to cgroupfs. During deletion, `IsSystemdCgroup` checks if the path contains `:` and tries to delete the cgroup via systemd. However, the cgroup was originally set up via cgroupfs and this process fails with `lstat /sys/fs/cgroup/kubepods.slice/kubepods-besteffort.slice/....scope: no such file or directory`. This patch updates the deletion logic to take into account the sandbox_cgroup_only=false option and in this case uses the cgroupfs delete. Fixes: #11036 Signed-off-by: Champ-Goblem <cameron@northflank.com>	2025-05-30 17:51:31 +02:00
Champ-Goblem	f4007e5dc1	agent: increase LimitNOFILE in the systemd service Increase the NOFILE limit in the systemd service, this helps with running databases in the Kata runtime. Signed-off-by: Champ-Goblem <cameron@northflank.com>	2025-05-30 17:49:29 +02:00
Fabiano Fidêncio	3f5dc87284	Merge pull request #11333 from stevenhorsman/csi-driver-permissions-fix workflow: add packages: write to csi-driver publish	2025-05-30 17:45:47 +02:00
Zvonko Kaiser	4586511c01	doc: Add Helm Chart entry Since 3.12 we're shipping the helm-chart per default with each release. Update the documentation to use helm rather than the kata-deploy manifests. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-05-30 14:45:01 +00:00
Aurélien Bombo	c03b38c7e3	ci: Require agent-ctl tests This adds `run-kata-agent-apis` to the list of required tests. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-05-29 14:09:42 -05:00
stevenhorsman	586d9adfe5	workflow: add packages: write to csi-driver publish This one was missed in the earlier PR Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-29 15:57:07 +01:00
Steve Horsman	3da213a8c8	Merge pull request #11326 from kata-containers/top-level-workflow-permissions Top level workflow permissions	2025-05-29 10:03:06 +01:00
stevenhorsman	c34416f53a	workflows: Add explicit permissions where needed We have a number of jobs that either need, or nest workflows that need, gh permissions, such as for pushing to ghcr, or doing attest build provenance. This means they need write permissions on things like `packages`, `id-token` and `attestations`, so we need to set these permissions at the job-level (along with `contents: read`), so they are not restricted by our safe defaults. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-28 19:34:28 +01:00
stevenhorsman	088e97075c	workflow: Add top-level permissions Set: ``` permissions: contents: read ``` as the default top-level permissions explicitly to conform to recommended security practices e.g. https://github.com/ossf/scorecard/blob/main/docs/checks.md#token-permissions	2025-05-28 19:34:28 +01:00
Dan Mihai	353d0822fd	Merge pull request #11314 from katexochen/p/svc-name-regex genpolicy: fix svc_name regex	2025-05-28 10:08:38 -07:00
Steve Horsman	7a9d919e3e	Merge pull request #11322 from kata-containers/workflow-permissions workflows: Add explicit permissions for attestation	2025-05-28 17:28:22 +01:00
Steve Horsman	2667d4a345	Merge pull request #11323 from stevenhorsman/gatekeeper-workflow-permissions-ii workflow: Update gatekeeper permissions	2025-05-28 17:05:24 +01:00
stevenhorsman	4d4fb86d34	workflow: Update gatekeeper permissions I shortsightedly forgot that gatekeeper would need to read more than just the commit content in its python scripts, so add read permissions to actions issues which it uses in its processing Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-28 15:58:27 +01:00
Steve Horsman	fed63e0801	Merge pull request #11319 from stevenhorsman/remove-old-workflows workflows: Delete workflows	2025-05-28 15:38:19 +01:00
Steve Horsman	49f86aaa0d	Merge pull request #11320 from stevenhorsman/gatekeeper-workflow-permissions workflows: gatekeeper: Update permissions	2025-05-28 15:38:06 +01:00
stevenhorsman	3ff602c1e8	workflows: Add explicit permissions for attestation We have a number of jobs that nest the build-static-tarball workflows later on. Due to these doing attest build provenance, and pushing to ghcr.io, they need write permissions on `packages`, `id-token` and `attestations`, so we need to set these permissions on the top-level jobs (along with `contents: read`), so they are not blocked. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-28 12:56:52 +01:00
stevenhorsman	2f0dc2ae24	workflows: gatekeeper: Update permissions Restrict the permissions of gatekeeper flow to read contents only for better security Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-28 09:57:19 +01:00
stevenhorsman	f900b0b776	workflows: Delete workflows Some legacy workflows require write access to github, which is a security weakness, and don't provide much value, so let's remove them. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-28 09:45:42 +01:00
Alex Lyn	aab6caa141	Merge pull request #10362 from Apokleos/vfio-hotplug-runtime-rs runtime-rs: add support hotplugging vfio device for qemu-rs	2025-05-28 13:21:58 +08:00
Fabiano Fidêncio	ac934e001e	Merge pull request #11244 from katexochen/p/guest-pull-config runtime: add option to force guest pull	2025-05-27 16:00:09 +02:00
alex.lyn	e69a4d203a	runtime-rs: Increase QMP read timeout to mitigate failures The original 250ms read timeout frequently causes "Resource Temporarily Unavailable (OS Error 11)" when passing through devices via VFIO in QEMU. The root cause lies in synchronization timeout windows failing to accommodate inherent delays during critical hardware init phases in kernel space. This commit increases the timeout to 5000ms, which was determined through some tests. While not guaranteeing complete resolution for all hardware combinations, this change significantly reduces timeout failures. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-27 21:06:57 +08:00
Paul Meyer	c4815eb3ad	runtime: add option to force guest pull This enables guest pull via config, without the need of any external snapshotter. When the config enables runtime.experimental_force_guest_pull, instead of relying on annotations to select the way to share the root FS, we always use guest pull. Co-authored-by: Markus Rudy <mr@edgeless.systems> Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-05-27 12:42:00 +02:00
Fabiano Fidêncio	d3f81ec337	Merge pull request #11240 from Apokleos/copydir runtime-rs: Propagate k8s configs correctly when sharedfs is disabled	2025-05-27 12:41:21 +02:00
Paul Meyer	8de8b8185e	genpolicy: rename svc_name to svc_name_downward_env Just to be more explicit what this matches. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-05-27 10:13:43 +02:00
Paul Meyer	78eb65bb0b	genpolicy: fix svc_name regex The service name is specified as RFC 1035 label name [1]. The svc_name regex in the genpolicy settings is applied to the downward API env variables created based on the service name. So it tries to match RFC 1035 labels after they are transformed to downward API variable names [2]. So the set of lower case alphanumerics and dashes is transformed to upper case alphanumerics and underscores. The previous regex wrongly permitted use of numbers, but did allow dot and dash, which shouldn't be allowed (dot because it doesn't conform to RFC 1035, dash because it is transformed to underscore). We have to take care not to also try to use the regex in places where we actually want to check for RFC 1035 label instead of the downward API transformed version of it. Further, we should consider using a format like JSON5/JSONC for the policy settings, as these are far from trivial and would highly benefit from proper documentation through comments. [1]: https://kubernetes.io/docs/concepts/services-networking/service/#defining-a-service [2]: `b2dfba4151/pkg/kubelet/envvars/envvars.go (L29-L70)` Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-05-27 08:43:25 +02:00
RuoqingHe	139dc13bdc	Merge pull request #11301 from lifupan/fix_cgroup runtime-rs: fix the issue of delete cgroup failed	2025-05-27 05:05:32 +02:00
Wainer Moschetta	d77e33babf	Merge pull request #11266 from ldoktor/ci-pp-retry ci.ocp: A couple of peer-pods setup improvements	2025-05-26 14:22:11 -03:00
Wainer Moschetta	c249769bb8	Merge pull request #11270 from ldoktor/gk tools.testing: Add methods to simplify gatekeeper development	2025-05-26 12:04:07 -03:00
Fabiano Fidêncio	20d3bc6f37	Merge pull request #10964 from hsiangkao/drop_outdated_patches Drop outdated erofs patches for 6.1.y kernels & fix a dragonball vsock issue	2025-05-26 13:00:25 +02:00
Gao Xiang	b441890749	kernel: drop outdated erofs patches for 6.1.y kernels Patches 0001..0004 have been included upstream as dependencies since Linux 6.1.113. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2025-05-26 15:48:24 +08:00
Xingru Li	71b6acfd7e	dragonball: vsock: support single descriptor Since kernel v6.3 the vsock packet is not split over two descriptors and is instead included in a single one. Therefore, we currently decide the specific method of obtaining BufWrapper based on the length of descriptor. Refer: `a2752fe04f` https://git.kernel.org/torvalds/c/71dc9ec9ac7d Signed-off-by: Xingru Li <lixingru.lxr@linux.alibaba.com> [ Gao Xiang: port this patch from the internal branch to address Linux 6.1.63+. ] Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2025-05-26 15:48:19 +08:00
RuoqingHe	b6cafba5f6	Merge pull request #11308 from hsiangkao/enable_tmpfs_xattr kernel: support `CONFIG_TMPFS_XATTR=y`	2025-05-26 05:00:26 +02:00
Gao Xiang	b681dfb594	kernel: support `CONFIG_TMPFS_XATTR=y` Currently, Kata EROFS support needs it, otherwise it will: [ 0.564610] erofs: (device sda): mounted with root inode @ nid 36. [ 0.564858] overlayfs: failed to set xattr on upper [ 0.564859] overlayfs: ...falling back to index=off,metacopy=off. [ 0.564860] overlayfs: ...falling back to xino=off. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2025-05-24 20:43:35 +08:00
RuoqingHe	a9ffdfc2ae	Merge pull request #11294 from wainersm/delint_confidential_kbs tests/k8s: delint confidential_kbs.sh	2025-05-23 17:00:28 +02:00
Fupan Li	e9b45126fc	Merge pull request #11254 from sampleyang/main runtime-rs: fix vfio pci address domain 0001 problem	2025-05-23 18:13:10 +08:00
yangsong	06c7c5bccb	runtime-rs: fix vfio pci address domain 0001 problem Some NVIDIA GPUs have a PCI address domain of 0001, but the runtime currently assumes 0000:bdf by default, which causes address errors during device initialization and address conflicts during device registration. Fixes #11252 Signed-off-by: yangsong <yunya.ys@antgroup.com>	2025-05-23 14:33:06 +08:00
Wainer dos Santos Moschetta	ddf333feaf	tests/k8s: fix shellcheck SC1091 in confidential_kbs.sh Fixed "note: Not following: ./../../../tools/packaging/guest-image/lib_se.sh: openBinaryFile: does not exist (No such file or directory) [SC1091]" Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-22 15:38:27 -03:00
Wainer dos Santos Moschetta	c9fb0b9c85	tests/k8s: fix shellcheck SC2154 in confidential_kbs.sh Fixed "warning: HKD_PATH is referenced but not assigned. [SC2154]" Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-22 15:02:20 -03:00
Wainer dos Santos Moschetta	68d91d759a	tests/k8s: add `set -e` to confidential_kbs.sh Although the script will inherit that setting from the caller scripts, making it explicit in the file silences shellcheck "warning: Use 'pushd ... || exit' or 'pushd ... || return' in case pushd fails. [SC2164]" Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-22 14:55:24 -03:00
Wainer dos Santos Moschetta	b4adfcb3cb	tests/k8s: apply shellcheck tips to confidential_kbs.sh Addressed the following shellcheck advices: SC2046 (warning): Quote this to prevent word splitting. SC2248 (style): Prefer double quoting even when variables don't contain special characters SC2250 (style): Prefer putting braces around variable references even when not strictly required. SC2292 (style): Prefer [[ ]] over [ ] for tests in Bash/Ksh Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-22 14:52:38 -03:00
alex.lyn	043bab3d3e	runtime-rs: Handle port allocation in PCIe topology for vfio devices It's important to handle port allocation in a PCIe topology before vfio device hotplug via QMP. The code ensures that VFIO devices are properly allocated to available ports (either root ports or switch ports) and updates the device's bus and port information accordingly. It first retrieves the PCIe port type from the topology using pcie_topo.get_pcie_port(). It then searches for an available node in the PCIe topology with RootPort or SwitchPort type and allocates the VFIO device to the found available port. Finally, it updates the device's bus with the allocated port's ID and type. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-22 18:58:41 +08:00
alex.lyn	01b822de16	runtime-rs: Get available port node in the PCIe topology This commit implements the `find_available_node` function, which searches the PCIe topology for the first available `TopologyPortDevice` or `SwitchDownPort`. If no available node is found in either the `pcie_port_devices` or the connected switches' downstream ports, the function returns `None`. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-22 18:58:41 +08:00
alex.lyn	533d07a2c3	runtime-rs: Introduce qemu-rs vfio device hotplug handler This commit notes the current implementation restriction: 'multifunction=on' is temporarily unsupported. While the feature isn't available in the present version, we explicitly acknowledge this limitation and commit to addressing it in future iterations to enhance functional completeness. Tracking issue #11292 has been created to monitor progress towards full multifunction support. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-22 18:58:06 +08:00
Steve Horsman	91f2e97aae	Merge pull request #11267 from Rtoax/p001-fix-osbuilder-lib.sh-indent osbuilder: lib.sh: Fix indent	2025-05-22 09:54:18 +01:00
alex.lyn	f1796fe9ba	runtime-rs: Add more fields in VfioDevice to express vfio devices To support port devices for vfio devices, more fields need to be introduced to help pass port type, bus and other information. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-22 16:00:40 +08:00
Fupan Li	15cbc545ca	runtime-rs: fix the issue of delete cgroup failed When trying to delete a cgroup, all of the tasks/procs in the cgroup need to be moved into the root cgroup before it is deleted. Since cgroup v2 doesn't support moving threads into the root cgroup, moving the processes instead of the tasks fixes this issue. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-05-22 12:15:02 +08:00
Steve Horsman	9356ed59d5	Merge pull request #11130 from wainersm/tests-better-report tests/k8s: better tests reporting for CI	2025-05-21 17:21:35 +01:00
Steve Horsman	b519e9fdff	Merge pull request #11293 from wainersm/tests_increase_kbs_timeout tests/k8s: increase wait time of KBS service ingress	2025-05-21 17:14:52 +01:00
Steve Horsman	a897bce29f	Merge pull request #11298 from stevenhorsman/release-3.17.0-bump release: Bump version to 3.17.0	2025-05-21 12:06:24 +01:00
stevenhorsman	7b90ff3c01	release: Bump version to 3.17.0 Bump VERSION and helm-chart versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-21 12:04:39 +01:00
Fabiano Fidêncio	5378e581d8	Merge pull request #11144 from Apokleos/hotplug-block-qemu-rs Support hot-plug block device in qemu-rs with QMP	2025-05-21 11:31:48 +02:00
Lukáš Doktor	67ee9f3425	ci.ocp: Improve logging of extra new resources This script relies on temporary subscriptions and won't clean up any resources. Let's improve the logging to better describe what resources were created and how to clean them, if the user needs to do so. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-05-21 11:02:36 +02:00
Lukáš Doktor	32dbc5d2a9	ci.ocp: Use SCRIPT_DIR to allow execution from any folder We used hardcoded "ci/openshift-ci/cluster" location which expects this script to be only executed from the root. Let's use SCRIPT_DIR instead to allow execution from elsewhere eg. by user bisecting a failed CI run. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-05-21 10:30:03 +02:00
Lukáš Doktor	0e4fb62bb4	ci.ocp: Retry first az command as login takes time to propagate In CI we hit problem where just after `az login` the first `az network vnet list` command fails due to permission. We see "insufficient permissions" or "pending permissions", suggesting we should retry later. Manual tests and successful runs indicate we do have the permissions, but not immediately after login. Azure docs suggest using extra `az account set` but still the propagation might take some time. Add a loop retrying the first command a few times before declaring failure. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-05-21 10:28:01 +02:00
Fabiano Fidêncio	6c9b199ef1	Merge pull request #11289 from BbolroC/fix-vfio-coldplug runtime: Preserve hotplug devices for vfio-coldplug mode	2025-05-21 09:48:25 +02:00
Wainer dos Santos Moschetta	fdcf11d090	tests/k8s: increase wait time of KBS service ingress kbs_k8s_svc_host() returns the ingress IP when the KBS service is exposed via an ingress. In Azure AKS the ingress can take a while to be fully ready and recently we have noticed on CI that kbs_k8s_svc_host() has returned an empty value. Maybe the problem is the current timeout being too low, so let's increase it to 50 seconds to see if the situation improves. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-20 15:20:08 -03:00
Wainer dos Santos Moschetta	80a816db9d	workflows/run-k8s-tests-coco-nontee: add step to report tests Run `gha-run.sh report-tests` to generate the report of the tests. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-20 14:43:38 -03:00
Wainer dos Santos Moschetta	8c4637d629	tests/k8s: print tests report Added 'report-tests' command to gha-run.sh to print to stdout a report of the tests executed. For example: ``` SUMMARY (2025-02-17-14:43:53): Pass: 0 Fail: 1 STATUSES: not_ok foo.bats OUTPUTS: ::group::foo.bats 1..3 not ok 1 test 1 not ok 2 test 2 ok 3 test 3 1..2 not ok 1 test 1 not ok 2 test 2 ::endgroup:: ``` Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-20 14:43:38 -03:00
Wainer dos Santos Moschetta	5e3b8a019a	tests/k8s: split and save bats outputs in files Currently run_kubernetes_tests.sh sends all the bats outputs to stdout which can be very difficult to browse to find a problem, mainly on CI. With this change, each bats execution has its output sent to 'reports/yyyy-mm-dd-hh:mm:ss/<status>-<bats file>.log' where <status> is either 'ok' (tests passed) or 'not_ok' (some tests failed). Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-20 14:43:38 -03:00
Steve Horsman	f8c5aa6df6	Merge pull request #11259 from fitzthum/bump-gc-0140 Update Trustee and Guest Components for CoCo v0.14.0	2025-05-20 18:05:17 +01:00
Lukáš Doktor	c203d7eba6	ci.ocp: Set peer-pods-azure license We forgot to add the license header when introducing this test. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-05-20 17:03:48 +02:00
Steve Horsman	b4aa1e3fbd	Merge pull request #11279 from skazi0/repo-components osbuilder: ubuntu: Add REPO_COMPONENTS setting	2025-05-20 16:03:48 +01:00
Lukáš Doktor	b97b20295b	ci.ocp: Make peer-pods setup executable set permissions of the peer-pods-azure.sh script to executable Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-05-20 17:03:48 +02:00
Sumedh Alok Sharma	9a4432d197	Merge pull request #11233 from Ankita13-code/ankitapareek/execprocess-additional-input-validation genpolicy: validate input process fields for ExecProcessRequest	2025-05-20 20:11:41 +05:30
Jacek Tomasiak	91fb4353f6	osbuilder: ubuntu: Add REPO_COMPONENTS setting Added variable REPO_COMPONENTS (default: "main") which sets components used by mmdebstrap for rootfs building. This is useful for custom image builders who want to include EXTRA_PKGS from components other than the default "main" (e.g. "universe"). Fixes: #11278 Signed-off-by: Jacek Tomasiak <jtomasiak@arista.com> Signed-off-by: Jacek Tomasiak <jacek.tomasiak@gmail.com>	2025-05-20 14:01:48 +02:00
Fabiano Fidêncio	29099d139b	Merge pull request #11280 from kata-containers/dependabot/cargo/src/tools/kata-ctl/ring-0.17.14 build(deps): bump ring from 0.17.5 to 0.17.14 in /src/tools/kata-ctl	2025-05-20 13:47:22 +02:00
Fabiano Fidêncio	0bc0623037	Merge pull request #11277 from skazi0/repo-url osbuilder: ubuntu: Expose REPO_URL variables	2025-05-20 13:46:01 +02:00
Ankita Pareek	ad75595dc8	genpolicy: Add tests for various input validations for ExecProcessRequest These additional tests cover edge cases specific to- - Terminal validation - Capabilities validation - Working directory (Cwd) validation - NoNewPrivileges validation - User validation - Environment variables validation Signed-off-by: Ankita Pareek <ankitapareek@microsoft.com>	2025-05-20 11:19:55 +00:00
Saul Paredes	1e466bf39c	genpolicy: fix validation of env variables sourced from metadata.namespace Use $(sandbox-namespace) wildcard in case none is specified in yaml. If wildcard is present, compare input against annotation value. Fixes regression introduced in https://github.com/microsoft/kata-containers/pull/273 where samples that use metadata.namespace env var were no longer working. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-05-20 11:19:46 +00:00
Dan Mihai	a113b9eefd	genpolicy: validate probe process fields Validate more process fields for k8s probe commands - e.g., livenessProbe, readinessProbe, etc. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-05-20 11:15:30 +00:00
Dan Mihai	c0b8c6ed5e	genpolicy: validate process for commands from settings Validate more process fields for commands enabled using the ExecProcessRequest "commands" and/or "regex" fields from the settings file. Add function to get the container from state based on container_id matching instead of matching it against every policy container data Signed-off-by: Dan Mihai <dmihai@microsoft.com> Signed-off-by: Ankita Pareek <ankitapareek@microsoft.com>	2025-05-20 11:15:30 +00:00
Dan Mihai	6f78aaa411	genpolicy: use process inputs for allow_process() Using process data inputs for allow_process() is easier to read/understand compared with the older OCI data inputs. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-05-20 11:15:30 +00:00
Steve Horsman	2871c31162	Merge pull request #11273 from mythi/tdx-qemu-params config: update QEMU TDX configuration	2025-05-20 10:22:59 +01:00
Steve Horsman	4b317dddfa	Merge pull request #11271 from stevenhorsman/gatekeeper-truncate-names ci: gatekeeper: Require names update	2025-05-20 10:20:05 +01:00
alex.lyn	4b27ca9233	runtime-rs: Implement volume copy allowlist check For security reasons, we have restricted directory copying. Introduces the `is_allowlisted_copy_volume` function to verify if a given volume path is present in an allowed copy directory. This enhances security by ensuring only permitted volumes are copied. Currently, only directories under the path `/var/lib/kubelet/pods/<uid>/volumes/{kubernetes.io~configmap, kubernetes.io~secret, kubernetes.io~downward-api, kubernetes.io~projected}` are allowed to be copied into the guest. Copying of other directories will be prohibited. Fixes #11237 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:57:10 +08:00
alex.lyn	8910bddce8	kata-types: Introduce k8s special volumes for projected and downward-api Fixes #11237 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:55:49 +08:00
alex.lyn	6fa409df1a	kata-agent: Improve file sync handling and address symlink issues When synchronizing file changes on the host, a "symlink AlreadyExists" issue occurs, primarily due to improper handling of symbolic links (symlinks). Additionally, there are other related problems. This patch will try to address these problems. (1) Handle symlink target existence (files, dirs, symlinks) during host file sync. Use appropriate removal methods (unlink, remove_file, remove_dir_all). (2) Enhance temporary file handling for safer operations and implement truncate only at offset 0 for resume support. (3) Set permissions and ownership for parent directories. (4) Check and clean target path for regular files before rename. Fixes #11237 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:55:49 +08:00
alex.lyn	654e6db91f	runtime-rs: Add inotify-based real-time directory synchronization Introduce an event-driven file sync mechanism between host and guest when sharedfs is disabled, which monitors the host path and syncs file changes promptly: 1. Introduce FsWatcher to monitor directory changes via inotify; 2. Support recursive watching with configurable filters; 3. Add debounce logic (default 500ms cooldown) to handle burst events; 4. Trigger `copy_dir_recursively` on stable state; 5. Handle CREATE/MODIFY/DELETE/MOVED/CLOSE_WRITE events; Fixes #11237 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:55:49 +08:00
alex.lyn	79b832b2f5	runtime-rs: Propagate k8s configs correctly when sharedfs is disabled In Kubernetes (k8s), while Kata Pods often use virtiofs for injecting Service Accounts, Secrets, and ConfigMaps, security-sensitive environments like CoCo disable host-guest sharing. Consequently, when SharedFs is disabled, we propagate these configurations into the guest via file copy and bind mount for correct container access. Fixes #11237 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:55:49 +08:00
alex.lyn	8da7cd1611	runtime-rs: Impl recursive directory copy with metadata preservation Add async directory traversal using BFS algorithm: (1) Support file type handling: Regular files (S_IFREG) with content streaming; Directories (S_IFDIR) with mode preservation; Symbolic links (S_IFLNK) with target recreation; (2) Maintain POSIX metadata: UID/GID preservation, file mode bits, and directory permissions; (3) Implement async I/O operations for: Directory enumeration, file reading, symlink target resolution Fixes #11237 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:55:49 +08:00
alex.lyn	378d04bdf0	runtime-rs: Add hotplug block device type with QMP There are several cases where block devices play very important roles: 1. Direct Volume: In Kata cases, to achieve high-performance I/O, raw files on the host are typically passed directly to the Guest via virtio-blk, and then bond/mounted within the Guest for container usage. 2. Trusted Storage: In CoCo scenarios, particularly in Guest image pull mode, images are typically pulled directly from the registry within the Guest. However, due to constrained memory resources (prioritized for containers), CoCo leverages externally attached encrypted storage to store images, requiring hot-plug capability for block devices. As other VMMs, like dragonball and cloud-hypervisor in runtime-rs or qemu in kata-runtime, already support such capabilities, we need to support block devices with the hot-plug method (QMP) in qemu-rs. Let's do it. Fixes #11143 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:46:54 +08:00
alex.lyn	2405301e2e	runtime-rs: Support hotplugging block device via QMP This commit introduces block device hotplugging capability using QMP commands. The implementation enables attaching raw block devices to a running VM through the following steps: 1. Block Device Configuration: uses the `blockdev-add` QMP command to define a raw block backend with (1) Direct I/O mode (2) Configurable read-only flag (3) Host file/block device path (`/path/to/block`) 2. PCI Device Attachment: attaches the block device via the `device_add` QMP command as a `virtio-blk-pci` device: (1) Dynamically allocates PCI slots using `find_free_slot()` (2) Binds to user-specified PCIe bus (e.g., `pcie.1`) (3) Returns PCI path for further management Fixes #11143 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:46:54 +08:00
alex.lyn	80bd71bfcc	runtime-rs: Iterates through PCI devices to find a match with qdev_id The get_pci_path_by_qdev_id function is designed to search for a PCI device within a given list of devices based on a specified qdev_id. It tracks the device's path in the PCI topology by recording the slot values of the devices traversed during the search. If the device is located behind a PCI bridge, the function recursively explores the bridge's device list to find the target device. The function returns the matching device along with its updated path if found, otherwise, it returns None. Fixes #11143 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:46:54 +08:00
Fupan Li	9a03815f18	Merge pull request #11095 from lifupan/ephemeral_volume runtime-rs: add the ephemeral memory based volume support	2025-05-20 09:18:34 +08:00
RuoqingHe	5b5c71510e	Merge pull request #11093 from kimullaa/fix-err-when-containerd-conf-does-not-exist kata-deploy: fix bug when config does not exist	2025-05-19 18:12:50 +02:00
Steve Horsman	cfdccaacb3	Merge pull request #11283 from Rtoax/p002-fix-typo config: Fix typos	2025-05-19 14:59:37 +01:00
RuoqingHe	93b44f920c	Merge pull request #11287 from bpradipt/remote-hyp-logging runtime: Fix logging for remote hypervisor	2025-05-19 15:49:15 +02:00
Shunsuke Kimura	9a8d64d6b1	kata-deploy: execute in the host environment `containerd` command should be executed in the host environment. (To generate the config that matches the host's containerd version.) Fixes: #11092 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-05-19 21:42:21 +09:00
Shunsuke Kimura	d3edc90d80	kata-deploy: Fix condition always true If config.toml does not exist, `[ -x $(command -v containerd) ]` will always be true (because it is not enclosed in ""). ``` // current code $ [ -x $(command -v containerd_notfound) ] $ echo $? 0 // maybe expected code $ [ -x "$(command -v containerd_notfound)" ] $ echo $? 1 ``` Fixes: #11092 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-05-19 21:42:21 +09:00
Hyounggyu Choi	2fd2cd4a9b	runtime: Preserve hotplug devices for vfio-coldplug mode Fixes: #11288 This commit appends hotplug devices (e.g., persistent volume) to deviceInfos when `vfio_mode` is `vfio` and `cold_plug_vfio` is set to a value other than `no-port`. For details, please visit the issue. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-05-19 13:46:49 +02:00
Pradipta Banerjee	9f9841492e	runtime: Fix logging for remote hypervisor Need to use hvLogger Fixes: #11286 Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>	2025-05-19 07:01:59 -04:00
Jacek Tomasiak	da6860a632	osbuilder: ubuntu: Expose REPO_URL variables This exposes REPO_URL and adds REPO_URL_X86_64 which can be set to use custom Ubuntu repo for building rootfs. If only one architecture is built, REPO_URL can be set. Otherwise, REPO_URL_X86_64 is used for x86_64 arch and REPO_URL for others. Fixes: #11276 Signed-off-by: Jacek Tomasiak <jtomasiak@arista.com> Signed-off-by: Jacek Tomasiak <jacek.tomasiak@gmail.com>	2025-05-19 12:41:49 +02:00
Rong Tao	914730d948	config: Fix typos devie should be device Signed-off-by: Rong Tao <rongtao@cestc.cn>	2025-05-19 14:19:22 +08:00
Alex Lyn	305a5f5e41	Merge pull request #10578 from Apokleos/pcie-port-devices runtime-rs: Introduce PCIe Port devices in runtime-rs for qemu-rs	2025-05-18 21:10:25 +08:00
Dan Mihai	b9651eadab	Merge pull request #11214 from microsoft/cameronbaird/address-gid-mismatch-additionalgids genpolicy: Enable AdditionalGids checks in rules.rego	2025-05-16 10:15:53 -07:00
dependabot[bot]	a2c7e48e0e	build(deps): bump ring from 0.17.5 to 0.17.14 in /src/tools/kata-ctl Bumps [ring](https://github.com/briansmith/ring) from 0.17.5 to 0.17.14. - [Changelog](https://github.com/briansmith/ring/blob/main/RELEASES.md) - [Commits](https://github.com/briansmith/ring/commits) --- updated-dependencies: - dependency-name: ring dependency-version: 0.17.14 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-05-16 14:51:20 +00:00
Fabiano Fidêncio	9e11b2e577	Merge pull request #11274 from fidencio/topic/arm-ci-k8s-enable-hotplug-tests ci: k8s: arm: Enable skipped tests	2025-05-16 13:19:18 +02:00
Fabiano Fidêncio	219d6e8ea6	Merge pull request #11257 from mythi/coco-guest-hardening confidential guest kernel hardening changes	2025-05-16 08:52:36 +02:00
Fabiano Fidêncio	86d2d96d4a	ci: k8s: arm: Enable skipped tests Now that memory hotplug should work, as we're using a firmware that supports that, let's re-enable the tests that rely on hotplug. Fixes: #10926, #10927 Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-05-16 03:02:32 +02:00
Fabiano Fidêncio	02ce395a69	Merge pull request #11272 from seungukshin/enable-edk2-for-arm64 Enable edk2 for arm64	2025-05-15 20:59:56 +02:00
Cameron Baird	7bba7374ec	genpolicy: Add retries to policy generation As the genpolicy from_files call makes network requests to container registries, it has a chance to fail. Harden us against flakes due to network by introducing a 6x retry loop in genpolicy tests. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-05-15 18:12:50 +00:00
Steve Horsman	d21d2a0657	Merge pull request #11265 from chathuryaadapa/bumpalo-crate-bump Bump: libz-sys crate to address CVE	2025-05-15 16:18:00 +01:00
Mikko Ylinen	ff851202e6	config: update QEMU TDX configuration Drop '-vmx-rdseed-exit' from '-cpu host' QEMU options. The history of it is unknown but it's likely related to early TDX enablement. TD pods start up fine without it (tested by manually editing the configuration file) and it's also not used elsewhere. Keep TDXCPUFEATURES for now in case a need for it shows up later. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-05-15 15:43:24 +03:00
Fabiano Fidêncio	676e66ae49	Merge pull request #11246 from skazi0/mmdebstrap osbuilder: ubuntu: Switch from multistrap to mmdebstrap	2025-05-15 14:15:37 +02:00
alex.lyn	07533522b8	runtime-rs: Handle PortDevice devices when invoke start_vm with Qemu Extract PortDevice relevant information, and then invoke different processing methods based on the device type. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	c109328097	runtime-rs: Introduce pcie root port and switch port in qemu-rs cmdline. Some data structures and methods are introduced to help handle vfio devices. The methods add_pcie_root_ports and add_pcie_switch_ports follow runtime's related implementations of vfio devices. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	47c7ba8672	runtime-rs: Prepare pcie port devices before start sandbox Prepare pcie port devices before starting VM with the help of device manager and PCIe Topology. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	d435712ccb	runtime-rs: Introduce PortDevice in resource manager in sandbox A new resource type `PortDevice` is introduced which is dedicated for handling root ports/switch ports during sandbox creation(VM). Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	1d670bb46c	runtime-rs: handle useless Device match arms in dragonball vmm case Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	f08fdd25d8	runtime-rs: Introduce device type of PortDevice in device manager PortDevice is for handling root ports or switch ports in PCIe Topology. It will make it easy to pass the root ports/switch ports information when creating a VM that requires PCIe devices. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	694a849eaa	runtime-rs: Add PCIe topology mgmt for Root Port and Switch Port This commit introduces an implementation for managing PCIe topologies, focusing on the relationship between Root Ports and Switch Ports. The design supports two strategies for generating Switch Ports. Taking a requirement of 4 switch ports as an example, the two options are: (1) Single Root Port + Single PCIe Switch: Uses 1 Root Port and 1 Switch with 4 Downstream Ports. (2) Multiple Root Ports + Multiple PCIe Switches: Uses 2 Root Ports and 2 Switches, each with 2 Downstream Ports. The recommended strategy is Option 1 due to its simplicity, efficiency, and scalability. The implementation includes data structures (PcieTopology, RootPort, PcieSwitch, SwitchPort) and operations (add_pcie_root_port, add_switch_to_root_port, add_switch_port_to_switch) to manage the topology effectively. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	2f5ee0ec6d	kata-types: Support switch port config via annotation and configuration Support setting switch ports with annotation or configuration.toml Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	a42d16a6a4	kata-types: Introduce pcie_switch_port in configuration (1) Introduce new field `pcie_switch_port` for switch ports. (2) Add related checking logics in vmms(dragonball, qemu) Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
Fabiano Fidêncio	af3c601a92	Merge pull request #11258 from fidencio/topic/second-try-fix-multi-install-prefix kata-deploy: Avoid changing any component path in case of restart	2025-05-15 11:21:15 +02:00
Seunguk Shin	560e718979	runtime: Add edk2 to configuration-qemu.toml for arm64 The edk2 is required for memory hot plug on qemu for arm64. This adds the edk2 to configuration-qemu.toml for arm64. Signed-off-by: Seunguk Shin <seunguk.shin@arm.com> Reviewed-by: Nick Connolly <nick.connolly@arm.com>	2025-05-15 10:12:31 +01:00
Seunguk Shin	5cabce1a25	packaging: Build edk2 for arm64 The edk2 is required for memory hot plug on qemu for arm64. This adds the edk2 to static tarball for arm64. Signed-off-by: Seunguk Shin <seunguk.shin@arm.com> Reviewed-by: Nick Connolly <nick.connolly@arm.com>	2025-05-15 10:12:24 +01:00
stevenhorsman	c09291a9c7	ci: gatekeeper: Require names update The github rest api truncates job names that are >100 characters (which doesn't seem to be documented). There doesn't seem to be a way to easily make gatekeeper handle this automatically, so let's update the required-tests to expect the truncated job names Fixes: #11176 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-15 10:07:41 +01:00
Steve Horsman	95e5e0ec49	Merge pull request #11264 from fidencio/topic/helm-to-ci helm: release: Publish our helm charts to the OCI registries	2025-05-15 09:47:33 +01:00
Lukáš Doktor	9f8c8ea851	tools.testing: Add way to re-play recorded queries in gatekeeper to simplify gatekeeper development add support for DEBUG_INPUT which can be used to report content from files gathered in DEBUG run. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-05-15 10:32:10 +02:00
Lukáš Doktor	1a15990ee1	tools.testing: Add DEBUG support for gatekeeper to avoid manual curling to analyze GK issues let's add a way to dump all GK requests in a directory when the user specifies "DEBUG" env variable. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-05-15 10:32:10 +02:00
Fabiano Fidêncio	71e8c1b4f0	helm: release: Publish our helm charts to the OCI registries Let's take advantage of the fact that helm can use an OCI registry to store charts, and upload our charts to the OCI registries we've been using so far. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-05-14 20:20:35 +02:00
RuoqingHe	393cc61153	Merge pull request #11241 from kata-containers/dependabot/cargo/src/tools/agent-ctl/ring-0.17.14 build(deps): bump ring from 0.17.8 to 0.17.14 in /src/tools/agent-ctl	2025-05-14 16:20:33 +02:00
Adapa Chathurya	3d284d3b4e	versions: Bump libz-sys version Bump libz-sys version to update and remediate CVE-2025-1744. Signed-off-by: Adapa Chathurya <adapa.chathurya1@ibm.com>	2025-05-14 19:48:10 +05:30
Fabiano Fidêncio	82928d1480	kata-deploy: Avoid changing any component path in case of restart The previous attempt to fix this issue only took into consideration the QEMU binary, as I completely forgot that there were other pieces of the config that we also adjusted. Now, let's just check one of the configs before trying to adjust anything else, and only do the changes if the multi-install suffix is not yet added. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-05-14 15:41:13 +02:00
Jacek Tomasiak	e20fb377fc	osbuilder: ubuntu: Switch from multistrap to mmdebstrap Multistrap requires usrmerge package which was dropped in Ubuntu 24.04 (Noble). Based on details from [0], the rootfs build process was switched to mmdebstrap. Some additional minor tweaks were needed around chrony as the version from Noble has very strict systemd sandboxing configured and it doesn't work with readonly root by default. [0] https://lists.debian.org/debian-dpkg/2023/05/msg00080.html Fixes: #11245 Signed-off-by: Jacek Tomasiak <jtomasiak@arista.com> Signed-off-by: Jacek Tomasiak <jacek.tomasiak@gmail.com>	2025-05-14 11:46:19 +02:00
Steve Horsman	711fcd8f51	Merge pull request #11251 from stevenhorsman/rust-vulns-9th-may-2025 Rust vulns 9th may 2025	2025-05-14 09:58:12 +01:00
Tobin Feldman-Fitzthum	be708f410e	tests: fixup error assert in pull image test Guest components is now less verbose with its error messages. This will be fixed after the release but for now switch to a more generic error message that is still found in the logs. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-13 20:17:02 -05:00
Tobin Feldman-Fitzthum	806abeefb9	tests: fixup error asserts in init-data test Guest components is less verbose with its error message now. This will be fixed after the release, but for now, update the tests with the new more general message. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-13 20:16:50 -05:00
Tobin Feldman-Fitzthum	e2e503eb33	tests: fixup error string for signature tests Guest components is less verbose with its error messages. This will be fixed after the release, but for now let's replace this with a more generic message. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-13 16:54:06 -05:00
Cameron Baird	090497f520	genpolicy: Add test cases for fsGroup and supplementalGroup fields Fix up genpolicy test inputs to include required additionalGids Include a test for the pod_container container in security_context tests as these containers follow slightly different paths in containerd. Introduce a test for fsGroup/supplementalGroups fields in the security context. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-05-13 21:48:58 +00:00
Cameron Baird	19d502de76	ci: Add test cases for fsGroup and supplementalGroup fields Introduce new test case to the security context bats file which verifies that policy works properly for a deployment yaml containing fsGroup and supplementalGroup configuration. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-05-13 21:48:58 +00:00
Cameron Baird	d3cd1af593	genpolicy: Enable AdditionalGids checks in rules.rego With added support for parsing these fields in genpolicy, we can now enable policy verification of AdditionalGids. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-05-13 21:48:58 +00:00
Tobin Feldman-Fitzthum	ef98f39b6d	tests: update error message for authenticated guest pull Some changes in guest components have obscured the error message that we show when we fail to get the credentials for an authenticated image. The new error message is a little bit misleading since it references decrypting an image. This will be updated in a future release, but for now look for this message. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-13 16:46:32 -05:00
Cameron Baird	29ee46c186	genpolicy: Handle PodSecurityContext.fsGroup|supplementalGroups Policy enforcement for additionalGids, a list of groups applied to the first process run in each container. Manifests in OCI struct as additionalGids: Consists of container's GID, fsGroup, and supplementalGroups. https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#PodSecurityContext-v1-core Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-05-13 21:44:51 +00:00
Tobin Feldman-Fitzthum	e10aa4e49c	tests: update error message for encrypted image test Guest components prints out a different error when failing to decrypt an image. Update the test to look for this new error. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-13 12:33:37 -05:00
RuoqingHe	cd4c3e89e1	Merge pull request #11243 from kata-containers/dependabot/go_modules/src/runtime/github.com/opencontainers/runc-1.2.0 build(deps): bump github.com/opencontainers/runc from 1.1.12 to 1.2.0 in /src/runtime	2025-05-13 17:02:35 +02:00
RuoqingHe	268197957d	Merge pull request #11253 from stevenhorsman/golang.org/x/oauth2v0.27.0-bump versions: Bump golang.org/x/oauth2	2025-05-13 15:03:24 +02:00
stevenhorsman	b3825829d8	versions: Bump golang.org/x/oauth2 Update module to remediate [CVE-2025-22868](https://www.cve.org/CVERecord?id=CVE-2025-22868) Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-13 11:00:35 +01:00
Rong Tao	37a16c19d1	osbuilder: lib.sh: Fix indent Replace 4 spaces to [tab]. Signed-off-by: Rong Tao <rongtao@cestc.cn>	2025-05-13 16:56:54 +08:00
Steve Horsman	299fb3b77b	Merge pull request #11255 from stevenhorsman/skip-docker-tests ci: gatekeeper: skip docker tests	2025-05-13 09:18:09 +01:00
Zvonko Kaiser	842ec6a32e	Merge pull request #11262 from BbolroC/add-vfio-config-for-sel-runtime runtime/config: Add VFIO config for IBM SEL	2025-05-12 10:59:09 -04:00
Zvonko Kaiser	5cc098ae43	Merge pull request #11242 from houstar/qing/safe-path agent: use safe-path to replace secure_join	2025-05-12 10:58:19 -04:00
Mikko Ylinen	ab29c8c979	runtime: do not add virtio-rng-pci device for confidential guests Adding: "-object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0" for confidential guests is not necessary as the RNG source cannot be trusted and the guest kernel has the driver already disabled as well. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-05-12 17:14:51 +03:00
Mikko Ylinen	a44dfb8d37	versions: bump LTS kernel 6.12.28 has been released, let's bump to it. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-05-12 17:14:51 +03:00
Mikko Ylinen	eb326477fc	kernel: disable virtio RNG for confidential guests Linux CoCo x86 guest is hardened to ensure RDRAND provides enough entropy to initialize Linux RNG. A failure will panic the guest. For confidential guests any other RNG source is untrusted so disable them. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-05-12 17:12:44 +03:00
Hyounggyu Choi	4fac1293bd	runtime/config: Add VFIO config for IBM SEL With #11076 merged, a VFIO configuration is needed in the runtime when IBM SEL is involved (e.g., qemu-se or qemu-se-runtime-rs). For the Go runtime, we already have a nightly test (e.g., https://github.com/kata-containers/kata-containers/actions/runs/14964175872/job/42031097043) in which this change has been applied. For the Rust runtime, the feature has not yet been migrated. Thus, this change serves as a placeholder and a reminder for future implementation. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-05-12 14:58:29 +02:00
Qingyuan Hou	c0ceaf661a	agent: use safe-path to replace secure_join This patch uses the safe-path library to safely handle filesystem paths. Signed-off-by: Qingyuan Hou <qingyuan.hou@linux.alibaba.com>	2025-05-12 09:06:55 +00:00
Tobin Feldman-Fitzthum	de6f4ae99c	versions: update Trustee version for CoCo v0.14.0 This hash will be tagged as Trustee v0.13.0 after the CoCo release is finished. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-09 13:40:28 -05:00
Tobin Feldman-Fitzthum	f9a9967e21	versions: update guest-components for CoCo v0.14.0 Pick up changes to guest components. This hash is right before the changes to GC to support image pull via the CDH. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-09 13:40:28 -05:00
Tobin Feldman-Fitzthum	d714eb2472	agent: update image-rs for CoCo v0.14.0 We might be able to eliminate this dependency soon, but for now let's update image-rs. I massaged the dependencies with: cargo update idna_adapter@1.2.1 --precise 1.2.0 cargo update litemap@0.7.5 --precise 0.7.4 cargo update zerofrom@0.1.6 --precise 0.1.5 cargo update astral-tokio-tar@0.5.2 --precise 0.5.1 cargo update base64ct@1.7.3 --precise 1.6.0 cargo update generic-array@1.2.0 --precise 1.1.1 Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-09 13:39:52 -05:00
stevenhorsman	35ed3a2a3a	versions: Bump bumpalo version Bump bumpalo version to remediate RUSTSEC-2022-0078 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-09 16:09:22 +01:00
stevenhorsman	fcc60b514b	versions: Bump hyper version Bump hyper version to update and remediate CVE-2023-26964 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-09 16:09:22 +01:00
stevenhorsman	7807e6c29a	versions: Bump byte-unit and rust_decimal Bump the crates to update them and pull in a newer version of borsh to remediate RUSTSEC-2023-0033 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-09 16:09:22 +01:00
Mikko Ylinen	96d922fc27	kernel: disable virtio MMIO for confidential guests As the comment in the fragment suggests, this is for the firecracker builds and not relevant for confidential guests, for example. Exclude mmio.conf fragment by adding the new !confidential tag to drop virtio MMIO transport for the confidential guest kernel (as virtio PCI is enough for the use cases today). Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-05-09 17:53:22 +03:00
Mikko Ylinen	31d6839eb5	tools: let confidential guest kernel builds exclude fragments build-kernel.sh supports excluding fragments from the common base set based on the kernel target architecture. However, there are also cases where the base set must be stripped down for other reasons. For example, confidential guest builds want to exclude some drivers for devices the untrusted host may try to add (e.g., virtio-rng). Make build-kernel.sh skip fragments tagged using '!confidential' when confidential guest kernels are built. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-05-09 17:53:22 +03:00
Zvonko Kaiser	78ff72a386	Merge pull request #11199 from fidencio/topic/kata-deploy-fix-multiInstallSufix-behaviour-during-restarts helm: Avoid appending the multiInstallSuffix several times	2025-05-09 10:32:23 -04:00
Zvonko Kaiser	26a3cb4fd1	Merge pull request #11250 from stevenhorsman/tempfile-3.19.1-bump versions: Update tempfile crate	2025-05-09 09:51:49 -04:00
stevenhorsman	a09a76a4f5	ci: gatekeeper: skip docker tests It looks like the 22.04 image got updated and broke the docker tests (see #11247), so make these un-required until we can get a resolution Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-09 13:57:23 +01:00
Markus Rudy	835f59df2f	Merge pull request #10986 from 3u13r/euler/feat/genpolicy/env-from-secret genpolicy: support secrets to be referenced for pod envs	2025-05-09 13:29:35 +02:00
stevenhorsman	787198f8bb	versions: Update tempfile crate Update the tempfile crate to resolve security issue [WS-2023-0045](`7247a8b6ee`) that came with the remove_dir_all dependency in prior versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-09 09:57:28 +01:00
Leonard Cohnen	b23ff6fc68	genpolicy: refactor policy test workdir setup This aligns the workdir preparation more closely with the workdir preparation for the generate integration test. Most notably, we clean up the temporary directory before we execute the tests in it. This way we better isolate different runs. Signed-off-by: Leonard Cohnen <leonard.cohnen@gmail.com> Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-05-09 09:45:28 +02:00
Leonard Cohnen	bad0cd0003	genpolicy: add cli integration tests Add a new type of integration test to genpolicy. Now we can test flag handling and how the CLI behaves with certain yaml inputs. The first tests cover the case when a Pod references a Kubernetes secret or config map in another file. Those need to be explicitly added via the --config-files flag. In the future we can easily add test suites that verify that all yaml fields of all resources are understood by genpolicy. Signed-off-by: Leonard Cohnen <leonard.cohnen@gmail.com>	2025-05-09 09:45:28 +02:00
Leonard Cohnen	61ee330029	genpolicy: move policy enforcement integration test to separate folder In preparation for adding more types of integration tests, move the policy enforcement test into a separate folder. Signed-off-by: Leonard Cohnen <leonard.cohnen@gmail.com> Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-05-09 09:45:28 +02:00
Leonard Cohnen	2ea57aefbc	genpolicy: remove unused function Remove function that became unused in the last commit. Signed-off-by: Leonard Cohnen <leonard.cohnen@gmail.com>	2025-05-09 09:41:43 +02:00
Aurélien Bombo	4bb441965f	genpolicy: support arbitrary resources with -c This allows passing config maps and secrets (as well as any other resource kinds relevant in the future) using the -c flag. Fixes: #10033 Co-authored-by: Leonard Cohnen <leonard.cohnen@gmail.com> Signed-off-by: Leonard Cohnen <leonard.cohnen@gmail.com> Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-05-09 09:41:43 +02:00
Hyounggyu Choi	a286a5aee8	Merge pull request #11076 from Jakob-Naucke/ap-bind-assoc Bind/associate for VFIO-AP	2025-05-09 09:32:46 +02:00
Saul Paredes	1e09dfb0df	Merge pull request #11127 from microsoft/archana1/mount-tc genpolicy: improve validation for mounts	2025-05-08 15:41:23 -07:00
stevenhorsman	17843e50bb	runtime: Switch userns packages Switch imports to resolve: ``` SA1019: "github.com/opencontainers/runc/libcontainer/userns" is deprecated: use github.com/moby/sys/userns ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-08 11:04:11 +01:00
dependabot[bot]	2c80a3edce	build(deps): bump github.com/opencontainers/runc in /src/runtime Bumps [github.com/opencontainers/runc](https://github.com/opencontainers/runc) from 1.1.12 to 1.2.0. - [Release notes](https://github.com/opencontainers/runc/releases) - [Changelog](https://github.com/opencontainers/runc/blob/main/CHANGELOG.md) - [Commits](https://github.com/opencontainers/runc/compare/v1.1.12...v1.2.0) --- updated-dependencies: - dependency-name: github.com/opencontainers/runc dependency-version: 1.2.0 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-08 11:02:16 +01:00
Steve Horsman	e3e0007bf7	Merge pull request #11141 from stevenhorsman/k8s-cpu-ns-exec-retry tests: k8s: Retry output of kubectl exec in k8s-cpu-ns	2025-05-07 17:11:25 +01:00
Fabiano Fidêncio	f981e8a904	Merge pull request #10833 from stevenhorsman/crio-annotations-update Crio annotations update	2025-05-07 16:05:24 +02:00
dependabot[bot]	96885a8449	build(deps): bump ring from 0.17.8 to 0.17.14 in /src/tools/agent-ctl Bumps [ring](https://github.com/briansmith/ring) from 0.17.8 to 0.17.14. - [Changelog](https://github.com/briansmith/ring/blob/main/RELEASES.md) - [Commits](https://github.com/briansmith/ring/commits) --- updated-dependencies: - dependency-name: ring dependency-version: 0.17.14 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-05-07 12:18:56 +00:00
RuoqingHe	be75391953	Merge pull request #11235 from kata-containers/dependabot/cargo/src/tools/kata-ctl/openssl-0.10.72 build(deps): bump openssl from 0.10.60 to 0.10.72 in /src/tools/kata-ctl	2025-05-07 20:17:42 +08:00
RuoqingHe	d4d737a73e	Merge pull request #10512 from ncppd/riscv64-agent agent: Support RISC-V 64-bit architecture	2025-05-07 10:56:10 +08:00
RuoqingHe	7bdfea0041	Merge pull request #11123 from kimullaa/add-path-for-kata-deploy runtime: Add Path for kata-deploy	2025-05-07 00:25:12 +08:00
RuoqingHe	b5e45601f6	Merge pull request #11116 from kimullaa/more-robust-script-path-resolution kata-debug: Make path resolution more robust	2025-05-07 00:19:04 +08:00
stevenhorsman	5472662b33	runtime: Fix Incorrect conversion between integer types Fix the high severity codeql issue by checking the value is in bounds before converting Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-06 15:18:37 +01:00
stevenhorsman	4de79b9821	runtime: Ignoring deprecated warning. In the latest oci-spec, the prestart hook is deprecated. However, the docker & nerdctl tests failed when I switched to one of the newer hooks which don't run at quite the same time, so ignore the deprecation warnings for now to unblock the security fix Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-06 15:18:37 +01:00
stevenhorsman	37dda6060c	runtime: Re-vendor Re-run `make vendor` after the podman -> crio annotations change Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-06 15:18:37 +01:00
stevenhorsman	3740ce6e7b	runtime: Update crio annotations We've been using the github.com/containers/podman/v4/pkg/annotations module to get cri-o annotations, which has some major CVEs in, but in v5 most of the annotations were moved into crio (from 1.30) (see https://github.com/cri-o/cri-o/pull/7867). Let's switch to use the cri-o annotations module instead and remediate CVE-2024-3056. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-06 15:18:37 +01:00
dependabot[bot]	70b481e1ee	build(deps): bump openssl from 0.10.60 to 0.10.72 in /src/tools/kata-ctl Bumps [openssl](https://github.com/sfackler/rust-openssl) from 0.10.60 to 0.10.72. - [Release notes](https://github.com/sfackler/rust-openssl/releases) - [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.60...openssl-v0.10.72) --- updated-dependencies: - dependency-name: openssl dependency-version: 0.10.72 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-05-06 13:56:33 +00:00
RuoqingHe	4f97e5fed3	Merge pull request #11226 from kata-containers/dependabot/cargo/src/agent/tokio-1.44.2 build(deps): bump tokio from 1.44.0 to 1.44.2	2025-05-06 21:55:18 +08:00
Fabiano Fidêncio	78bf9d7500	Merge pull request #11232 from lifupan/mtu runtime: add the mtu support for updating routes	2025-05-06 15:55:04 +02:00
Shunsuke Kimura	7177ab3827	runtime: execute using abs path Fixes: #11123 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-05-06 21:17:06 +09:00
Shunsuke Kimura	ddccbd4764	runtime: Add Path for kata-deploy When installing with kata-deploy, `/opt/kata/bin` is usually not in the PATH, so execution fails. Add it to the PATH. Fixes: #11122 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-05-06 21:17:06 +09:00
Shunsuke Kimura	5c156a24e8	kata-debug: Make path resolution more robust Enabled to run from other scripts as source, etc. Fixes: #11115 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-05-06 21:16:25 +09:00
stevenhorsman	6030a64f0c	build(deps): bump tokio to 1.44.2 Bumps [tokio](https://github.com/tokio-rs/tokio) to 1.44.2 in all components to resolve the security vulnerability throughout our repo Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-06 11:38:52 +01:00
RuoqingHe	89685c0cd0	Merge pull request #11225 from kata-containers/dependabot/cargo/src/dragonball/openssl-0.10.72 build(deps): bump openssl from 0.10.57 to 0.10.72	2025-05-06 18:27:45 +08:00
Fabiano Fidêncio	fb5f3eae3b	Merge pull request #11172 from ChengyuZhu6/erofs EROFS Snapshotter Support in Kata	2025-05-06 11:14:19 +02:00
Ruoqing He	384d335419	ci: Enable build-check for agent on riscv64 Enable build-check for `agent` component for riscv64 platforms. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-05-06 01:48:37 +00:00
Ruoqing He	7f9b2c0af1	ci: Enable `install_libseccomp.sh` for riscv64 `musl` target is not yet available for riscv64 as of 1.80.0 rust toolchain, set `FORTIFY_SOURCE` to 1 on riscv64 platforms. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-05-06 01:48:37 +00:00
Nikos Ch. Papadopoulos	0f2c0d38f5	agent: Create pci_root_bus_path for riscv64 `create_pci_root_bus_path` needs to be enabled on riscv64 for agent to compile and work on those platforms. Signed-off-by: Nikos Ch. Papadopoulos <ncpapad@cslab.ece.ntua.gr>	2025-05-06 01:48:37 +00:00
Fupan Li	29f9015caf	runtime-rs: rm the obsoleted ephemeral volume processing Since the ephemeral volume already has a separate volume type for processing, the processing in the virtiofs share volume can be deleted. Moreover, it is not appropriate to process the ephemeral in the share fs. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-05-06 09:45:35 +08:00
Fupan Li	6e5f3cbbeb	runtime-rs: add the ephemeral memory based volume support For k8s, there are two types of volumes based on ephemeral memory: one is the emptydir volume based on ephemeral memory, and the other is used for shm devices such as /dev/shm. Thus add a new volume type, ephemeral volume, to support these two types of volumes and remove the legacy shm volume. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-05-06 09:45:24 +08:00
ChengyuZhu6	d07b279bf1	agent:storage: Add directory creation support Implementing directory creation logic in the OverlayfsHandler to process driver options with the KATA_VOLUME_OVERLAYFS_CREATE_DIR prefix Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>	2025-05-05 23:51:44 +02:00
ChengyuZhu6	f63ec50ba3	runtime: Add EROFS snapshotter with block device support - Detection of EROFS options in container rootfs - Creation of necessary EROFS devices - Sharing of rootfs with EROFS via overlayfs Fixes: #11163 Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>	2025-05-05 23:51:13 +02:00
Archana Choudhary	fb815b77c1	genpolicy: add test for volumeMounts This patch: - adds a count check on mounts - adds various test scenarios for mounts with emptyDir volume source Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-05-05 15:17:50 +00:00
RuoqingHe	1cb34c4d0a	Merge pull request #11202 from RuoqingHe/2025-04-28-upgrade-rtnetlink runtime-rs: Upgrade `rust-netlink` crates	2025-05-05 21:35:45 +08:00
Fupan Li	492329fc02	runtime: add the mtu support for updating routes Some CNI plugins set the MTU of some routes; for example, cilium modifies the MTU of the default route. If the MTU of the route is not set correctly, it may cause excessive fragmentation or even packet loss. Therefore, this PR adds the setting of the MTU of the route. When obtaining the route, if the MTU is set, the MTU will also be obtained and set on the route in the guest. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-05-04 23:12:57 +02:00
Ruoqing He	2d0f32ff96	runtime-rs: Upgrade crates from `rust-netlink` Bump `netlink-sys` to v0.8, `netlink-packet-route` to v0.22 and `rtnetlink` to v0.16 to reach a consistent state of `rust-netlink` dependencies. `bitflags` is bumped to v2.9.0 since those crates require it. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-05-03 02:31:02 +00:00
Ruoqing He	09700478eb	runtime-rs: Group Dependencies from `rust-netlink` `rtnetlink`, `netlink-sys` and `netlink-packet-route` are from the same organization, and some of them depend on the others, which implies the versions of those crates should be chosen and dealt with carefully. Group them to provide better management. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-05-03 02:29:43 +00:00
Fabiano Fidêncio	fbf7faa9f4	Merge pull request #11227 from fidencio/topic/agent-only-try-ipv6-if-stack-is-supported agent: netlink: Only add an ipv6 address if ipv6 is enabled	2025-05-02 12:31:40 +02:00
Xuewei Niu	a9b3c6a5a5	Merge pull request #11209 from lifupan/fix_slog shimv2: fix the issue logger write failed	2025-05-02 17:25:44 +08:00
Fabiano Fidêncio	79ad68cce5	Merge pull request #11230 from kimullaa/remove-wrong-qemu-option runtime: remove wrong qemu-system-x86_64 option	2025-05-02 11:18:45 +02:00
stevenhorsman	21498d401f	build(deps): bump openssl to 0.10.72 Bumps [openssl](https://github.com/sfackler/rust-openssl) to 0.10.72. - [Release notes](https://github.com/sfackler/rust-openssl/releases) --- updated-dependencies: - dependency-name: openssl dependency-version: 0.10.72 dependency-type: indirect ... Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-02 09:36:50 +01:00
Fabiano Fidêncio	4ce00ea434	agent: netlink: Only add an ipv6 address if ipv6 is enabled When running Kata Containers on CSPs, the CSPs may enforce their clusters to be IPv4-only. Checking the OCI spec passed down to container, on a GKE cluster, we can see: ``` "sysctl": { ... "net.ipv6.conf.all.disable_ipv6": "1", "net.ipv6.conf.default.disable_ipv6": "1", ... }, ``` Even with ipv6 being explicitly disabled (behind our back ;-)), we've noticed that IPv6 addresses would be received, but then as IPv6 was disabled we'd break on CreatePodSandbox with the following error: ``` Warning FailedCreatePodSandBox 4s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: "update interface: Failed to add address fe80::c44c:1cff:fe84:f6b7: NetlinkError(ErrorMessage { code: Some(-13), header: [64, 0, 0, 0, 20, 0, 5, 5, 19, 0, 0, 0, 0, 0, 0, 0, 10, 64, 0, 0, 2, 0, 0, 0, 20, 0, 1, 0, 254, 128, 0, 0, 0, 0, 0, 0, 196, 76, 28, 255, 254, 132, 246, 183, 20, 0, 2, 0, 254, 128, 0, 0, 0, 0, 0, 0, 196, 76, 28, 255, 254, 132, 246, 183] })\n\nStack backtrace:\n 0: <unknown>\n 1: <unknown>\n 2: <unknown>\n 3: <unknown>\n 4: <unknown>\n 5: <unknown>\n 6: <unknown>\n 7: <unknown>\n 8: <unknown>\n 9: <unknown>\n 10: <unknown>": unknown ``` A huge shoutout to Fupan Li for helping with the debug on this one! Fixes: #11200 Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-05-02 09:10:45 +02:00
Shunsuke Kimura	3dba8ddd98	runtime: remove wrong qemu-system-x86_64 option qemu-system-x86_64 does not support "-machine virt". (this is only supported by arm,aarch64) <https://people.redhat.com/~cohuck/2022/01/05/qemu-machine-types.html> Fixes: #11229 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-05-02 04:37:12 +09:00
Fabiano Fidêncio	7e404dd13f	Merge pull request #11228 from zvonkok/fix-kernel-modules-build gpu: Set the ARCH explicitly for driver builds	2025-05-01 21:07:20 +02:00
Zvonko Kaiser	445cad7754	gpu: Set the ARCH explicitly for driver builds Kernel Makefiles changed how they deduce the right arch, so let's set it explicitly to enable arm and amd builds. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-05-01 17:13:20 +00:00
RuoqingHe	049a4ef3a8	Merge pull request #11146 from RuoqingHe/2025-04-14-dragonball-centralize-dbs dragonball: Put local dependencies into workspace	2025-05-01 22:06:51 +08:00
RuoqingHe	bd1071aff8	Merge pull request #11174 from kata-containers/dependabot/cargo/src/mem-agent/crossbeam-channel-0.5.15 build(deps): bump crossbeam-channel from 0.5.13 to 0.5.15 in /src/mem-agent	2025-05-01 16:53:42 +08:00
Ruoqing He	61f2b6a733	dragonball: Put local dependencies into workspace Put local dependencies (mostly `dbs` crates) into workspace to avoid complex path dependencies all over the workspace. Simplify path dependency referencing. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-05-01 08:40:22 +00:00
RuoqingHe	33c69fc8bf	Merge pull request #11204 from stevenhorsman/go-security-bump-april-25 versions: Bump golang.org/x/net	2025-05-01 16:36:24 +08:00
Fabiano Fidêncio	bc66d75fe9	Merge pull request #11217 from stevenhorsman/runtime-rs-centralise-workspace-config Runtime rs centralise workspace config	2025-05-01 10:36:07 +02:00
Fupan Li	9924fbbc70	shimv2: fix the issue logger write failed It's better to open the log pipe file with the read & write option; otherwise, once containerd restarts and closes the read end, the kata shim's writes to the log pipe fail with a broken pipe error. Fixes: #11207 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-05-01 16:15:18 +08:00
Fabiano Fidêncio	3dfabd42c2	Merge pull request #11206 from kimullaa/fix-xfs-rootfs-type runtime: remove wrong xfs options	2025-05-01 09:05:17 +02:00
Fabiano Fidêncio	a2fbc598b8	Merge pull request #11223 from microsoft/cameronbaird/revert-aks-extension-pin ci: revert temp: ci: Fix AKS cluster creation	2025-05-01 08:33:12 +02:00
Shunsuke Kimura	62639c861e	runtime: remove wrong xfs options "data=ordered" and "errors=remount-ro" are wrong options in xfs. (they are ext4 options) <https://manpages.ubuntu.com/manpages/focal/man5/xfs.5.html> Fixes: #11205 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-05-01 07:56:39 +09:00
Cameron Baird	6e21d14334	Revert "temp: ci: Fix AKS cluster creation" This reverts commit `1de466fe84`. The latest release of the az aks extension fixes the issue https://github.com/Azure/azure-cli-extensions/blob/main/src/aks-preview/HISTORY.rst#1400b5 Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-30 21:24:42 +00:00
stevenhorsman	a126884953	runtime-rs: Share workspace config Update the runtime-rs workspace packages to use workspace package versions where applicable to centralise the config and reduce maintenance when updating these Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-30 19:40:47 +01:00
stevenhorsman	f8fcd032ef	workflow: Set RUST_LIB_BACKTRACE=0 As discussed in #9538, with anyhow >=1.0.77 we have test failures due to backtrace behaviour changing, so set RUST_LIB_BACKTRACE=0, so that we only have backtrace on panics Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-30 19:38:13 +01:00
stevenhorsman	ffbaa793a3	versions: Update crossbeam-channel Update all crossbeam-channel for all non-agent packages (it was done separately in #11175) to 0.5.15 to get them on latest version and remove the versions with a vulnerability Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-30 19:36:40 +01:00
Steve Horsman	b97bc03ecb	Merge pull request #11211 from stevenhorsman/dragonball-lockfiles dragonball: Remove package lockfiles	2025-04-30 19:34:58 +01:00
stevenhorsman	f910c7535a	ci: Workaround cargo deny issue When a PR has no new files the cargo deny runner fails with: ``` [cargo-deny-generator.sh:17] ERROR: changed_files_status= ``` so add `|| true` to try and help this Co-authored-by: Ruoqing He <heruoqing@iscas.ac.cn> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-30 16:27:25 +01:00
stevenhorsman	f2a2117252	tests: k8s: Retry output of kubectl exec in k8s-cpu-ns We are seeing failures in this test, where the output of the kubectl exec command seems to be blank, so try retrying the exec like #11024 Fixes: #11133 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-30 15:01:08 +01:00
stevenhorsman	97f7d49e8e	dragonball: Remove package lockfiles Since #10780 the dbs crates are managed as members of the dragonball workspace, so we can remove the lockfile as it's now workspace managed Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-30 09:14:07 +01:00
Steve Horsman	8045cb982c	Merge pull request #11208 from kata-containers/dependabot/cargo/src/runtime-rs/tokio-1.38.2 build(deps): bump tokio from 1.38.0 to 1.38.2 in /src/runtime-rs	2025-04-30 08:44:51 +01:00
Aurélien Bombo	46af7cf817	Merge pull request #11077 from microsoft/cameronbaird/address-gid-mismatch genpolicy: Align GID behavior with CRI and enable GID policy checks.	2025-04-29 22:23:23 +01:00
Aurélien Bombo	19371e2d3b	Merge pull request #11164 from wainersm/fix_kbs_on_aks tests/k8s: fix kbs installation on Azure AKS	2025-04-29 18:25:14 +01:00
Steve Horsman	6c1fafb651	Merge pull request #11210 from kata-containers/dependabot/cargo/src/tools/runk/tokio-1.44.2 build(deps): bump tokio from 1.38.0 to 1.44.2 in /src/tools/runk	2025-04-29 16:43:58 +01:00
Steve Horsman	3c8cc0cdbf	Merge pull request #11212 from BbolroC/add-cc-vfio-ap-test-s390x GHA: Add VFIO-AP to s390x nightly tests for CoCo	2025-04-29 16:15:00 +01:00
Steve Horsman	a6d1dc7df3	Merge pull request #10940 from ldoktor/peer-pods ci.ocp: Add peer-pods setup script	2025-04-29 15:57:30 +01:00
Hyounggyu Choi	63b9ae3ed0	GHA: Add VFIO-AP to s390x nightly tests for CoCo As #11076 introduces VFIO-AP bind/associate functions for IBM Secure Execution (SEL), a new internal nightly test has been established. This PR adds a new entry `cc-vfio-ap-e2e-tests` to the existing matrix to share the test result. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-04-29 16:06:12 +02:00
Steve Horsman	8b32846519	Merge pull request #10882 from stevenhorsman/kbs-logging-on-failure tests: confidential: Add KBS logging	2025-04-29 13:29:21 +01:00
dependabot[bot]	7163d7d89b	build(deps): bump tokio from 1.38.0 to 1.38.2 in /src/runtime-rs Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.38.0 to 1.38.2. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.38.0...tokio-1.38.2) --- updated-dependencies: - dependency-name: tokio dependency-version: 1.38.2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-04-29 12:21:58 +00:00
dependabot[bot]	2992a279ab	build(deps): bump tokio from 1.38.0 to 1.44.2 in /src/tools/runk Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.38.0 to 1.44.2. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.38.0...tokio-1.44.2) --- updated-dependencies: - dependency-name: tokio dependency-version: 1.44.2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-04-29 12:14:41 +00:00
Fabiano Fidêncio	e5cc9acab8	Merge pull request #11175 from kata-containers/dependabot/cargo/src/agent/crossbeam-channel-0.5.15 build(deps): bump crossbeam-channel from 0.5.14 to 0.5.15 in /src/agent	2025-04-29 14:13:25 +02:00
Fabiano Fidêncio	a9893e83b8	Merge pull request #11203 from stevenhorsman/high-severity-security-bumps-april-25 rust: High severity security bumps april 25	2025-04-29 14:10:05 +02:00
stevenhorsman	52b2662b75	tests: confidential: Add KBS logging To help with debugging, add logging of the KBS, like the container system logs, if the confidential test fails Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-29 09:48:18 +01:00
stevenhorsman	bcffe938ca	versions: Bump golang.org/x/net Bump golang.org/x/net to 0.38.0 as dependabot isn't doing it for these packages to remediate CVE-2025-22872 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-29 09:46:48 +01:00
Steve Horsman	57527c1ce4	Merge pull request #11161 from kata-containers/dependabot/go_modules/src/runtime/golang.org/x/net-0.38.0 build(deps): bump golang.org/x/net from 0.33.0 to 0.38.0 in /src/runtime	2025-04-29 09:39:30 +01:00
Cameron Baird	70ef0376fb	genpolicy: Introduce special handling for clusters using nydus Nydus+guest_pull has specific behavior where it improperly handles image layers on the host, causing the CRI to not find /etc/passwd and /etc/group files on container images which have them. This unfortunately causes different outcomes w.r.t. the GID used, which we are trying to enforce with policy. This behavior is observed/explained in https://github.com/kata-containers/kata-containers/issues/11162 Handle this exception with a config.settings.cluster_config.guest_pull field. When this is true, simply ignore the /etc/* files in the container image as they will not be parsed by the CRI. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-28 20:18:42 +00:00
Cameron Baird	d3b652014a	genpolicy: Introduce genpolicy tests for security contexts Add security context testcases for genpolicy, verifying that UID and GID configurations controlled by the kubernetes security context are enforced. Also, fix the other CreateContainerRequest tests' expected contents to reflect our new genpolicy parsing/enforcement of GIDs. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-28 16:28:31 +00:00
Cameron Baird	fc75aee13a	ci: Add CI tests for runAsGroup, GID policy Introduce tests to check for policy correctness on a redis deployment with 1. a pod-level securityContext 2. a container-level securityContext which shadows the pod-level securityContext 3. a pod-level securityContext which selects an existing user (nobody), causing a new GID to be selected. Redis is an interesting container image to test with because it includes a /etc/passwd file with existing user/group configuration of 1000:1000 baked in. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-28 16:28:31 +00:00
Cameron Baird	938ddeaf1e	genpolicy: Enable GID checks in rules.rego With fixes to align policy GID parsing with the CRI behavior, we can now enable policy verification of GIDs. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-28 16:28:31 +00:00
Cameron Baird	eb2c7f4150	genpolicy: Integrate /etc/passwd from OCI container when setting GIDs The GID used for the running process in an OCI container is a function of 1. The securityContext.runAsGroup specified in a pod yaml, 2. The UID:GID mapping in /etc/passwd, if present in the container image layers, 3. Zero, even if the userstr specifies a GID. Make our policy engine align with this behavior by: 1. At the registry level, always obtain the GID from the /etc/passwd file if present. Ignore GIDs specified in the userstr encoded in the OCI container. 2. After an update to UID due to securityContexts, perform one final check against the /etc/passwd file if present. The GID used for the running process is the mapping in this file from UID->GID. 3. Override everything above with the GID of the securityContext configuration if provided Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-28 16:28:31 +00:00
Cameron Baird	c13d7796ee	genpolicy: Parse secContext runAsGroup and allowPrivilegeEscalation Our policy should cover these fields for securityContexts at the pod or container level of granularity. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-28 16:28:31 +00:00
Cameron Baird	349ce8c339	genpolicy: Refactor registry user/group parsing to account for all cases The get_process logic in registry.rs did not account for all cases (username:groupname), did not defer to contents of /etc/group, /etc/passwd when it should, and was difficult to read. Clean this implementation up, factoring the string parsing for user/group strings into their own functions. Enable the registry::Container class to query /etc/passwd and /etc/group, if they exist. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-28 16:28:29 +00:00
Wainer dos Santos Moschetta	460c3394dd	gha: run CoCo non-TEE tests on "all" host type By running on "all" host type there are two consequences: 1) run the "normal" tests too (until now, it's only "small" tests), so increasing the coverage 2) create AKS cluster with larger VMs. This is a new requirement due to the current ingress controller for the KBS service consuming too many vCPUs and leaving only a few for the tests (resulting in failures) Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-04-28 12:08:31 -03:00
Wainer dos Santos Moschetta	945482ff6e	tests: make _print_instance_type() to handle "all" host type _print_instance_type() returns the instance type of the AKS nodes, based on the host type. Tests are grouped per host type in "small" and "normal" sets based on the CPU requirements: "small" tests require few CPUs and "normal" more. There is a third case: "all" host type maps to the union of "small" and "normal" tests, which should be handled by _print_instance_type() properly. In this case, it should return the largest instance type possible because "normal" tests will be executed too. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-04-28 12:08:31 -03:00
Wainer dos Santos Moschetta	a66aac0d77	tests/k8s: optimize nginx ingress for AKS small VM An AKS managed ingress controller is used, which keeps two nginx pod replicas that each request 500m of CPU. On small VMs like the ones we've used on CI for running the CoCo non-TEE tests, this leaves only a small amount of CPU for the tests. Actually, one of these pod replicas won't even get started. So let's patch the ingress controller to have only one replica of nginx. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-04-28 12:08:31 -03:00
Wainer dos Santos Moschetta	14e74b8fc9	tests/k8s: fix kbs installation on Azure AKS The Azure AKS addon-http-application-routing add-on is deprecated and cannot be enabled on new clusters which has caused some CI jobs to fail. Migrated our code to use approuting instead. Unlike addon-http-application-routing, this add-on doesn't configure a managed cluster DNS zone, but the created ingress has a public IP. To avoid having to deal with DNS setup, we will be using that address from now on. Thus, some functions no longer used are deleted. Fixes #11156 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-04-28 12:08:31 -03:00
Fabiano Fidêncio	03ab774ed5	helm: Avoid appending the multiInstallSuffix several times Once the multiInstallSuffix has been taken into account, we should not keep appending it on every re-run/restart, as that would lead to a path that does not exist. Fixes: #11187 Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-28 16:36:38 +02:00
stevenhorsman	c938c75af0	versions: kata-ctl: Bump rustls Bump rustls version to > 0.21.11 to remediate high severity CVE-2024-32650 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-28 14:55:59 +01:00
stevenhorsman	2ee7ef6aa3	versions: agent-ctl: Bump hashbrown Bump hashbrown to >= 0.15.1 to remediate the high severity security alert that was in v0.15.0 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-28 14:55:46 +01:00
stevenhorsman	e3d3a2843f	versions: Bump mio to at least 0.8.11 Ensure that all the versions of mio we use are at least 0.8.11 to remediate CVE-2024-27308 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-28 14:55:46 +01:00
stevenhorsman	973bd7c2b6	build(deps): bump golang.org/x/net from 0.33.0 to 0.38.0 in /src/runtime Bumps [golang.org/x/net](https://github.com/golang/net) from 0.33.0 to 0.38.0. - [Commits](https://github.com/golang/net/compare/v0.33.0...v0.38.0) --- updated-dependencies: - dependency-name: golang.org/x/net dependency-version: 0.38.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-28 14:09:54 +01:00
Steve Horsman	9248634baa	Merge pull request #11098 from stevenhorsman/golang-1.23.7 versions: Bump golang version	2025-04-28 13:46:11 +01:00
Fabiano Fidêncio	ee344aa4e9	Merge pull request #11185 from fidencio/topic/reclaim-guest-freed-memory-backport-from-runtime-rs runtime: clh: Add reclaim_guest_freed_memory [BACKPORT]	2025-04-28 12:32:33 +02:00
Steve Horsman	4f703e376b	Merge pull request #11201 from BbolroC/remove-non-tee-from-required-tests ci: Remove run-k8s-tests-coco-nontee from required tests	2025-04-28 10:05:07 +01:00
Hyounggyu Choi	9fe70151f7	ci: Remove run-k8s-tests-coco-nontee from required tests In #11044, `run-k8s-tests-coco-nontee` was set as required by mistake. This PR disables the test again. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-04-28 10:48:08 +02:00
Steve Horsman	83d31b142b	Merge pull request #11044 from Jakob-Naucke/basic-s390x-ci ci: Extend basic s390x tests	2025-04-28 09:14:00 +01:00
Fupan Li	3457572130	Merge pull request #10579 from Apokleos/pcilibs-rs kata-sys-utils: Introduce pcilibs for getting pci devices info	2025-04-27 16:39:40 +08:00
Alex Lyn	43b5a616f6	Merge pull request #11166 from Apokleos/memcfg-adjust kata-types: Optimize memory adjuesting by only gathering memory info	2025-04-27 15:57:45 +08:00
Fabiano Fidêncio	b747f8380e	clh: Rework CreateVM to reduce the amount of cycles Otherwise the static checks will whip us as hard as possible. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-25 21:30:47 +02:00
Champ-Goblem	9f76467cb7	runtime: clh: Add reclaim_guest_freed_memory [BACKPORT] We're bringing to Cloud Hypervisor only the reclaim_guest_freed_memory option already present in the runtime-rs. This allows us to use virtio-balloon for the hypervisor to reclaim memory freed by the guest. The reason we're not touching other hypervisors is because we're very much aware of avoiding to clutter the go code at this point, so we'll leave it for whoever really needs this on other hypervisor (and trust me, we really do need it for Cloud Hypervisor right now ;-)). Signed-off-by: Champ-Goblem <cameron@northflank.com> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-25 21:05:53 +02:00
Fabiano Fidêncio	1c72d22212	Merge pull request #11186 from fidencio/topic/kernel-add-taskstats-to-the-config kernel: Add CONFIG_TASKSTATS (and related) configs	2025-04-25 15:28:04 +02:00
Steve Horsman	213f9ddd30	Merge pull request #11191 from fidencio/topic/release-3.16.0-bump release: Bump version to 3.16.0	2025-04-25 09:04:31 +01:00
Fabiano Fidêncio	fc4e10b08d	release: Bump version to 3.16.0 Bump VERSION and helm-chart versions Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-25 08:17:15 +02:00
Fabiano Fidêncio	b96685bf7a	Merge pull request #11153 from fidencio/topic/build-allow-choosing-which-runtime-will-be-built build: Allow users to build the go, rust, or both runtimes	2025-04-25 08:13:07 +02:00
Fabiano Fidêncio	800c05fffe	Merge pull request #11189 from kata-containers/sprt/fix-create-cluster temp: ci: Fix AKS cluster creation	2025-04-24 23:01:12 +02:00
Aurélien Bombo	1de466fe84	temp: ci: Fix AKS cluster creation The AKS CLI recently introduced a regression that prevents using aks-preview extensions (Azure/azure-cli#31345), and hence create CI clusters. To address this, we temporarily hardcode the last known good version of aks-preview. Note that I removed the comment about this being a Mariner requirement, as aks-preview is also a requirement of AKS App Routing, which will be introduced soon in #11164. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-04-24 15:06:14 -05:00
Dan Mihai	706c2e2d68	Merge pull request #11184 from microsoft/danmihai1/retry-genpolicy ci: retry genpolicy execution	2025-04-24 08:01:22 -07:00
Champ-Goblem	cf4325b535	kernel: Add CONFIG_TASKSTATS (and related) configs Knowing that the upstream project provides a "ready to use" version of the kernel, it's good to include an easy way for users to monitor performance, and that's what we're doing by enabling the TASKSTATS (and related) kernel configs. This has been present as part of older kernels, but I couldn't reasonably find the reason why it's been dropped. Signed-off-by: Champ-Goblem <cameron@northflank.com> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-24 11:51:21 +02:00
Fabiano Fidêncio	7e9e9263d1	build: Allow users to build the go, rust, or both runtimes Let's add a RUNTIME_CHOICE env var that can be passed to the build scripts, which allows the user to select whether they build the go runtime, the rust runtime, or both. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-24 10:36:26 +02:00
Alex Lyn	8b49564c01	Merge pull request #10610 from Xynnn007/faet-initdata-rbd Feat | Implement initdata for bare-metal/qemu hypervisor	2025-04-24 09:59:14 +08:00
Alex Lyn	e8f19609b9	Merge pull request #11150 from zvonkok/cdi-annotations gpu: Fix CDI annotations	2025-04-24 09:58:16 +08:00
Dan Mihai	517d6201f5	ci: retry genpolicy execution genpolicy is sending more HTTPS requests than other components during CI so it's more likely to be affected by transient network errors similar to: ConnectError( "dns error", Custom { kind: Uncategorized, error: "failed to lookup address information: Try again", }, ) Note that genpolicy is not the only component hitting network errors during CI. Recent example from a different component: "Message: failed to create containerd task: failed to create shim task: failed to async pull blob stream HTTP status server error (502 Bad Gateway)" This CI change might help just with the genpolicy errors. Fixes: #11182 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-04-23 21:38:12 +00:00
Zvonko Kaiser	3946435291	gpu: Handle VFIO devices with DevicePlugin and CDI We can provide devices during cold-plug with CDI annotation on a Pod level and add per container device information with the device plugin. Since the sandbox has already attached the VFIO device remove them from consideration and just apply the inner runtime CDI annotation. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-04-23 21:02:06 +00:00
Zvonko Kaiser	486244b292	gpu: Remove unneeded parsing of CDI devices The addition of CDI devices is now done for single_container and pod_sandbox and pod_container before the devmanager creates the deviceinfos, so there is no need for extra parsing. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-04-23 21:02:06 +00:00
Zvonko Kaiser	6713db8990	gpu: Add CDI parsing for Sandbox as well Extend the CDI parsing for pod_sandbox as well, only single_container was covered properly. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-04-23 21:02:06 +00:00
Zvonko Kaiser	97f4bcb456	gpu: Remove CDI annotations for outer runtime After the outer runtime has processed the CDI annotation from the spec we can delete them since they were converted into Linux devices in the OCI spec. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-04-23 21:02:06 +00:00
Steve Horsman	6102976d2d	Merge pull request #11178 from stevenhorsman/gperf-mirror versions: Switch gperf mirror	2025-04-23 20:21:42 +01:00
stevenhorsman	09052faaa0	versions: Switch gperf mirror Every so often the main gnu site has an outage, so we can't download gperf. GNU provides the generic URL https://ftpmirror.gnu.org to automatically choose a nearby and up-to-date mirror, so switch to this to help avoid this problem Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-23 15:29:54 +01:00
stevenhorsman	ed56050a99	versions: Bump golangci-lint version v1.60.0+ is needed for go 1.23 support, so bump to the current latest 1.x version Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-23 12:37:48 +01:00
stevenhorsman	1c9d7ce0eb	ci: cri-containerd: Remove source from install_go.sh If the correct version of go is already installed then install_go.sh runs `exit`. When calling this as source from cri-containerd/gha-run.sh it means all dependencies after are skipped, so remove this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-23 12:37:48 +01:00
stevenhorsman	c37840ce80	versions: Bump golang version Bump golang version to the latest minor 1.23.x release now that 1.24 has been released and 1.22.x is no longer stable and receiving security fixes Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-23 12:37:48 +01:00
dependabot[bot]	463fd4eda4	build(deps): bump crossbeam-channel from 0.5.14 to 0.5.15 in /src/agent Bumps [crossbeam-channel](https://github.com/crossbeam-rs/crossbeam) from 0.5.14 to 0.5.15. - [Release notes](https://github.com/crossbeam-rs/crossbeam/releases) - [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md) - [Commits](https://github.com/crossbeam-rs/crossbeam/compare/crossbeam-channel-0.5.14...crossbeam-channel-0.5.15) --- updated-dependencies: - dependency-name: crossbeam-channel dependency-version: 0.5.15 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-04-23 11:34:14 +00:00
Steve Horsman	1ffce3ff70	Merge pull request #11173 from stevenhorsman/update-before-install workflows: Add apt update before install	2025-04-23 12:32:54 +01:00
stevenhorsman	ccfdf59607	workflows: Add apt update before install Add apt/apt-get updates before we do apt/apt-get installs to try and help with issues where we fail to fetch packages Co-authored-by: Fabiano Fidêncio <fidencio@northflank.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-23 09:06:08 +01:00
Xynnn007	b1c72c7094	test: add integration test for initdata This test exercises initdata with the following logic: 1. Enable image signature verification via kernel commandline 2. Set Trustee address via initdata 3. Pull an image from a banned registry 4. Check that the pull fails with the log `image security validation failed`, which shows that the initdata works. Note that if initdata does not work, the pod still fails to launch. But the error information is `[CDH] [ERROR]: Get Resource failed` which internally means that the KBS URL has not been set correctly. This test now only runs on qemu-coco-dev+x86_64 and qemu-tdx Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-04-23 15:55:04 +08:00
RuoqingHe	ef12dcd7da	Merge pull request #11158 from RuoqingHe/2025-04-15-fix-flag-calc runtime-rs: Use bitwise or assign for bitflags	2025-04-23 15:20:33 +08:00
alex.lyn	9eb3fcb84b	kata-types: Clean up noise caused by unformatted code For a long time, there has been unformatted code in the kata-types codebase, for example: ``` if qemu.memory_info.enable_guest_swap { - return Err(eother!( - "Qemu hypervisor doesn't support enable_guest_swap" - )); + return Err(eother!("Qemu hypervisor doesn't support enable_guest_swap")); } ... - }, device::DRIVER_NVDIMM_TYPE, eother, resolve_path + }, + device::DRIVER_NVDIMM_TYPE, + eother, resolve_path, -use std::collections::HashMap; -use anyhow::{Result, anyhow}; +use anyhow::{anyhow, Result}; use std::collections::hash_map::Entry; +use std::collections::HashMap; -/// DRIVER_VFIO_PCI_GK_TYPE is the device driver for vfio-pci +/// DRIVER_VFIO_PCI_GK_TYPE is the device driver for vfio-pci ``` This has brought unnecessary difficulties in version maintenance and commit difficulties. This commit will address this issue. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-04-23 09:40:07 +08:00
alex.lyn	97a1942f86	kata-types: Optimize memory adjuesting by only gathering memory info The Configuration initialization was observed to be significantly slow due to the extensive system information gathering performed by `sysinfo::System::new_all()`. This function collects data on CPU, memory, disks, and network, most of which is unnecessary for Kata's memory adjusting config phase, where only the total system memory is required. This commit optimizes the initialization process by implementing a more targeted approach to retrieve only the total system memory. This avoids the overhead of collecting a large amount of irrelevant data, resulting in a noticeable performance improvement. Fixes #11165 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-04-23 09:40:07 +08:00
alex.lyn	3e77377be0	kata-sys-utils: Add test cases for devices In this, the crate mockall is introduced to help mock get_all_devices. Fixes #10556 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-04-23 09:32:04 +08:00
alex.lyn	f714b6c049	kata-sys-utils: Add test cases for pci manager Fixes #10556 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-04-23 09:32:04 +08:00
alex.lyn	0cdc05ce0a	kata-sys-utils: Introduce method to help handle proper BAR memory We need more information (BAR memory and other future ures...) for PCI devices when vfio devices are passed through. So the method get_bars_max_addressable_memory is introduced for vfio devices to deduce the memory_reserve and pref64_reserve for NVIDIA devices. But it will be extended for other devices. Fixes #10556 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-04-23 09:32:04 +08:00
alex.lyn	f5eaaa41d5	kata-sys-utils: Introduce pcilibs to help get pci device info It's the basic framework for getting information of pci devices. Currently, we focus on the PCI Max bar memory size, but it'll be extended in the future. Fixes #10556 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-04-23 09:32:04 +08:00
Ruoqing He	d7f4b6cbef	runtime-rs: Use bitwise or assign for bitflags Use `|=` instead of `+=` while calculating and iterating through a vector of flags, which makes more sense and prevents situations like duplicated flags in vector, which would cause problems. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-04-22 23:55:11 +00:00
Jakob Naucke	1c3b1f5adb	ci: Extend basic s390x tests Currently, s390x only tests cri-containerd. Partially converge to the feature set of basic-ci-amd64: - containerd-sandboxapi - containerd-stability - docker with the appropriate hypervisors. Do not run tests currently skipped on amd64, as well as - agent-ctl, which we don't package for s390x - nerdctl, does not package the `full` image for s390x - nydus, does not package for s390x Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com> Co-authored-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-04-22 21:34:02 +02:00
Aurélien Bombo	bf93b5daf1	Merge pull request #11113 from Sumynwa/sumsharma/policy_execprocess_container_id genpolicy: Add container_id & related policy container data to state.	2025-04-22 18:37:58 +01:00
Aurélien Bombo	318c409ed6	Merge pull request #11126 from gkurz/rootfs-systemd-files rootfs: Don't remove files from the rootfs by default	2025-04-22 18:17:14 +01:00
Aurélien Bombo	12594a9f9e	Merge pull request #11157 from wainersm/make_nontee_job_not_required ci: demote CoCo non-TEE to non-required from gatekeeper	2025-04-22 18:15:28 +01:00
Greg Kurz	734e7e8c54	rootfs: Don't remove files from the rootfs by default Recent PR #10732 moved the deletion of systemd files and units that were deemed unnecessary by `02b3b3b977` from `image_builder.sh` to `rootfs.sh`. This unfortunately broke `rootfs.sh centos` and `rootfs.sh -r` as used by some other downstream users like fedora and RHEL, with the following error: Warning FailedCreatePodSandBox 1s (x5 over 63s) kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = CreateContainer failed: Establishing a D-Bus connection Caused by: 0: I/O error: Connection reset by peer (os error 104) 1: Connection reset by peer (os error 104) This is because the aforementioned distros use dbus-broker [1] that requires systemd-journald to be present. It is questionable that systemd units or files should be deemed unnecessary for _all_ distros but this has been around since 2019. There's now also a long-standing expectation from CI that `make rootfs && make image` does remove these files. In order to accommodate all the expectations, add a `-d` flag to `rootfs.sh` to delete the systemd files and have `make rootfs` use it. [1] https://github.com/bus1/dbus-broker Reported-by: Niteesh Dubey <niteesh@us.ibm.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2025-04-17 16:53:05 +02:00
Zvonko Kaiser	497ab9faaf	Merge pull request #10999 from zvonkok/rootfs-updates gpu: Update creation permissions	2025-04-16 10:15:38 -04:00
Wainer dos Santos Moschetta	90397ca4fe	ci: demote CoCo non-TEE to non-required from gatekeeper The CoCo non-TEE job has failed due to the removal of an add-on from AKS, causing KBS to not get installed (see #11156). The fix should be done in this repo as well as in trustee, which can take some time. We don't want to hold kata-containers PRs from getting merged any longer, so removing the job from required list. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-04-15 19:00:30 -03:00
Wainer Moschetta	ff9fb19f11	Merge pull request #11026 from ldoktor/e2e-resources ci.ocp: Override default runtimeclass CPU resources	2025-04-15 10:33:35 -03:00
Lukáš Doktor	bfdf4e7a6a	ci.ocp: Add peer-pods setup script this script will be used in a new OCP integration pipeline to monitor basic workflows of OCP+peer-pods. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-04-15 12:13:22 +02:00
Xynnn007	91bb6b7c34	runtime: add support for io.katacontainers.config.runtime.cc_init_data io.katacontainers.config.runtime.cc_init_data specifies initdata used by the pod in base64(gzip(initdata toml)) format. The initdata will be encapsulated into an initdata image and mounted as a raw block device to the guest. The initdata image will be aligned to 512 bytes, which is chosen as a usual sector size supported by different hypervisors like qemu, clh and dragonball. Note that this patch only adds support for qemu hypervisor. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-04-15 16:35:59 +08:00
Sumedh Sharma	2a17628591	genpolicy: Add container_id & related policy container data to state. This commit adds changes to add input container_id and related container data to state after a CreateContainerRequest is allowed. This helps constrain reference container data for evaluating request inputs to one instead of matching against every policy container data, Ex: in ExecProcessRequest inputs. Fixes #11109 Signed-off-by: Sumedh Sharma <sumsharma@microsoft.com>	2025-04-15 14:02:59 +05:30
Zvonko Kaiser	2f28be3ad9	gpu: Update creation permissions We need to make sure the device files are created correctly in the rootfs otherwise kata-agent will apply permission 0o000. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-04-14 21:02:34 +00:00
Fabiano Fidêncio	bfd4b98355	Merge pull request #11142 from fidencio/topic/build-scripts-improvements-for-users build: User-facing improvements for the build scripts	2025-04-14 19:28:12 +02:00
Fabiano Fidêncio	5e363dc277	virtiofsd: Update to v1.13.1 It's been released for some time already ... and although we did have the necessary patches in, we'd better stick to a released version of the project. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-14 13:23:31 +02:00
Fabiano Fidêncio	2fef594f14	build: Allow users to define AGENT_POLICY Although this is mostly used for Kata Containers backing Confidential Computing use cases, it also has benefits for the normal Kata Containers use cases, thus it's left enabled by default. However, let's allow users to specify whether or not they want to have it enabled, as depending on their use-case, it just does not make sense. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-14 10:02:22 +02:00
Fabiano Fidêncio	5d0688079a	build: Allow users to specificy EXTRA_PKGS Right now we've had some logic to add EXTRA_PKGS, but those were restricted to the nvidia builds, and would require changing the file manually. Let's make sure a user can add this just by specifying an env var. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-14 10:02:22 +02:00
Fabiano Fidêncio	40a15ac760	build: Allow adding a guest-hook to the rootfs Kata Containers provides, since forever, a way to run OCI guest-hooks from the rootfs, as long as the files are dropped in a specific location defined in the configuration.toml. However, so far, it's been up to the ones using it to hack the generated image in order to add those guest hooks, which is far from handy. Let's add a way for the ones interested in this feature to just drop a tarball file under the same known build directory, specify an env var, and let the guest hooks be installed during the rootfs build. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-14 10:02:16 +02:00
RuoqingHe	0b4fea9382	Merge pull request #11134 from stevenhorsman/rust-toolchain rust: Add rust-toolchain.toml	2025-04-12 15:03:29 +08:00
Steve Horsman	792180a740	Merge pull request #11105 from stevenhorsman/required-tests-process-update doc: Update required job process	2025-04-11 14:53:27 +01:00
stevenhorsman	93830cbf4d	rust: Add rust-toolchain.toml Add a top-level rust-toolchain.toml with the version that matches version.yaml to ensure that we stay in sync Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-11 09:24:04 +01:00
Steve Horsman	ad68cb9afa	Merge pull request #11106 from stevenhorsman/rust-workspace-settings agent: Inherit rust workspace settings	2025-04-10 09:47:53 +01:00
Xynnn007	17d0db9865	agent: add initdata parse logic Kata-agent now will check if a device /dev/vd* with 'initdata' magic number exists. If it exists, kata-agent will try to read it. Bytes 9~16 are the length of the compressed initdata toml in little endian. Bytes starting from 17 are the compressed initdata. The initdata image device layout looks like 0 8 16 16+length ... EOF 'initdata' length gzip(initdata toml) paddings The initdata will be parsed and put as aa.toml, cdh.toml and policy.rego to /run/confidential-containers/initdata. When AgentPolicy is initialized, the default policy will be overwritten by that. When AA is to be launched, if initdata is once processed, the launch args will include --initdata parameter. Also, if /run/confidential-containers/initdata/aa.toml exists, the launch args will include -c /run/confidential-containers/initdata/aa.toml. When CDH is to be launched, if initdata is once processed, the launch args will include -c /run/confidential-containers/initdata/cdh.toml Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-04-10 13:09:51 +08:00
stevenhorsman	75dc4ce3bf	doc: Update required job process Add information about using required-tests.yaml as a way to track jobs that are required. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 18:13:45 +01:00
Steve Horsman	0dbf4ec39f	Merge pull request #10678 from stevenhorsman/update-gatekeeper-rules-for-md-only-PRs ci: Update gatekeeper tests for md files	2025-04-09 18:10:05 +01:00
stevenhorsman	d1d60cfe89	ci: Update gatekeeper tests for md files Update the required-tests.yaml so that .md files only trigger the static tests, not the build, or CI Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 17:55:27 +01:00
Steve Horsman	9b401cd250	Merge pull request #11090 from stevenhorsman/required-test-updates ci: required-tests fixes/updates	2025-04-09 14:41:57 +01:00
stevenhorsman	576747b060	ci: Skip tests if we only update the required list When making new tests required, or removing existing tests from required, this doesn't impact the CI jobs, so we don't need to run all the tests. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 14:22:47 +01:00
stevenhorsman	9a7c5b914e	ci: required-tests fixes/updates - Remove metrics setup job - Update some truncation typos of job names - Add shellcheck-required - Remove the ok-to-test as a required label on the build test as it isn't needed as a trigger Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 14:22:37 +01:00
Xuewei Niu	5774f131ec	Merge pull request #10938 from Apokleos/fix-iommugrp-symlink runtime-rs: Simplify iommu group base name extraction from symlink	2025-04-09 19:23:48 +08:00
Xuewei Niu	fd9a4548ab	Merge pull request #11129 from RuoqingHe/entend-runtime-rs-workspace runtime-rs: Extend runtime-rs workspace and centralize local dependencies	2025-04-09 19:23:15 +08:00
stevenhorsman	6603cf7872	agent: Update vsock-exporter to use workspace settings To reduce duplication, we could update the vsock-exporter crate to use settings and versions from the agent, where applicable. > [!NOTE] > In order to use the workspace, this has bumped some crate versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 12:02:43 +01:00
stevenhorsman	2cb9fd3c69	agent: Update rustjail to use workspace settings - To reduce duplication, we could update the rustjail crate to use settings and versions from the agent, where applicable. - Also switch to using the derive feature in serde crate rather than the separate serde_derive to avoid keeping both versions in sync > [!NOTE] > In order to use the workspace, this has bumped some crate versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 12:02:43 +01:00
stevenhorsman	655255b50c	agent: Update policy to use workspace settings To reduce duplication, we could update the policy crate to use settings and versions from the agent, where applicable. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 11:42:05 +01:00
stevenhorsman	1bec432ffa	agent: Create workspace package and dependencies - Create agent workspace dependencies and package info so that the packages in the workspace can use them - Group the local dependencies together for clarity (like in #11129) Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 11:42:00 +01:00
Ruoqing He	28c09ae645	runtime-rs: Put local dependencies into workspace Put local dependencies into workspace to avoid complex path dependencies all over the workspace. This gives an overview of local dependencies this workspace uses, where those crates are located, and simplifies the local dependencies referencing process. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-04-09 07:30:29 +00:00
Ruoqing He	3769ad9c0d	runtime-rs: Group local dependencies Judging by the layout of the `Cargo.toml` files, local dependencies are intentionally separated from other dependencies, let's enforce it workspace-wise. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-04-09 03:52:16 +00:00
Ruoqing He	abb5fb127b	runtime-rs: Extend workspace to cover all crates Only `shim` and `shim-ctl` are incorporated in `runtime-rs`'s workspace, let's extend it to cover all crates in `runtime-rs/crates`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-04-09 03:51:48 +00:00
alex.lyn	58bebe332a	runtime-rs: Simplify iommu group base name extraction from symlink Just getting the base name from the iommu group symlink is enough, as the validation will be handled in subsequent steps when constructing the full path /sys/kernel/iommu_groups/$iommu_group. This PR removes the duplicated validation of iommu_group. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-04-09 09:28:00 +08:00
Steve Horsman	8df271358e	Merge pull request #11128 from stevenhorsman/disable-metrics-jobs ci: Remove metric jobs	2025-04-08 18:16:35 +01:00
stevenhorsman	e6cca9da6d	ci: Remove metric jobs The metrics runner is broken, so skip the metrics jobs to stop the CI being stuck waiting. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-08 17:55:07 +01:00
RuoqingHe	713cbb0c62	Merge pull request #11121 from fidencio/topic/bump-kernel-lts versions: Bump LTS kernel	2025-04-08 17:28:31 +08:00
Xuewei Niu	d3c9cc4e36	Merge pull request #11014 from teawater/mem-agent-doc docs: Add how-to-use-memory-agent.md to howto	2025-04-08 17:20:25 +08:00
Fabiano Fidêncio	a40b919afe	Merge pull request #10724 from likebreath/0109/upgrade_clh_v43.0 versions: Upgrade to Cloud Hypervisor v45.0	2025-04-08 08:11:30 +02:00
Fabiano Fidêncio	bc04c390bd	versions: Bump LTS kernel 6.12.22 was released yesterday, let's bump to it. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-04-07 21:46:29 +02:00
Bo Chen	ee84068aed	versions: Upgrade to Cloud Hypervisor v45.0 Details of this release can be found in our roadmap project as iteration v45.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #10723 Signed-off-by: Bo Chen <bchen@crusoe.ai> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-07 20:33:34 +02:00
Dan Mihai	8779abd0a1	Merge pull request #11057 from mythi/tdx-qgs-uds runtime: qemu: add support to use TDX QGS via Unix Domain Sockets	2025-04-07 07:27:48 -07:00
Dan Mihai	e606a8deb5	Merge pull request #11103 from Ankita13-code/ankitapareek/policy-input-validation policy: Add missing input validations for ExecProcessRequest	2025-04-07 07:26:24 -07:00
Steve Horsman	ba92639481	Merge pull request #11094 from RuoqingHe/2025-03-28-enable-riscv-assets-build ci: Enable `build-kata-static-tarball-riscv64.yaml`	2025-04-07 11:26:15 +01:00
Fabiano Fidêncio	c75ea2582e	Merge pull request #11114 from fidencio/topic/allow-building-the-agent-without-enabling-guest-pull agent: Allow users to build without guest-pull	2025-04-06 12:17:27 +01:00
Fabiano Fidêncio	e3c98a5ac7	agent: Allow users to build without guest-pull For those not interested in CoCo, let's at least allow them to easily build the agent without the guest-pull feature. This reduces the binary size (already stripped) from 25M to 18M. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-04-04 22:58:43 +01:00
Ankita Pareek	7e450bc1c2	policy: Add missing input validations for ExecProcessRequest This commit introduces missing validations for input fields in ExecProcessRequest to harden the security policy. The changes include: - Update rules.rego to add null/empty field enforcements for String_user, SelinuxLabel and ApparmorProfile - Add unit test cases for ExecProcessRequest for each of the validations Signed-off-by: Ankita Pareek <ankitapareek@microsoft.com>	2025-04-03 12:53:59 +00:00
Hui Zhu	17af28acad	docs: Add how-to-use-memory-agent.md to howto Add how-to-use-memory-agent.md (How to use mem-agent to decrease the memory usage of Kata container) to docs to show how to use mem-agent. Fixes: #11013 Signed-off-by: Hui Zhu <teawater@gmail.com>	2025-04-02 17:45:59 +08:00
Lukáš Doktor	009aa6257b	ci.ocp: Override default runtimeclass CPU resources some of the e2e tests spawn a lot of workers which are mainly idle, but the scheduler fails to schedule them due to cpu resource overcommit. For our testing we are more focused on having actual pods running than the speed of the scheduled pods so let's increase the amount of schedulable pods by decreasing the default cpu requests. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-04-02 10:30:40 +02:00
RuoqingHe	2f134514b0	Merge pull request #11097 from kimullaa/robust-user-input kata-deploy: add INSTALLATION_PREFIX validation	2025-04-02 10:05:03 +08:00
Ruoqing He	96e43fbee5	ci: Enable `build-kata-static-tarball-riscv64.yaml` Previously we introduced `build-kata-static-tarball-riscv64.yaml`, enable that workflow in `ci.yaml`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-04-01 16:35:14 +08:00
RuoqingHe	10ceeb0930	Merge pull request #11104 from fidencio/topic/kata-deploy-create-runtimeclasses-by-default kata-deploy: Create runtimeclasses by default	2025-04-01 10:55:44 +08:00
RuoqingHe	b19a8c7b1c	Merge pull request #11066 from kimullaa/update-command-sample kernel: Update the usage in readme	2025-04-01 09:12:43 +08:00
RuoqingHe	b046f79d06	Merge pull request #11100 from kimullaa/remove-double-slash kata-deploy: remove the double "/"	2025-04-01 08:17:00 +08:00
Shunsuke Kimura	a05f5f1827	kata-deploy: add INSTALLATION_PREFIX validation INSTALLATION_PREFIX must begin with a "/" because it is being concatenated with /host. If there is no leading "/", display a message and report an error. Fixes: #11096 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-04-01 06:47:30 +09:00
Shunsuke Kimura	a49b6f8634	kata-deploy: Moves the function to the top Move functions that may be used in validation to the top. Fixes: #11097 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-04-01 06:47:30 +09:00
Zvonko Kaiser	d81a1747bd	Merge pull request #11085 from kevinzs2048/fix-virtiomem runtime-go: qemu: Fix sandbox start failing with virtio-mem enable on arm64	2025-03-31 17:09:43 -04:00
Zvonko Kaiser	e5c4cfb8a1	Merge pull request #11081 from BbolroC/unsealed-secret-fix tests: Enable sealed secrets for all TEEs	2025-03-31 11:19:52 -04:00
Shunsuke Kimura	c0af0b43e0	kernel: Update the outdated usage in the readme Since it is difficult to update the README when modifying the options of ./build-kernel.sh, instead of update the README, we encourage users to run the -h command. Fixes: #11065 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-03-31 23:29:58 +09:00
Shunsuke Kimura	902cb5f205	kata-deploy: remove the double "/" Currently, ConfigPath in containerd.toml is a double "/" as follows. ``` [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-clh.options] ConfigPath = "/opt/kata/share/defaults/kata-containers//configuration-clh.toml" ... [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-cloud-hypervisor.options] ConfigPath = "/opt/kata/share/defaults/kata-containers//runtime-rs/configuration-cloud-hypervisor.toml" ... ``` So, removed the double "/". Fixes: #11099 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-03-31 22:31:36 +09:00
Fabiano Fidêncio	28be53ac92	kata-deploy: Create runtimeclasses by default Let's make the life of the users easier and create the runtimeclasses for them by default. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-31 11:29:44 +01:00
Xuewei Niu	abbc9c6b50	Merge pull request #11101 from RuoqingHe/runtime-rs-fix-fmt-check runtime-rs: Remove redundant empty line	2025-03-31 16:28:55 +08:00
Ruoqing He	3c78c42ea5	runtime-rs: Remove redundant empty line While running `cargo fmt -- --check` in the `src/runtime-rs` directory, it errors out, suggesting there is a redundant empty line, which prevents `make check` of the `runtime-rs` component from passing. Remove the redundant empty line to fix this. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-31 00:39:04 +08:00
Steve Horsman	44bab5afc4	Merge pull request #11091 from fidencio/topic/ci-add-kata-deploy-tests-as-required gatekeeper: Add kata-deploy tests as required	2025-03-28 11:05:03 +00:00
Fabiano Fidêncio	5a08d748b9	Merge pull request #11088 from kimullaa/fix-cleanup-failure kata-deploy: Fix kata-cleanup's CrashLoopBackOff	2025-03-27 20:33:52 +01:00
Fabiano Fidêncio	700944c420	gatekeeper: Add kata-deploy tests as required kata-deploy tests have been quite stable, working for more than 10 days without any nightly failure (or any failure reported at all), and I'll be the one maintaining those. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-27 19:47:38 +01:00
Steve Horsman	97bd311a66	Merge pull request #11058 from stevenhorsman/required-static-checks-rename ci: Update static-checks strings	2025-03-27 12:56:28 +00:00
Xuewei Niu	54dcf0d342	Merge pull request #11056 from RuoqingHe/runtime-qemu-riscv runtime: Support and enable build on riscv64	2025-03-27 17:02:21 +08:00
Fabiano Fidêncio	047b7e1fb7	Merge pull request #11063 from lifupan/fix_compile runtime-rs: update the protobuf to 3.7.1	2025-03-27 09:52:20 +01:00
Fabiano Fidêncio	41b536d487	Merge pull request #11059 from microsoft/danmihai1/tests-common tests: k8s: clean-up shellcheck warnings in tests_common.sh	2025-03-27 09:51:49 +01:00
Shunsuke Kimura	9ab6ab9897	kata-deploy: Fix kata-cleanup's CrashLoopBackOff Since kata-deploy.sh references an undefined variable, kata-cleanup.yaml enters a CrashLoopBackOff state. ``` $ kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/tools/packaging/kata-deploy/kata-cleanup/base/kata-cleanup.yaml daemonset.apps/kubelet-kata-cleanup created $ kubectl get pods -n kube-system kubelet-kata-cleanup-zzbd2 0/1 CrashLoopBackOff 3 (33s ago) 80s $ kubectl logs -n kube-system daemonsets/kubelet-kata-cleanup /opt/kata-artifacts/scripts/kata-deploy.sh: line 19: SHIMS: unbound variable ``` Therefore, set an initial value for the environment variables. Fixes: #11083 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-03-27 15:00:19 +09:00
Hyounggyu Choi	0432d2fcdf	Merge pull request #11086 from BbolroC/fix-overwrite-containerd-config tests: Make sure /etc/containerd before writing config	2025-03-27 05:57:31 +01:00
Ruoqing He	46caa986bb	ci: Skip tests depend on virtualization on riscv64 `VMContainerCapable` requires a present `kvm` device, which is not yet available in our RISC-V runners. Skipped related tests if it is running on `riscv-builder`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 10:47:49 +08:00
Ruoqing He	7f0b1946c5	ci: Enable build-check for runtime on riscv64 `runtime` support for riscv64 is now ready, let's enable building and testing on that component. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 10:38:30 +08:00
Yuting Nie	1f52f83309	runtime: Enable kata-check test on riscv64 Provide according tests to cover `kata-runtime` package, test `kata-runtime`'s `check` functionality on riscv64 platforms. Signed-off-by: Yuting Nie <nieyuting@iscas.ac.cn>	2025-03-27 10:36:55 +08:00
Yuting Nie	b6924ef5e5	runtime: Add getExpectedHostDetails for riscv64 Add `getExpectedHostDetails` with expected value according to template defined in `kata-check_data_riscv64_test.go`. This provides necessary `HostInfo` for tests to cover `kata-check_riscv64.go`. Signed-off-by: Yuting Nie <nieyuting@iscas.ac.cn>	2025-03-27 10:34:34 +08:00
Yuting Nie	594c5e36a6	runtime: Add mock data for kata-check Add definition of `testCPUInfoTemplate` which is retrieved from `/proc/cpuinfo` of a QEMU emulated virtual machine on virt board. Signed-off-by: Yuting Nie <nieyuting@iscas.ac.cn>	2025-03-27 10:33:42 +08:00
Yuting Nie	0ff5cb1e66	runtime: Enable testSetCPUTypeGeneric for riscv64 `testSetCPUTypeGeneric` will be used for writing `kata-check` in `kata-runtime` on riscv64 platforms, enable building for later testing. Signed-off-by: Yuting Nie <nieyuting@iscas.ac.cn>	2025-03-27 10:32:29 +08:00
Ruoqing He	2329aeec38	runtime: Disable race flag for riscv64 `-race` flag used for `go test` is not yet supported on riscv64 platforms, disable it for now. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 10:28:53 +08:00
Ruoqing He	1b4dbebb1b	runtime: Enable runtime to build on riscv64 Convert Rust arch to Go arch in Makefile, and add `riscv64-options.mk` to provide definitions required for runtime to build on riscv64. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 10:22:55 +08:00
Ruoqing He	805da14634	runtime: Enable runtime check for riscv64 Enable `kata-runtime check` command to work on riscv64 platforms to make sure required features/devices are present. Co-authored-by: Yuting Nie <nieyuting@iscas.ac.cn> Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 10:07:09 +08:00
Ruoqing He	96b2d25508	runtime: Define default values for QEMU riscv Provide default values while invoking QEMU as the hypervisor for Go runtime on riscv64 platform. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 10:05:36 +08:00
Ruoqing He	1662595146	runtime: Introduce riscv64 to govmm pkg Define `vmm` for riscv64, set `MaxVCPUs` to 512 as QEMU RISC-V virt Generic Virtual Platform [1] defines. [1] https://www.qemu.org/docs/master/system/riscv/virt.html Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 09:57:49 +08:00
Ruoqing He	1e4963a3b2	runtime: Define availableGuestProtection for riscv64 `GuestProtection` feature is not made available yet, return `noneProtection` for now. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 09:34:53 +08:00
Ruoqing He	4947938ce8	runtime: Introduce riscv64 template for vm factory Set `templateDeviceStateSize` to 8 as other architectures did. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 09:28:32 +08:00
Zvonko Kaiser	b7cf4fd2e6	Merge pull request #11053 from ldoktor/ci ci: shellcheck fixes	2025-03-26 13:22:56 -04:00
Hyounggyu Choi	1e187482d4	tests: Make sure /etc/containerd before writing config We get the following error while writing containerd config if a base dir `/etc/containerd` does not exist like: ``` sudo tee /etc/containerd/config.toml << EOF ... EOF tee: /etc/containerd/config.toml: No such file or directory ``` The commit makes sure a base directory for containerd exists before writing config and drops the config file deletion because the default behaviour of `tee` is overwriting. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-26 18:19:45 +01:00
Hyounggyu Choi	0aa76f7206	tests: Enable sealed secrets for TEEs Fixes: #11011 This commit allows all TEEs to run the sealed secret test. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-26 17:50:41 +01:00
Hyounggyu Choi	423ad8341d	agent: Call cdh_handler for sealed secrets after add_storage() As reported in #11011, mounted secrets are available after a container image is pulled by add_storage() for IBM SE. But secure mount should be handled before the `add_storage()`. Therefore, this commit divides cdh_handler() into: - cdh_handler_trusted_storage() - cdh_handler_sealed_secrets() and calls cdh_handler_sealed_secrets() after add_storage() while keeping cdh_handler_trusted_storage() unchanged. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-26 17:50:41 +01:00
Fabiano Fidêncio	7a0ac55f22	Merge pull request #10984 from fidencio/topic/tests-kata-deploy-ground-work-to-rewrite-the-tests tests: kata-deploy: The rest of the ground work to rewrite the kata-deploy tests	2025-03-26 17:47:48 +01:00
Hyounggyu Choi	8088064b8b	tests: Set default policy before running sealed secrets tests The test `Cannot get CDH resource when deny-all policy is set` completes with a KBS policy set to deny-all. This affects the future TEE test (e.g. k8s-sealed-secrets.bats) which makes a request against KBS. This commit introduces kbs_set_default_policy() and puts it to the setup() in k8s-sealed-secrets.bats. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-26 17:44:38 +01:00
Jakob Naucke	d808cef2fb	agent: AP bind-associate for Secure Execution Kata Containers has support for both the IBM Secure Execution trusted execution environment and the IBM Crypto Express hardware security module (used via the Adjunct Processor bus), but using them together requires specific steps. In Secure Execution, the Acceleration and Enterprise PKCS11 modes of Crypto Express are supported. Both modes require the domain to be _bound_ in the guest, and the latter also requires the domain to be _associated_ with a _guest secret_. Guest secrets must be submitted to the ultravisor from within the guest. Each EP11 domain has a master key verification pattern (MKVP) that can be established at HSM setup time. The guest secret and its ID are to be provided at `/vfio_ap/{mkvp}/secret` and `/vfio_ap/{mkvp}/secret_id` via a key broker service respectively. Bind each domain, and for each EP11 domain, - get the secret and secret ID from the addresses above, - submit the secret to the ultravisor, - find the index of the secret corresponding to the ID, and - associate the domain to the index of this secret. To bind, add the secret, parse the info about the domain, and associate, the s390_pv_core crate is used. The code from this crate also does the AP online check, which can be removed from here. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-03-26 16:37:23 +01:00
Kevin Zhao	211a36559c	runtime-go: qemu: Fix sandbox start failing with virtio-mem enable on arm64 Also add CONFIG_VIRTIO_MEM to arm64 platform Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-03-26 22:31:00 +08:00
Fabiano Fidêncio	404e212102	tests: kata-deploy: Use helm_helper() With this we switch to fully testing with helm, instead of testing with the kustomizations (which will soon be removed). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-26 13:30:15 +01:00
Fabiano Fidêncio	f7976a40e4	tests: Create a helm_helper() common function Let's use what we have in the k8s functional tests to create a common function to deploy kata containers using our helm charts. This will help us immensely in the kata-deploy testing side in the near future. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-26 13:30:11 +01:00
Fabiano Fidêncio	eb884d33a8	tests: k8s: Export all the default env vars on gha-run.sh This is not strictly needed, but it does help a lot when setting up a cluster manually, while still relying on those scripts. While here, let's also ensure the assignment is between quotes, to make shellchecker happier. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-26 13:23:16 +01:00
Saul Paredes	ae5c587efc	Merge pull request #11074 from Sumynwa/sumsharma/genpolicy_test genpolicy: Refactor tests to allow different request types in a testcases json.	2025-03-25 12:38:19 -07:00
Sumedh Sharma	3406df9133	genpolicy: Refactor tests to add different request types in testcases json This commit introduces changes to add test data for multiple request types in a single testcases.json file. This allows for stateful testing, for example: enable testing ExecProcessRequest using policy state set after testing a CreateContainerRequest. Fixes #11073. Signed-off-by: Sumedh Sharma <sumsharma@microsoft.com>	2025-03-25 13:52:17 +05:30
Mikko Ylinen	85f3391bcf	runtime: qemu: add support to use TDX QGS via Unix Domain Sockets TDX Quote Generation Service (QGS) signs TDREPORT sent to it from Qemu (GetQuote hypercall). Qemu needs quote-generation-socket address configured for IPC. Currently, Kata govmm only enables vsock based IPC for QGS but QGS supports Unix Domain Sockets too which works well for host process to process IPC (Qemu <-> QGS). The QGS configuration to enable UDS is to run the service with "-port=0" parameter. The same works well here too: setting "tdx_quote_generation_service_socket_port=0" lets users enable UDS based IPC. The socket path is fixed in QGS and cannot be configured: when "-port=0" is used, the socket appears in /var/run/tdx-qgs/qgs.socket. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-03-25 10:18:40 +02:00
RuoqingHe	7a704453b6	Merge pull request #11075 from microsoft/danmihai1/genpolicy-debug-build genpolicy: add support for BUILD_TYPE=debug	2025-03-25 14:59:15 +08:00
RuoqingHe	5d68600c06	Merge pull request #11010 from stevenhorsman/metrics-containerd-debugging metrics: Test improvements	2025-03-25 11:38:28 +08:00
Dan Mihai	15c9035254	genpolicy: add support for BUILD_TYPE=debug Use "cargo build --release" when BUILD_TYPE was not specified, or when BUILD_TYPE=release. The default "cargo build" behavior is to build in debug mode. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-24 16:10:20 +00:00
Jakob Naucke	683a482d64	protos: Add CDH GetResourceService Add service to get arbitrary data from Confidential Data Hub. Taken from https://github.com/confidential-containers/guest-components/tree/main/api-server-rest. Marked as `#[allow(dead_code)]` because planned use is architecture-specific at this time. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-03-24 15:46:40 +00:00
RuoqingHe	f6a1c6d0e0	Merge pull request #11069 from kimullaa/exit-if-action-is-invalid kata-deploy: return exit code for invalid argument	2025-03-24 09:40:39 +08:00
Shunsuke Kimura	e5d7414c33	kata-deploy: Return exit code for invalid argument It hangs when invalid arguments are specified. ```bash kata-deploy-6sr2p:/# /opt/kata-artifacts/scripts/kata-deploy.sh xxx Action: * xxx ... Usage: /opt/kata-artifacts/scripts/kata-deploy.sh [install/cleanup/reset] ERROR: invalid arguments ... ^C <- hang ``` I changed it to behave the same as when there are no arguments. ```bash kata-deploy-6sr2p:/# /opt/kata-artifacts/scripts/kata-deploy.sh Usage: /opt/kata-artifacts/scripts/kata-deploy.sh [install/cleanup/reset] ERROR: invalid arguments kata-deploy-6sr2p:/# echo $? 1 ``` Fixes: #11068 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-03-22 21:32:38 +09:00
Aurélien Bombo	17baa6199b	Merge pull request #11061 from RuoqingHe/2025-03-21-generalize-non-kvm ci: Generalize `GITHUB_RUNNER_CI_ARM64`	2025-03-21 15:23:51 -05:00
Fupan Li	4b93176225	runtime-rs: update the protobuf to 3.7.1 Since some files generated by protobuf are shared between runtime-rs and kata agent, and the kata agent's dependency image-rs depends on protobuf@3.7.1, we'd better keep the protobuf version aligned between runtime-rs and agent; otherwise, we couldn't compile the runtime-rs and agent at the same time. Fixes: https://github.com/kata-containers/kata-containers/issues/10650 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-21 17:46:12 +08:00
Ruoqing He	5e81f67ceb	ci: Generalize GITHUB_RUNNER_CI_ARM64 `GITHUB_RUNNER_CI_ARM64` is turned on for self hosted runners without virtualization to skip those tests that depend on virtualization. This may happen to other archs/runners as well, let's generalize it to `GITHUB_RUNNER_CI_NON_VIRT` so we can reuse it on other archs. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-21 09:49:44 +08:00
RuoqingHe	e84f7c2c4b	Merge pull request #11046 from mythi/drop-dcap-libs build: drop libtdx-attest	2025-03-21 09:23:33 +08:00
Dan Mihai	835c6814d7	tests: k8s/tests_common: avoid using regex More straightforward implementation of hard_coded_policy_tests_enabled, that avoids ShellCheck warning: warning: Remove quotes from right-hand side of =~ to match as a regex rather than literally. [SC2076] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 22:23:19 +00:00
Dan Mihai	d83b8349a2	tests: policy: avoid using caller's variable Fix unintended use of caller's variable. Use the corresponding function parameter instead. ShellCheck: warning: policy_settings_dir is referenced but not assigned. [SC2154] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:29 +00:00
Dan Mihai	59a70a2b28	tests: k8s/tests_common: avoid masking return values Avoid masking command return values by declaring and only then assigning. ShellCheck: warning: Declare and assign separately to avoid masking return values. [SC2155] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:29 +00:00
Dan Mihai	b895e3b3e5	tests: k8s/tests_common.sh: add variable assignments Pick the values exported by other scripts. ShellCheck: warning: AUTO_GENERATE_POLICY is referenced but not assigned. [SC2154] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:29 +00:00
Dan Mihai	0f4de1c94a	tests: tests_common: remove useless assignment ShellCheck: warning: This assignment is only seen by the forked process. [SC2097] warning: This expansion will not see the mentioned assignment. [SC2098] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:29 +00:00
Dan Mihai	9c0d069ac7	tests: tests_common: prevent globbing and word splitting ShellCheck: note: Double quote to prevent globbing and word splitting. [SC2086] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:28 +00:00
Dan Mihai	15961b03f7	tests: k8s/tests_common.sh: -n instead of ! -z ShellCheck: note: Use -n instead of ! -z. [SC2236] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:28 +00:00
Dan Mihai	4589dc96ef	tests: k8s/tests_common.sh: add double quoting ShellCheck: note: Prefer double quoting even when variables don't contain special characters. [SC2248] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:28 +00:00
Dan Mihai	cc5f8d31d2	tests: k8s/tests_common.sh: add braces ShellCheck: add braces around variable references: note: Prefer putting braces around variable references even when not strictly required. [SC2250] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:28 +00:00
Dan Mihai	0d3f9fcee1	tests: tests_common: export variables used externally ShellCheck: export variables used outside of tests_common.sh - e.g., warning: timeout appears unused. Verify use (or export if used externally). [SC2034] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:28 +00:00
Dan Mihai	5df43ffc7c	tests: k8s/tests_common.sh: Prefer [[ ]] over [ ] Replace [ ] with [[ ]] as advised by shellcheck: note: Prefer [[ ]] over [ ] for tests in Bash/Ksh. [SC2292] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:28 +00:00
Dan Mihai	f79fabab24	Merge pull request #11024 from microsoft/danmihai1/empty-exec-output tests: k8s: retry "kubectl exec" on empty output	2025-03-20 11:03:08 -07:00
stevenhorsman	70d32afbb7	ci: Remove metrics tests from required list The metrics tests haven't been stable, or required through github for many weeks now, so update the required-tests.yaml list to re-sync Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-20 16:03:03 +00:00
stevenhorsman	607b27fd7f	ci: Update static-checks strings With the refactor in #10948 the names of the static checks have changed, so update these. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-20 13:45:57 +00:00
Mikko Ylinen	f52a565834	build: drop libtdx-attest With the latest CoCo guest-components, tdx-attester no longer depends on libtdx-attest. Stop installing it to the rootfs. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-03-20 10:45:30 +02:00
Steve Horsman	c0632f847f	Merge pull request #11043 from stevenhorsman/3.15.0-release release: Bump version to 3.15.0	2025-03-20 07:38:20 +00:00
Greg Kurz	e19b81225c	Merge pull request #11045 from kata-containers/sprt/fix-gha-tag security: ci: Pin third-party actions to commit hashes	2025-03-20 08:14:06 +01:00
Aurélien Bombo	a678046d13	gha: Pin third-party actions to commit hashes A popular third-party action has recently been compromised [1][2] and the attacker managed to point multiple git version tags to a malicious commit containing code to exfiltrate secrets. This PR follows GitHub's recommendation [3] to pin third-party actions to a full-length commit hash, to mitigate such attacks. Hopefully actionlint starts warning about this soon [4]. [1] https://www.cve.org/CVERecord?id=CVE-2025-30066 [2] https://www.stepsecurity.io/blog/harden-runner-detection-tj-actions-changed-files-action-is-compromised [3] https://docs.github.com/en/actions/security-for-github-actions/security-guides/security-hardening-for-github-actions#using-third-party-actions [4] https://github.com/rhysd/actionlint/pull/436 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-03-19 13:52:49 -05:00
stevenhorsman	fad248ef09	release: Bump version to 3.15.0 Bump VERSION and helm-chart versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-19 17:28:06 +00:00
Fabiano Fidêncio	a6e5d28a15	Merge pull request #11055 from stevenhorsman/bump-github.com/containerd/containerd/v1.7.27 runtime: Update github.com/containerd/containerd	2025-03-19 18:19:10 +01:00
stevenhorsman	cb7c599180	runtime: Switch from deprecated tracer `go.opentelemetry.io/otel/trace.NewNoopTracerProvider` is deprecated now, so switch to `go.opentelemetry.io/otel/trace/noop.NewTracerProvider` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-19 14:22:06 +00:00
stevenhorsman	8f22b07aba	runtime: Update github.com/containerd/containerd Update to 1.7.27 to resolve CVE-2024-40635 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-19 13:48:04 +00:00
Lukáš Doktor	d708866b2a	ci.ocp: shellcheck various fixes various manual fixes. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:26:28 +01:00
Lukáš Doktor	7e11489daf	ci: shellcheck - collection of fixes manual fixes of various issues. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:26:23 +01:00
Lukáš Doktor	f62e08998c	ci: shellcheck - remove unused argument the "-a" argument was introduced with this tool but never was actually used. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:26:19 +01:00
Lukáš Doktor	02deb1d782	ci: shellcheck SC2248 SC2248 (style): Prefer double quoting even when variables don't contain special characters, might result in arguments difference, shouldn't in our cases. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:26:16 +01:00
Lukáš Doktor	d80e7c7644	ci: shellcheck SC2155 SC2155 (warning): Declare and assign separately to avoid masking return values, should be harmless. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:26:12 +01:00
Lukáš Doktor	6552ac41e0	ci: shellcheck SC2086 SC2086 Double quote to prevent globbing and word splitting, might break places where we deliberately use word splitting, but we are not using it here. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:26:08 +01:00
Lukáš Doktor	154a4ddc00	ci: shellcheck SC2292 SC2292 (style): Prefer [[ ]] over [ ] for tests in Bash/Ksh. This might result in different handling of globs and some ops which we don't use. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:26:03 +01:00
Lukáš Doktor	667e26036c	ci: shellcheck SC2250 Treat the SC2250 require-variable-braces in CI. There are no functional changes. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:25:44 +01:00
Zvonko Kaiser	d37d9feee9	Merge pull request #11035 from kata-containers/sprt/fix-dependabot security: ci: Remove `replace` directives in go.mod files	2025-03-18 12:43:46 -04:00
Steve Horsman	ba5b0777b5	Merge pull request #11002 from fitzthum/bump-gc-0130 Bump Trustee and Guest Components for coco v0.13.0	2025-03-17 16:31:23 +00:00
RuoqingHe	36d2dee3a4	Merge pull request #11042 from RuoqingHe/runtime-rs-riscv runtime-rs: Support and enable build on riscv64	2025-03-17 21:42:15 +08:00
Ruoqing He	cb7508ffdc	ci: Enable runtime-rs component build-check on riscv64 `runtime-rs` is now buildable and testable on riscv64 platforms, enable `build-check` on `runtime-rs`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-17 17:38:59 +08:00
Steve Horsman	f308cbba93	Merge pull request #11015 from AdithyaKrishnan/main CI: Mark SNP as a Required test	2025-03-17 09:27:28 +00:00
Ruoqing He	084fb2d780	runtime-rs: Enable RISC-V build Define `riscv64gc-options.mk` to enable `runtime-rs` to be built on RISC-V platforms. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-17 17:22:48 +08:00
Ruoqing He	fd6c16e209	kata-sys-util: Set NoProtection for riscv64 `available_guest_protection` is required for `runtime-rs` to infer while building it on riscv64 platforms. Set it to `NoProtection` as riscv64 does not support guest protection for now. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-17 17:22:48 +08:00
Aurélien Bombo	26bd7989b3	csi-kata-directvolume: Remove `replace` in go.mod Running `go mod tidy` and `go mod vendor` after this resulted in no-ops. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-03-14 18:00:36 +00:00
Aurélien Bombo	b965fe8239	tests: Run `go mod vendor` `go mod tidy` was a no-op. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-03-14 18:00:36 +00:00
Aurélien Bombo	e9f88757ba	tests: Remove `replace` directives in go.mod Same rationale as for runtime. With tests, the blackfriday replacement was actually meaningful, so I refactored some imports. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-03-14 18:00:36 +00:00
Aurélien Bombo	35c92aa6ad	runtime: Run `go mod vendor` Regenerating go module files. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-03-14 18:00:36 +00:00
Aurélien Bombo	fa0f85e8b0	runtime: Run `go mod tidy` Tidying up go.mod. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-03-14 18:00:36 +00:00
Aurélien Bombo	c3a9c70d45	runtime: Remove `replace` directives in go.mod These replace directives aren't understood by dependabot, hence dependabot can claim to upgrade a dependency, while a replace directive still makes the dependency point to an old version. Fixes: #11020 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-03-14 18:00:36 +00:00
Adithya Krishnan Kannan	32dbee8d7e	CI: Mark SNP as a Required test The SNP CI has been consistently passing and we request the @kata-containers/architecture-committee to mark this test as a required test. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>	2025-03-14 12:48:55 -05:00
Dan Mihai	dab981b0bc	tests: k8s: retry "kubectl exec" on empty output Retry "kubectl exec" a few times if it unexpectedly produced an empty output string. This is an attempt to work around test failures similar to: https://github.com/kata-containers/kata-containers/actions/runs/13840930994/job/38730153687?pr=10983 not ok 1 Environment variables (from function `grep_pod_exec_output' in file tests_common.sh, line 394, in test file k8s-env.bats, line 36) `grep_pod_exec_output "${pod_name}" "HOST_IP=$[0-9]\+\(\.\\|$$\)\{4\}" "${exec_command[@]}"' failed That test obtained correct output from "sh -c printenv" one time, but the second execution of the same command returned an empty output string. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-14 17:03:03 +00:00
Tobin Feldman-Fitzthum	b7786fbcf0	agent: update image-rs for coco v0.13.0 image-rs has gotten a number of significant updates, eliminating corner cases with obscure containers, improving support for local certs, and more. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-03-14 10:44:10 -05:00
Tobin Feldman-Fitzthum	63ec1609bc	versions: update guest-components for coco v0.13.0 Update to the latest hash of guest-components. This will pick up some nice new features including using ec key for the rcar handshake. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-03-14 10:44:10 -05:00
Tobin Feldman-Fitzthum	c352905998	versions: bump trustee for coco v0.13.0 Update to new hashes for Trustee. The MSRV for Trustee is now 1.80.0 so bump the rust toolchain as well. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-03-14 10:44:04 -05:00
Steve Horsman	7968a3c09d	Merge pull request #11028 from Amulyam24/hooks gha: use runner hooks instead of pre/post scripts for ppc64le runners	2025-03-14 15:43:27 +00:00
stevenhorsman	1022d8d260	metrics: Update range for clh tests In `ef0e8669fb` we had been seeing some significantly lower minvalues in the jitter.Result test, so I lowered the mid-value rather than having a very high minpercent, but it appears that the variability of this result is very high, so we are still getting the occasional high value, so reset the midval and just have a bigger ranges on both sides, to try and keep the test stable. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-14 14:54:30 +00:00
stevenhorsman	d77008b817	metrics: Further reduce repeats for boot time tests on qemu I've seen failures on the third run, so reduce it further to just run twice on qemu Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-14 14:53:26 +00:00
stevenhorsman	97151cce4e	metrics: Improve iperf timeout The kubectl wait has a built in timeout of 30s, so wrapping it in waitForProcess, means we have 180/2 * 30 delay, which is much longer than intended, so just set the timeout directly. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-14 14:53:26 +00:00
Amulyam24	becb760e32	gha: use runner hooks instead of pre/post scripts for ppc64le runners This PR makes changes to remove steps to run scripts for preparing and cleaning the runner and instead use runner hooks env variables to manage them. Fixes: #9934 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2025-03-14 17:12:54 +05:30
RuoqingHe	af4058fa82	Merge pull request #10889 from katexochen/p/config-idblock-qemu runtime: make SNP IDBlock configurable	2025-03-14 16:23:05 +08:00
Paul Meyer	a994f142d0	runtime: make SNP IDBlock configurable For a use case, we want to set the SNP IDBlock, which allows configuring the AMD ASP to enforce parameters like expected launch digest at launch. The struct with the config that should be enforced (IDBlock) is signed. The public key is placed in the auth block and the signature is verified by the ASP before launch. The digest of the public key is also part of the attestation report (ID_KEY_DIGESTS). Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-03-14 07:50:54 +01:00
RuoqingHe	810a6dafad	Merge pull request #10939 from mchtech/fix-unbound-var tools: initialize unbound variables in rootfs.sh	2025-03-14 08:22:05 +08:00
Saul Paredes	b7087eb0ea	Merge pull request #10983 from microsoft/cameronbaird/updateinterfacerequest-hardening-upstream genpolicy: Introduce UpdateInterfaceRequest rules in genpolicy-settings	2025-03-13 16:12:03 -07:00
Dan Mihai	b910daf625	Merge pull request #11012 from microsoft/saulparedes/validate_generated_name_upstr policy: validate pod generated name	2025-03-13 14:09:57 -07:00
Steve Horsman	199b16f053	Merge pull request #11022 from microsoft/danmihai1/polist-test-volume-path tests: k8s-policy-pod: safer host path volume source	2025-03-13 20:26:06 +00:00
Dan Mihai	0e26dd4ce8	tests: k8s-policy-pod: safer host path volume source Test using the host path /tmp/k8s-policy-pod-test instead of /var/lib/kubelet/pods. /var/lib/kubelet/pods might happen to contain files that CopyFileRequest would try to send to the Guest before CreateContainerRequest. Such CopyFileRequest was an unintended side effect of this test. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-13 18:56:57 +00:00
Cameron Baird	bceffd5ff6	genpolicy: Introduce UpdateInterfaceRequest rules in genpolicy-settings Introduce rules for UpdateInterfaceRequest and genpolicy tests for them. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-03-13 17:30:01 +00:00
Saul Paredes	1c406e9c1d	Merge pull request #11004 from microsoft/cameronbaird/updateroutesrequest-hardening-upstream genpolicy: Introduce UpdateRoutesRequest rules in genpolicy-settings	2025-03-13 10:11:39 -07:00
Saul Paredes	7a5db51c80	policy: validate pod generated name Validate sandbox name using a regex. If the YAML specifies metadata.name, use a regex that exact matches. If the YAML specifies metadata.generateName, use a regex that matches the prefix of the generated name. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-03-13 09:49:57 -07:00
Steve Horsman	e6a78e64e6	Merge pull request #10967 from stevenhorsman/coco-tests-required ci: Add coco required tests	2025-03-13 15:10:22 +00:00
mchtech	0e61eb215d	tools: initialize unbound variables in rootfs.sh Initialize unbound variables in rootfs.sh for RHEL series OS. Signed-off-by: mchtech <michu_an@126.com>	2025-03-13 22:57:43 +08:00
Fupan Li	592d58ca52	Merge pull request #11001 from RuoqingHe/enable-riscv-kernel-build kernel: Support and enable riscv kernel build	2025-03-13 19:28:00 +08:00
Ruoqing He	e0fb8f08d8	ci: Add riscv-builder to actionlint.yaml We have three SG2042 connected and labeled as `riscv-builder`, add that entry to `actionlint.yaml` to help linting while setting up workflows. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Ruoqing He	a7e953c7a7	ci: Enable static-tarball build for riscv64 Enable `kernel` and `virtiofsd` static-tarball build for riscv64. Since `virtiofsd` was previously supported and `kernel` is supported now. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Ruoqing He	3c8a8ca9c2	kernel: Enable riscv kernel build Modify `build-kernel.sh` to enable building of riscv64 kernel. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Ruoqing He	e316f633d8	kernel: Bump kata_config_version Bump kata_config_version since riscv kernel build is introduced. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Ruoqing He	31446b8be8	kernel: Skip ACPI common fragment for riscv ACPI is not yet ratified and is still frequently evolving, disable acpi.conf for riscv architecture. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Ruoqing He	ebd1214b2e	kernel: Introduce riscv mmu fragment conf Memory hotplug and related features are required, enable them in `mmu.conf`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Ruoqing He	734f5d07a9	kernel: Introduce riscv pci fragment conf AIA (Advanced Interrupt Architecture) is available and enabled by default after v6.10 kernel, provide pci.conf to make proper use of IMSIC of AIA. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Ruoqing He	19d78ca844	kernel: Introduce riscv base fragment conf Create `riscv` folder for riscv64 architecture to be inferred while constructing kernel configuration, and introduce `base.conf` which builds 64-bit kernel and with KVM built-in to kernel. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Cameron Baird	cf129f3744	genpolicy: Introduce UpdateRoutesRequest rules in genpolicy-settings Introduce rule to block routes from source addresses which are the loopback. Block routes added to the lo device. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-03-12 19:03:57 +00:00
Dan Mihai	71d4ad5fca	Merge pull request #11003 from microsoft/mahuber/grpc-1-58-3 runtime: upgrade grpc vendor dependency	2025-03-12 09:23:07 -07:00
Wainer Moschetta	8c2d1b374c	Merge pull request #10892 from ldoktor/webhook ci: Change the way we modify runtimeclass in webhook	2025-03-12 12:32:45 -03:00
RuoqingHe	386fed342c	Merge pull request #10990 from kata-containers/shell-check-vendor-skip workflows: shellcheck: Expand vendor ignore	2025-03-12 21:34:26 +08:00
Alex Lyn	fdc0d81198	Merge pull request #10994 from teawater/swap7 runtime-rs: Add guest swap support	2025-03-12 17:59:00 +08:00
Hui Zhu	796eab3bef	runtime-rs: Update swap option of configuration file Remove swap configuration from qemu config file because runtime-rs qemu support code doesn't support hotplug block device. Add swap configuration to dragonball and cloud-hypervisor config file. Fixes: #10988 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-03-12 13:51:35 +08:00
Dan Mihai	4f41989a6a	Merge pull request #11009 from mythi/e2e-skip-flaky-tests tests: k8s: skip trusted storage tests for qemu-tdx	2025-03-11 12:13:35 -07:00
Dan Mihai	e40251d9f8	Merge pull request #11006 from ryansavino/fix-confidential-ssh-dockerfile tests: fix confidential ssh Dockerfile	2025-03-11 11:22:23 -07:00
Aurélien Bombo	33f3a8cf5f	Merge pull request #10973 from microsoft/danmihai1/main ci: temporarily avoid using the Mariner Host image	2025-03-11 10:24:00 -05:00
Steve Horsman	420b282279	Merge pull request #10948 from RuoqingHe/better-matrix ci: Refactor matrix for `build-checks`	2025-03-11 14:13:10 +00:00
Mikko Ylinen	71531a82f4	tests: k8s: skip trusted storage tests for qemu-tdx follow other TEEs to skip trusted storage tests due to #10838. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-03-11 15:14:03 +02:00
Hui Zhu	93cd30862d	libs: Add AddSwapPath to service AgentService AddSwap sends the PCI path to the guest kernel to let it add a swap device. But some MMIO devices don't have a PCI path. To support them, add AddSwapPath, which sends virt_path to the guest kernel as the swap device. Fixes: #10988 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-03-11 16:02:48 +08:00
Hui Zhu	7787340ab6	runtime-rs: Add guest swap support This commit adds guest swap support. When configuration enable_guest_swap is enabled, runtime-rs will start a swap task. When the VM starts or updates the guest memory, the swap task will be woken up to create and insert a swap file. Before this job, the swap task will sleep for some seconds (set by configuration guest_swap_create_threshold_secs) to reduce the impact on guest kernel boot performance and prevent the insertion of multiple swap files due to frequent memory elasticity within a short period. The size of the swap file is set by configuration guest_swap_size_percent, the percentage of the total memory to be used as swap device. Fixes: #10988 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-03-11 16:02:31 +08:00
Hui Zhu	4cd9d70c4d	runtime-rs: Add is_direct to struct BlockConfig Add is_direct to struct BlockConfig. This option specifies cache-related options for block devices. Denotes whether use of O_DIRECT (bypass the host page cache) is enabled. If not set, use configuration block_device_cache_direct. Fixes: #10988 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-03-11 15:44:40 +08:00
Ryan Savino	1dbe3fb8bc	tests: fix confidential ssh Dockerfile Need to set correct permissions for ssh directories and files Fixes: #11005 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-03-10 18:31:05 -05:00
Dan Mihai	e8405590c1	ci: temporarily avoid using the Mariner Host image Disable the Mariner host during CI, while investigating test failures with new Cloud Hypervisor v43.0. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-10 20:15:09 +00:00
Steve Horsman	730e007abd	Merge pull request #11000 from microsoft/danmihai1/print-exec-output2 tests: k8s: log kubectl exec output	2025-03-10 09:31:41 +00:00
Fupan Li	df9c6ae9d7	Merge pull request #10998 from teawater/ma_config runtime-rs: Add mem-agent config to clh and qemu config file	2025-03-10 16:23:20 +08:00
Dan Mihai	509e6da965	tests: k8s-env.bats: log exec output Log the "kubectl exec" output, just in case it helps investigate sporadic test errors like: https://github.com/kata-containers/kata-containers/actions/runs/13724022494/job/38387329321?pr=10973 not ok 1 Environment variables (in test file k8s-env.bats, line 37) `grep "HOST_IP=$[0-9]\+\(\.\\|$$\)\{4\}"' failed It appears that the first exec from this test case produced the expected output: MY_POD_NAME=test-env but the second exec produced something else - that will be logged after this change. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-07 19:37:20 +00:00
Dan Mihai	95d47e4d05	tests: k8s-configmap.bats: log exec output Log the "kubectl exec" output, just in case it helps investigate sporadic test errors like: https://github.com/kata-containers/kata-containers/actions/runs/13724022494/job/38387329268?pr=10973 not ok 1 ConfigMap for a pod (in test file k8s-configmap.bats, line 44) `kubectl exec $pod_name -- "${exec_command[@]}" \| grep "KUBE_CONFIG_2=value-2"' failed It appears that the first exec from this test case produced the expected output: KUBE_CONFIG_1=value-1 but the second exec produced something else - that will be logged after this change. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-07 19:35:45 +00:00
Dan Mihai	caee12c796	tests: k8s: add function to log exec output grep_pod_exec_output invokes "kubectl exec", logs its output, and checks that a grep pattern is present in the output. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-07 19:34:57 +00:00
Steve Horsman	014ff8476a	Merge pull request #10992 from microsoft/danmihai1/git-helper gha: always delete workspace on rebase error	2025-03-07 14:26:00 +00:00
Steve Horsman	cb682ef3c8	Merge pull request #10987 from RuoqingHe/enable-docker-on-riscv kata-deploy: Use docker.io for all architectures	2025-03-07 11:14:19 +00:00
Xuewei Niu	0671252466	Merge pull request #10760 from lifupan/route_flags_suport	2025-03-07 18:18:01 +08:00
Hui Zhu	691430ca95	runtime-rs: Add mem-agent config to clh and qemu config file Add mem-agent config to clh and qemu config file. Fixes: #10996 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-03-07 15:54:59 +08:00
Fupan Li	9a4c0a5c5c	agent: add the route flags support when adding routes Get the route entry's flags passed from host and set it in the add route request. Fixes: #7934 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-07 09:56:08 +08:00
Fupan Li	d929bc0224	agent: refactor the code of update routes/interfaces We can use the netlink update method to add a route or an interface address. There is no need to delete it first and then add it. This can save two system commissions. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-07 09:56:08 +08:00
Fupan Li	aad915a7a1	agent: upgrade the netlink related crates Upgrade rtnetlink and related crates to support route flags. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-07 09:56:08 +08:00
Fupan Li	0995c6528e	runtime-rs: add the route flags support Get the route entry's flags from the host and pass it into kata-agent to add route entries with flags support. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-07 09:56:08 +08:00
Fupan Li	cda6d0e36c	runtime-rs: upgrade the netlink related crates Upgrade netlink-packet-route and rtnetlink to support route flags. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-07 09:56:08 +08:00
Fupan Li	1ade2a874f	runtime: add the flags support to the route setting We should support the flags when adding a route from host to guest. Otherwise, some routes would fail to be set. Fixes: #7934 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-07 09:56:08 +08:00
Dan Mihai	7b63f256e5	gha: fix git-helper issues reported by shellcheck ./tests/git-helper.sh:20:5: note: Prefer [[ ]] over [ ] for tests in Bash/Ksh. [SC2292] ./tests/git-helper.sh:22:26: note: Double quote to prevent globbing and word splitting. [SC2086] ./tests/git-helper.sh:23:7: note: Prefer [[ ]] over [ ] for tests in Bash/Ksh. [SC2292] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-06 20:28:41 +00:00
Dan Mihai	04adcdace6	gha: always delete workspace on rebase error The workplace was already being deleted on non-x86_64 platforms, but x86_64 can be affected by the same problem too. That might have been the case with the SNP and TDX test runs from: https://github.com/kata-containers/kata-containers/actions/runs/13687511270/job/38313758751?pr=10973 https://github.com/kata-containers/kata-containers/actions/runs/13687511270/job/38313760086?pr=10973 Rebase worked fine for the same patch/PR on other platforms. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-06 20:24:09 +00:00
Ruoqing He	3a8131349e	kata-deploy: Use docker.io for all architectures Switch to `docker.io` provided by Ubuntu sources. It is not necessary for us to install docker through `get-docker.sh`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-07 02:22:31 +08:00
RuoqingHe	8ef8109b2f	Merge pull request #10985 from RuoqingHe/remove-s390x-conditional-compilation runtime-rs: Remove s390x conditional compilation	2025-03-06 23:13:11 +08:00
Pavel Mores	133528a63c	runtime-rs: remove snp_certs_path support SNP certs were apparently obsoleted by AMD. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-03-06 15:53:24 +01:00
stevenhorsman	a40d5d3daa	ci: Add arm64 K8s tests as required This is based on the request from @fidencio, who is one of the maintainers	2025-03-06 14:39:04 +00:00
stevenhorsman	f45b398170	ci: Add coco required tests Add the zvsi and nontee coco tests to the required jobs list Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-06 14:38:52 +00:00
stevenhorsman	ee0f0b7bfe	workflows: shellcheck: Expand vendor ignore - In the previous PR I only skipped the runtime/vendor directory, but errors are showing up in other vendor packages, so try a wildcard skip - Also update the job step so we can distinguish between the required and non-required versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-06 14:35:12 +00:00
Manuel Huber	c05b976ebe	runtime: upgrade grpc vendor dependency - remove hard link to v.1.47.0 in go.mod - run go mod tidy, go mod vendor to actually update to v1.58.3 - addresses CVE-2023-44487 Signed-off-by: Manuel Huber <mahuber@microsoft.com>	2025-03-06 10:00:49 +00:00
Xuewei Niu	644af52968	Merge pull request #10876 from lifupan/fupan_containerd ci: cri-containerd: upgrade the LTS / Active versions for containerd	2025-03-06 17:08:40 +08:00
Hyounggyu Choi	bf41618a84	Merge pull request #10862 from BbolroC/enable-ibm-se-for-qemu-runtime-rs runtime-rs: Enable IBM SE for QEMU	2025-03-06 05:38:13 +01:00
Ruoqing He	ed6f57f8f6	runtime-rs: Restrict cloud-hypervisor feature Cloud-Hypervisor currently only supports `x86_64` and `aarch64`, so this feature should not be available even if other architectures explicitly require it. Restrict `cloud-hypervisor` feature to only `x86_64` and `aarch64`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-06 11:21:57 +08:00
Ruoqing He	6f894450fe	runtime-rs: Drop s390x target predicates Drop `target_arch = "s390x"` all over `runtime-rs`, it is strange to have such predicates on features and code while we do not support it. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-06 11:20:28 +08:00
Xuewei Niu	a54eed6bab	Merge pull request #10975 from teawater/fix_log_level runtime-rs: Fix log_level's comments in configuration-dragonball.toml.in	2025-03-06 10:05:09 +08:00
Alex Lyn	2619b57411	Merge pull request #10937 from Apokleos/bugfix-useless-annotation kata-types: Fix bugs related to annotations in kata-types	2025-03-06 09:37:29 +08:00
Hyounggyu Choi	c3e3ef7b25	Merge pull request #10981 from BbolroC/remove-sclp-console-s390x runtime: Remove console=ttysclp0 for s390x	2025-03-05 21:43:57 +01:00
Fabiano Fidêncio	80e95bd264	Merge pull request #10966 from kata-containers/topic/tests-bring-back-kata-deploy-tests tests: Bring back kata-deploy tests	2025-03-05 21:11:21 +01:00
Zvonko Kaiser	ae63bbb824	Merge pull request #10982 from zvonkok/fix-zvonkos-fix agent: fix permission according to runc	2025-03-05 15:08:48 -05:00
Fabiano Fidêncio	545780a83a	shellcheck: tests: k8s: Fix gha-run.sh warnings As we'll touch this file during this series, let's already make sure we solve all the needed warnings. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-05 19:44:27 +01:00
Fabiano Fidêncio	50f765b19c	shellcheck: tests: Fix gha-run-k8s-common.sh warnings Let's fix all the warnings caught in this file, as we're already touching it. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-05 19:44:27 +01:00
Fabiano Fidêncio	219db60071	tests: kata-deploy: microk8s: Re-work installation So we can ensure that the user has enough permissions to access microk8s. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-05 19:44:27 +01:00
Fabiano Fidêncio	c337a21a4e	shellcheck: kata-deploy: Fix warnings Here we're fixing the few warnings we found in the files present in the functional tests for kata-deploy. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-05 19:44:27 +01:00
Fabiano Fidêncio	fd832d0feb	tests: kata-deploy: Run installation with only one VMM It doesn't make much sense to test different VMMs as that wouldn't trigger a different code path. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-05 19:44:27 +01:00
Fabiano Fidêncio	14bf653c35	tests: kata-deploy: Re-add tests, now using github runners As GitHub runners now support nested virt, we don't depend on garm for those anymore. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-05 19:44:27 +01:00
Zvonko Kaiser	3cea080185	agent: fix permission according to runc The previous PR mistakenly set all perms to 0o666; we should follow what runc does and fetch the permission from the guest aka host if the file_mode == 0. If we do not find the device on the guest aka host, fall back to 0. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-03-05 17:33:40 +00:00
Fupan Li	7024d3c600	CI: cri-containerd: upgrade the LTS / Active versions for containerd As we're testing against the LTS and the Active versions of containers, let's upgrade the lts version from 1.6 to 1.7 and active version from 1.7 to 2.0 to cover the sandboxapi tests. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-05 23:09:24 +08:00
Hyounggyu Choi	624f7bfe0b	runtime: Remove console=ttysclp0 for s390x After the introduction of the following kernel parameters (see #6163): ``` CONFIG_SCLP_VT220_TTY=y CONFIG_SCLP_VT220_CONSOLE=y ``` the system log for Kata components (e.g., the agent) no longer appeared on the SCLP console (i.e., /dev/ttysclp0). Let's switch to the default fallback console (likely /dev/console) for logging. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-05 15:06:08 +01:00
Zvonko Kaiser	a5629f9bfa	Merge pull request #10971 from zvonkok/host-guest-mapping agent: Enable VFIO and initContainers	2025-03-05 08:58:45 -05:00
Fabiano Fidêncio	504d9e2b66	Merge pull request #10976 from zvonkok/fix-dev-permissions agent: Fix default linux device permissions	2025-03-05 13:54:06 +01:00
Hyounggyu Choi	4ea7d274c4	runtime-rs: Add new runtimeClass qemu-se-runtime-rs When `KATA_HYPERVISOR` is set to `qemu-se-runtime-rs`, a configuration file is properly referenced and a runtime class should be created via kata-deploy. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-05 13:50:38 +01:00
Hyounggyu Choi	2c72cf5891	runtime-rs: Add SE configuration A configuration file, `configuration-qemu-se-runtime-rs.toml`, is referenced when the `qemu-se-runtime-rs` runtime is configured. This commit adds a template file and updates the Makefile configuration accordingly. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-05 13:50:38 +01:00
Hyounggyu Choi	65021caca6	Merge pull request #10963 from RuoqingHe/remove-arch-predicates-in-runtime-rs runtime-rs: Enable Dragonball only for x86_64 & aarch64	2025-03-05 09:10:33 +01:00
Zvonko Kaiser	c73ff7518e	agent: Fix default linux device permissions We had the default permissions set to 0o000 if the file_mode was not present; for most container devices this is the wrong default. Since those devices are also meant to be accessed by users and others, add a sane default of 0o666 to devices that do not have any permissions set. Otherwise only root can access those and we cannot run containers as a user. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-03-05 02:22:24 +00:00
Ruoqing He	186c88b1d5	ci: Move musl-tools installation into Setup rust `musl-tools` is only needed when a component needs `rust`, and the `instance` running is of `x86_64` or `aarch64`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-05 09:43:19 +08:00
Zvonko Kaiser	4bb0eb4590	Merge pull request #10954 from kata-containers/topic/metrics-kata-deploy Rework and fix metrics issues	2025-03-04 20:22:53 -05:00
Hui Zhu	c3c3f23b33	runtime-rs: Fix log_level's comments in configuration-dragonball.toml.in Add double quotes to fix log_level's comments in configuration-dragonball.toml.in. Fixes: #10974 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-03-05 09:21:08 +08:00
Dan Mihai	edf6af2a43	Merge pull request #10955 from microsoft/cameronbaird/hyp-loglevel-default-upstream runtime: Properly set default hyp loglevel to 1	2025-03-04 16:44:08 -08:00
Cameron Baird	d48116114e	runtime: Properly set default hyp loglevel to 1 Tweak default HypervisorLoglevel config option for clh to 1. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-03-04 20:36:40 +00:00
Zvonko Kaiser	248d04c20c	agent: Enable VFIO and initContainers We had a static mapping of host-guest PCI addresses, which prevented using VFIO devices in initContainers. We're now tracking the host-guest mapping per container and removing this mapping if a container is removed. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-03-04 19:53:52 +00:00
Fabiano Fidêncio	874129a11f	Merge pull request #10958 from stevenhorsman/shell-check-errors-fix Shell check errors fix	2025-03-04 17:37:36 +01:00
stevenhorsman	02a2f6a9c1	tests: Sanitize `K8S_TEST_ENTRY` Now we've added the double quotes around `${K8S_TEST_UNION[@]}`, so platforms are failing with: ``` Error: Test file "/home/ubuntu/runner/_layout/_work/kata-containers/kata-containers/tests/integration/kubernetes/k8s-nginx-connectivity.bats " does not exist ``` due to the line continuation, so sanitise the value to try and fix this. Co-authored-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	e33ad56cf4	kernel: bump kata_config_version Bump kernel version as the build-kernel script was updated (even if there was no functional change). Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	2df3e5937a	ci/openshift-ci: Fix script error The space was missing before `]`, so fix this and also switch to double square brackets and variable braces Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	9a9e88a38d	test: vfio: Attempt to fix logic This was checking that a literal string was non-zero. I assume it instead wanted to check if the file exists Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	b220cca253	shellcheck: Fix shellcheck SC2066 > Since you double-quoted this, it will not word split, and the loop will only run once. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	b8cfdd06fb	shellcheck: Fix shellcheck SC2071 > > is for string comparisons. Use -gt instead. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	eb90b93e3f	shellcheck: Fix shellcheck SC2104 > In functions, use return instead of break. > rationale: break or continue are used to abort or continue a loop, and are not the right way to exit a function. Use return instead. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	67bfd4793e	shellcheck: Fix shellcheck SC2242 > Can only exit with status 0-255. Other data should be written to stdout/stderr. Switch exit -1 to exit 1 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:01 +00:00
stevenhorsman	ed8347c868	shellcheck: Fix shellcheck SC2070 > -n doesn't work with unquoted arguments. Quote or use [[ ]] Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
stevenhorsman	dbba6b056b	shellcheck: Fix shellcheck SC2148 > Tips depend on target shell and yours is unknown. Add a shebang. Add ``` #!/usr/bin/env bash ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
stevenhorsman	c5ff513e0b	shellcheck: Fix shellcheck SC2068 > Double quote array expansions to avoid re-splitting elements Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
stevenhorsman	58672068ff	shellcheck: Fix shellcheck SC2145 > Argument mixes string and array. Use * or separate argument. - Swap echos for printfs and improve formatting - Replace $@ with $* - Split arrays into separate arguments Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
stevenhorsman	bc2d7d9e1e	osbuilder: Skip shellcheck on test_images.sh I'm not sure if we use test_images anywhere, so before we invest the time to fix the 120 shellcheck errors and warnings we should decide if we want to keep it. See #10957 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
stevenhorsman	fb1d4b571f	workflows: Add required shellcheck workflow Start with a required smaller set of shellchecks to try and prevent regressions whilst we fix the current problems Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
stevenhorsman	b3972df3ca	workflows: Shellcheck - ignore vendor Ignore the vendor directories in our shellcheck workflow as we can't fix them. If there is a way to set this in shellcheckrc that would be better, but it doesn't seem to be implemented yet. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
Zvonko Kaiser	4df406f03c	Merge pull request #10965 from zvonkok/fix-init gpu: fix init symlinks	2025-03-03 14:46:41 -05:00
Zvonko Kaiser	eb2f75ee61	gpu: fix init symlinks With the recent changes we need to make sure NVRC is symlinked for init and sbin/init Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-03-03 17:21:59 +00:00
Greg Kurz	545022f295	Merge pull request #10817 from Jakob-Naucke/virtio-net-ccw Fix virtio-net-ccw	2025-03-03 17:37:46 +01:00
Hyounggyu Choi	e8aa5a5ab7	runtime-rs: Enable virtio-net-ccw for s390x When using `virtio-net-pci` for IBM SE, the following error occurs: ``` update interface: Link not found (Address: f2:21:48:25:f4:10) ``` On s390x, it is more appropriate to use the CCW type of virtio network device. This commit ensures that a subchannel is configured accordingly. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-03 16:34:03 +01:00
Hyounggyu Choi	59c1f0b59b	runtime-rs: Suppress kernel parameters for IBM SE For IBM SE, the following kernel parameters are not required: - Basic parameters (reboot and systemd-related) - Rootfs parameters This commit suppresses these parameters when IBM SE is configured. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-03 16:34:03 +01:00
Hyounggyu Choi	4c8e881a84	runtime-rs: Enable IBM SE support for QEMU This commit configures the command line for IBM Secure Execution (SE) and other TEEs. The following changes are made: - Add a new item `Se` to ProtectionDeviceConfig and handle it at sandbox - Introduce `add_se_protection_device()` for SE cmdline config - Bypass rootfs image/initrd validity checks when SE is configured. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-03 16:32:18 +01:00
Ruoqing He	2ecb2fe519	runtime-rs: Enable Dragonball for x86_64 & aarch64 `USE_BUILDIN_DB` is turned on by default for architectures that do not support `Dragonball`, which leads to `s390x` building `runtime-rs` with `--features dragonball` present. Let's restrict `USE_BUILDIN_DB` to be enabled only for architectures supported by `Dragonball` (namely x86_64 and aarch64 as of now). Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-03 12:10:58 +08:00
stevenhorsman	c69509be1c	metrics: Reduce repeats for boot time tests on qemu On qemu the run seems to error after ~4-7 runs, so try a cut down version of repetitions to see if this helps us get results in a stable way. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-02 08:42:00 +00:00
stevenhorsman	0962cd95bc	metrics: Increase minpercent range for qemu iperf test We have a new metrics machine and environment and the iperf jitter result failed as it finished too quickly, so increase the minpercent to try and get it stable Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-02 08:32:26 +00:00
stevenhorsman	ef0e8669fb	metrics: Increase minpercent range for clh tests We have a new metrics machine and environment and the fio write.bw and iperf3 parallel.Results tests failed for clh, as below the minimum range, so increase the minpercent to try and get it stable Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-02 08:32:26 +00:00
stevenhorsman	f81c85e73d	metrics: Increase maxpercent range for clh boot times We have a new metrics machine and environment and the boot time test failed for clh, so increase the maxpercent to try and get it stable Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	435ee86fdd	metrics: Update iperf affinity The iperf deployment is quite a lot out of date and uses `master` for its affinity and toleration, so update this to control-plane, so it can run on newer Kubernetes clusters Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	85bbc0e969	metrics: Increase wait time The new metrics runner seems slower, so the iperf3 tests are failing with: ``` pod rejected: RuntimeClass "kata" not found ``` so give more time for it to succeed Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	4ce94c2d1b	Revert "metrics: Add init_env function to latency test" This reverts commit `9ac29b8d38`. to remove the duplicate `init_env` call Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	658a5e032b	metrics: Increase containerd start timeout - Move `kill_kata_components` from common.bash into the metrics code base as the only user of it - Increase the timeout on the start of containerd as the last 10 nightlies metric tests have failed with: ``` 223478 Killed sudo timeout -s SIGKILL "${TIMEOUT}" systemctl start containerd ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	3fab7944a3	workflows: Improve metrics jobs - As the metrics tests are largely independent then allow subsequent tests to run even if previous ones failed. The results might not be perfect if clean-up is required, but we can work on that later. - Move the test results check out of the latency test that seems arbitrary and into its own job step - Add timeouts to steps that might fail/hang if there are containerd/K8s issues Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	6f918d71f5	workflows: Update metrics jobs Currently the run-metrics job runs a manual install and does this in a separate job before the metrics tests run. This doesn't make sense as if we have multiple CI runs in parallel (like we often do), there is a high chance that the setup for another PR runs between the metrics setup and the runs, meaning it's not testing the correct version of code. We want to remove this from happening, so install (and delete to cleanup) kata as part of the metrics test jobs. Also switch to kata-deploy rather than manual install for simplicity and in order to test what we recommend to users. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
Zvonko Kaiser	3f13023f5f	Merge pull request #10870 from zvonkok/module-signing gpu: add module signing	2025-03-01 09:51:24 -05:00
Zvonko Kaiser	d971e13446	gpu: Update rootfs.sh Only source NV scripts if variant starts with "nvidia-gpu" Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-03-01 02:08:29 +00:00
Fabiano Fidêncio	4018079b55	Merge pull request #10960 from fidencio/topic/kata-deploy-fix-k0s-deployment kata-deploy: k0s: Fix drop-in path	2025-02-28 18:49:46 +01:00
Zvonko Kaiser	94579517d4	shellcheck: Update nvidia_rootfs.sh With the new rules we need more updates. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 16:36:05 +00:00
Zvonko Kaiser	af1d6c2407	shellcheck: Update nvidia_chroot.sh Make shellcheck happy; with the new rules, more updates are needed. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 16:27:51 +00:00
Fabiano Fidêncio	c95f9885ea	kata-deploy: k0s: Fix drop-in path The drop-in path should be /etc/containerd (from the containers' perspective), which mounts to the host path /etc/k0s/containerd.d. With what we had we ended up dropping the file under the /etc/k0s/containerd.d/containerd.d/, which is wrong. This is a regression introduced by: `94b3348d3c` Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-28 16:32:00 +01:00
Zvonko Kaiser	c4e4e14b32	kernel: bump kata_config_version Mandatory update to have a unique kernel version name Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 15:18:15 +00:00
Fabiano Fidêncio	d13be49f9b	Merge pull request #10846 from stalb/feature/microk8s-support kata-deploy: Update kata-deploy to support microk8s	2025-02-28 13:57:44 +01:00
Stephane Talbot	f80e7370d5	test: Verify deployment of kata-deploy on microk8s Enable functional test to verify deployment of kata-deploy on a Microk8s cluster Signed-off-by: Stephane Talbot <Stephane.Talbot@univ-savoie.fr>	2025-02-28 10:10:29 +01:00
Stéphane Talbot	f2ba224e6c	kata-deploy: Update kata-deploy to support microk8s Change kata-deploy script and Helm chart in order to be able to use kata-deploy on a microk8s cluster deployed with snap. Fixes: #10830 Signed-off-by: Stephane Talbot <Stephane.Talbot@univ-savoie.fr>	2025-02-28 10:10:29 +01:00
Ruoqing He	09030ee96e	ci: Refactor build-checks workflow Refactor matrix setup and the corresponding dependency installation logic in `build-checks.yaml` and `build-checks-preview-riscv64.yaml` to provide better readability and maintainability. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-28 09:47:25 +08:00
Ruoqing He	eb94700590	ci: Drop install-libseccomp matrix variant `install-libseccomp` is applied only for `agent` component, and we are already combining matrix with `if`s in steps, drop `install-libseccomp` in matrix to reduce complexity. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-28 09:44:53 +08:00
Zvonko Kaiser	4dadd07699	gpu: Update rootfs.sh Pass-through KBUILD_SIGN_PIN to the rootfs build Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	5ab3192c51	gpu: Update nvidia_rootfs.sh We need to handle KBUILD_SIGN_PIN so that the kbuild can decrypt the signing key Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	493ba63c77	gpu: Provide KBUILD_SIGN_PIN to the build.sh At the proper step pass-through the var KBUILD_SIGN_PIN so that the kernel_headers step has the PIN for encrypting the signing key. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	0309b70522	gpu: Pass-through KBUILD_SIGN_PIN In kata-deploy-binaries.sh we need to pass-through the var KBUILD_SIGN_PIN to the other static builder scripts. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	9602ba6ccc	gpu: Add proper KBUILD_SIGN_PIN to entry script Update kata-deploy-binaries-in-docker.sh to read the env variable KBUILD_SIGN_PIN that either can be set via GHA or other means. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	39d3b7fb90	gpu: Update NVIDIA chroot script We need to place the signing key and cert at the right place and hide the KBUILD_SIGN_PIN from echo'ing or xtrace Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	d815fb6f46	gpu: Update kernel-headers Use the kernel-headers as the extra_tarball to move the encrypted key and cert from stage to stage Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	c2cb89532b	gpu: Add the proper handling in build-kernel.sh If KBUILD_SIGN_PIN is provided we can encrypt the signing key for out-of-tree builds and second round jobs in GHA Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	bc8360e8a9	gpu: Add proper config for module signing We want to enable module signing in Kata and Coco Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:34 +00:00
Zvonko Kaiser	f485e52f75	Merge pull request #10953 from zvonkok/shellcheckrc ci: Add shellcheckrc	2025-02-27 13:35:23 -05:00
Fabiano Fidêncio	96ed706d20	Merge pull request #10950 from fidencio/topic/skip-arm-check-tests-that-depend-on-virt ci: arm64: Skip tests that depend on virt on non-virt capable runners	2025-02-27 18:26:32 +01:00
Zvonko Kaiser	abfbc0ab60	ci: Add shellcheckrc Let's have common rules over all shell files. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-27 17:11:24 +00:00
Zvonko Kaiser	33460386b9	Merge pull request #10803 from ryansavino/update-confidential-initrd-22.04 versions: update confidential initrd to 22.04	2025-02-27 09:29:36 -05:00
Fabiano Fidêncio	e18e1ec3a8	ci: arm64: Skip tests that depend on virt on non-virt capable runners The GitHub hosted runners for ARM64 do not provide virtualisation support, thus we're just skipping the tests as those would check whether or not the system is "VMContainerCapable". Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-27 14:43:21 +01:00
Wainer Moschetta	5fda6b69e8	Merge pull request #10883 from stevenhorsman/k0s-version-pinning ci: k8s: Pin k0s version to get cri-o tests back working	2025-02-27 10:11:59 -03:00
Steve Horsman	f3c22411fc	Merge pull request #10930 from stevenhorsman/codeql-config workflows: Add codeql config	2025-02-27 12:43:41 +00:00
stevenhorsman	d08787774f	ci: k8s: Use pinned k0s version Update the code to install the version of k0s that we have in our versions.yaml, rather than just installing the latest, to help our CI be more stable and less prone to breaking due to things we don't control. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-27 11:33:23 +00:00
stevenhorsman	3fe35c1594	version: Add k0s version Add external versions support for k0s and initially pin it at v1.31.5 as our cri-o tests started failing when v1.32 became the latest Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-27 11:33:23 +00:00
Fabiano Fidêncio	6e236fd44c	Merge pull request #10652 from burgerdev/sysctls genpolicy: support sysctls from PodSpec and environment defaults	2025-02-27 08:25:14 +01:00
Dan Mihai	cb382e1367	Merge pull request #10925 from katexochen/p/fail-on-layer-pull genpolicy: fail when layer can't be processed	2025-02-26 13:28:38 -08:00
Ryan Savino	ceafa82f2e	tests: skip trusted storage tests for qemu-snp skip tests for trusted storage until #10838 is resolved. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-02-26 14:23:57 -06:00
Ryan Savino	a00a7c500a	build: initrd rootfs init symlink directly to systemd when no AGENT_INIT In some cases, /init is not following two levels of symlinks i.e. /init to /sbin/init to /lib/systemd/systemd Setting /init directly to /lib/systemd/systemd when AGENT_INIT is not mandated Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-02-26 14:23:56 -06:00
Markus Rudy	70709455ef	genpolicy: support sysctl settings Sysctls may be added to a container by the Kubernetes pod definition or by containerd configuration. This commit adds support for the corresponding PodSecurityContext field and an option to specify environment-dependent sysctls in the settings file. The sysctls requested in a CreateContainerRequest are checked against the sysctls in the pod definition or, if not defined there, against the defaults in genpolicy-settings.json. There is no check for the presence of expected sysctls, though, because Kubernetes might legitimately omit unsafe sysctls itself and because default sysctls might not apply to all containers. Fixes: #10064 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-02-26 18:56:17 +01:00
Steve Horsman	5aa89bc1d7	Merge pull request #10831 from RuoqingHe/ci-riscv64 ci: Enable partial components build-check on riscv	2025-02-26 17:50:47 +00:00
Fabiano Fidêncio	9d8026b4e5	Merge pull request #10654 from burgerdev/cronjob genpolicy: add get_process_fields to CronJob	2025-02-26 15:13:40 +01:00
Fabiano Fidêncio	7b16df64c9	Merge pull request #10935 from burgerdev/error-messages runtime: add cause to CDI errors	2025-02-26 14:01:22 +01:00
Jakob Naucke	c146980bcd	agent: Handle virtio-net-ccw devices separately On s390x, a virtio-net device will use the CCW bus instead of PCI, which impacts how its uevent should be handled. Take the respective path accordingly. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-02-26 11:36:43 +01:00
Jakob Naucke	a084b99324	virtcontainers: Separate PCI/CCW for net devices On s390x, virtio-net devices should use CCW, alongside a different device path. Use accordingly. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-02-26 11:36:43 +01:00
Jakob Naucke	2aa523f08a	virtcontainers: Fix virtio-net-ccw address format Hex device number was formatted as hex twice, thus encoding the string as hex. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-02-26 11:36:43 +01:00
Jakob Naucke	2a992c4080	virtcontainers: Add CCW device to endpoint To support virtio-net-ccw for s390x, add CCW devices to the Endpoint interface. Add respective fields and functions to implementing structs. Device paths may be empty. PciPath resolves this by being a list that may be empty, but this design does not map to CcwDevice. Use a pointer instead. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-02-26 11:36:42 +01:00
Jakob Naucke	b325069d72	agent: Update QEMU URL Readthedocs URL was outdated. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-02-26 11:36:42 +01:00
Jakob Naucke	9935f9ea7e	proto: Rename Interface.pciPath to devicePath Field is being used for both PCI and CCW devices. Name it devicePath to avoid confusion when the device isn't a PCI device. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-02-26 11:36:42 +01:00
alex.lyn	a338af3f18	kata-types: Fix bugs related to annotations in kata-types It will address two issues: (1) expected `,`: --> /root/kata-containers/src/libs/kata-types/tests/test_config.rs:15:9 \| 14 \| KATA_ANNO_CFG_HYPERVISOR_ENABLE_IO_THREADS \| - \| \| \| expected one of `,`, `::`, `as`, or `}` \| help: missing `,` 15 \| KATA_ANNO_CFG_HYPERVISOR_FILE_BACKED_MEM_ROOT_DIR, \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ unexpected token (2) remove useless annotation `KATA_ANNO_CFG_HYPERVISOR_CTLPATH`. Fixes #10936 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-02-26 17:48:11 +08:00
Fabiano Fidêncio	47a5439a20	Merge pull request #10934 from fidencio/topic/agent-unbreak-non-guest-pull-build agent: Fix non-guest-pull build	2025-02-26 09:45:22 +01:00
Pavel Mores	c5e560e2d1	runtime-rs: handle ProtectionDevice in resource manager and sandbox As part of device preparation in Sandbox we check available protection and create a corresponding ProtectionDeviceConfig if appropriate. The resource-side handling is trivial. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-02-26 09:11:35 +01:00
Pavel Mores	eb47f15b10	runtime-rs: support ProtectionDevice in qemu-rs As an example, or a test case, we add some implementation of SEV/SEV-SNP. Within the QEMU command line generation, the 'Cpu' object is extended to accommodate the EPYC-v4 CPU type for SEV-SNP. 'Machine' is extended to support the confidential-guest-support parameter which is useful for other TEEs as well. Support for emitting the -bios command line switch is added as that seems to be the preferred way of supplying a path to firmware for SEV/SEV-SNP. Support for emitting '-object sev-guest' and '-object sev-snp-guest' with an appropriate set of parameters is added as well. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-02-26 09:11:35 +01:00
Pavel Mores	87deb68ab7	runtime-rs: add implementation of ProtectionDevice ProtectionDevice is a new device type whose implementation structure matches the one of other devices in the device module. It is split into an inner "config" part which contains device details (we implement SEV/SEV-SNP for now) and the customary outer "device" part which just adds a device instance ID and the customary Device trait implementation. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-02-26 09:11:35 +01:00
Pavel Mores	a3f973db3b	runtime-rs: extend SEV/SEV-SNP detection by including a details struct This matches the existing TDX handling where additional details are retrieved right away after TDX is detected. Note that the actual details (cbitpos) acquisition is NOT included at this time. This change might seem bigger than it is. The change itself is just in protection.rs, the rest are corresponding adjustments. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-02-26 09:11:35 +01:00
Pavel Mores	c549d12da7	runtime-rs: parse SEV-SNP related config file settings The 'sev_snp_guest' default value of 'false' is in compliance with the golang runtime behaviour. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-02-26 09:11:35 +01:00
Markus Rudy	d58f38dfab	genpolicy: add get_process_fields to CronJob This function was accidentally left unimplemented for CronJob, resulting in runAsUser not being supported there. Fixes: #10653 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-02-26 09:00:04 +01:00
Ruoqing He	ec020399b9	ci: Enable partial components build-check on riscv Since we have RISC-V builders available now, let's start with `agent-ctl`, `trace-forwarder` and `genpolicy` components to run build-checks on these `riscv-builder`s, and gradually add the remaining components when they are ready, to catch up with other architectures eventually. This workflow can be manually triggered; `riscv-builder` will be the default instance when that is the case. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-26 15:38:39 +08:00
Markus Rudy	1f6833bd0d	runtime: add cause to CDI errors Adding devices by CDI annotation can fail for a variety of reasons. If that happens, it's helpful to know the root cause of the issue (CDI spec missing, malformatted, requested device not present, etc.). This commit adds the root cause of the CDI device addition to the errors reported back to the caller. Since this error is bubbled up all the way back to the shimv2 task.Create handler, it will be visible in Kubernetes logs and enable fixing the root cause. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-02-26 08:36:15 +01:00
Paul Meyer	9981cdd8a8	genpolicy: fail when layer can't be processed Currently, if a layer can't be processed, we log this as a warning and continue execution, finally exiting with a zero exit code. This can lead to the generation of invalid policies. One reason a layer might not be processed is that the pull of that layer fails. We need all layers to be processed successfully to generate a valid policy, as otherwise we will miss the verity hash for that layer or we might miss the USER information from a passwd stored in that layer. This will cause our VM to not get through the agent's policy validation. Returning an error instead of printing a warning will cause genpolicy to fail in such cases. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-02-26 08:30:59 +01:00
Fabiano Fidêncio	b3b570e4c4	agent: Fix non-guest-pull build As the guest-pull is a very Confidential Containers specific feature, let's make sure we, at least, don't break folks who decide to build Kata Containers' agent without having this feature enabled (for instance, for the sake of the agent size). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-25 21:48:41 +01:00
Zvonko Kaiser	04c56a0aaf	Merge pull request #10931 from zvonkok/iommufd-fix gpu: IOMMUFD fix	2025-02-25 12:50:24 -05:00
Ruoqing He	ed50e31625	build: Reorganize target selection Architectures here with `musl` available are minority, which is more suitable for enumeration. With this change, we are implicitly choosing gnu target for `ppc64le`, `riscv64` and `s390x`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-26 00:56:54 +08:00
Ruoqing He	562911e170	build: Add riscv mapping for common.bash While installing Rust and Golang in our CI workflow, `arch_to_golang` and `arch_to_rust` are needed for inferring the correct arch string for riscv64 architecture. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-26 00:56:54 +08:00
Ruoqing He	62e2473c32	build: Add riscv64 to utils.mk Since `ARCH` for `riscv64` is `riscv64gc`, we'll need to override it in `utils.mk`, and forcing `gnu` target for `riscv64` because `musl` target is not yet made ready. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-26 00:56:54 +08:00
Zvonko Kaiser	804e5cd332	gpu: IOMMUFD provide proper ID We need a proper ID otherwise QEMU sometimes fails with invalid ID. Use the same pattern as with the old VFIO implementation. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-25 16:24:17 +00:00
stevenhorsman	c97e9e1592	workflows: Add codeql config I noticed that CodeQl using the default config hasn't scanned since May 2024, so figured it would be worth trying an explicit configuration to see if that gets better results. It's mostly the template, but updated to be more relevant: - Only scan PRs and pushes to the `main` branch - Set a pinned runner version rather than latest (with mac support) - Edit the list of languages to be scanned to be more relevant for kata-containers Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-25 15:05:43 +00:00
Fabiano Fidêncio	e09ae2cc0b	Merge pull request #10921 from RuoqingHe/drop-redundant-override build: Drop redundant ARCH override	2025-02-25 14:54:36 +01:00
Fabiano Fidêncio	c01e7f1ed5	Merge pull request #10932 from kata-containers/topic/consolidate-publish-workflow workflows: Refactor publish workflows	2025-02-25 14:50:40 +01:00
stevenhorsman	5000fca664	workflows: Add build-checks to manual CI Currently the ci-on-push workflow that runs on PRs runs two jobs: gatekeeper-skipper.yaml and ci.yaml. In order to test things like for the error ``` too many workflows are referenced, total: 21, limit: 20 ``` on topic branches, we need ci-devel.yaml to have an extra workflow to match ci-on-push, so add the build-checks as this is helpful to run on topic branches anyway. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-25 11:38:49 +00:00
stevenhorsman	23434791f2	workflows: Refactor publish workflows Replace the four different publish workflows with a single one that takes input parameters of the arch and runner, to reduce the amount of duplicated code and try to avoid the ``` too many workflows are referenced, total: 21, limit: 20 ``` error	2025-02-25 10:49:09 +00:00
Fabiano Fidêncio	e3eb9e4f28	Merge pull request #10929 from kata-containers/topic/enable-arm-tests arm: ci: k8s: Enable CI	2025-02-24 19:34:28 +01:00
Fabiano Fidêncio	a6186b6244	ci: k8s: arm: Skip "Check the number vcpus are ..." test See https://github.com/kata-containers/kata-containers/issues/10928 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-24 18:43:24 +01:00
Fabiano Fidêncio	1798804c32	ci: k8s: arm: Skip "Pod quota" test See https://github.com/kata-containers/kata-containers/issues/10927 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-24 18:43:24 +01:00
Fabiano Fidêncio	053827cacc	ci: k8s: arm: Skip "Running within memory constraints" test See https://github.com/kata-containers/kata-containers/issues/10926 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-24 18:43:24 +01:00
Fabiano Fidêncio	7bd444fa52	ci: Run k8s tests on arm64 Let's take advantage of the current arm64 runners, and make sure we have those tests running there as well. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-02-24 18:43:20 +01:00
Aurélien Bombo	16aa6b9b4b	Merge pull request #10911 from kata-containers/sprt/fix-cgroup-race agent: Fix race condition with cgroup watchers	2025-02-24 10:28:58 -06:00
Ruoqing He	265a751837	build: Drop redundant ARCH override There are many `override ARCH = powerpc64le` after where `utils.mk` is included, which are redundant. Drop those redundant `override`s. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-24 22:04:28 +08:00
Fabiano Fidêncio	aa30f9ab1f	versions: Use jammy for x86_64 confidential initrd Set confidential initrd to use jammy rootfs Signed-off-by: Ryan Savino <ryan.savino@amd.com>	2025-02-22 23:57:16 -06:00
Aurélien Bombo	adca339c3c	ci: Fix GH throttling in run-nerdctl-tests Specify a GH API token to avoid the below throttling error: https://github.com/kata-containers/kata-containers/actions/runs/13450787436/job/37585810679?pr=10911#step:4:96 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-02-21 17:52:17 -06:00
Aurélien Bombo	111803e168	runtime: cgroups: Remove commented out code Doesn't seem like we're going to use this and it's confusing when inspecting code. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-02-21 17:52:17 -06:00
Aurélien Bombo	1f8c15fa48	Revert "tests: Skip k8s job test on qemu-coco-dev" This reverts commit `a8ccd9a2ac`. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-02-21 17:52:17 -06:00
Aurélien Bombo	7542dbffb8	Revert "tests: disable k8s-policy-job.bats on coco-dev" This reverts commit `47ce5dad9d`. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-02-21 17:52:17 -06:00
Aurélien Bombo	a1ed923740	agent: Fix race condition with cgroup watchers In the CI, test containers intermittently fail to start after creation, with an error like below (see #10872 for more details): # State: Terminated # Reason: StartError # Message: failed to start containerd task "afd43e77fae0815afbc7205eac78f94859e247968a6a4e8bcbb987690fcf10a6": No such file or directory (os error 2) I've observed this error to repro with the following containers, which have in common that they're all very short-lived by design (more tests might be affected): * k8s-job.bats * k8s-seccomp.bats * k8s-hostname.bats * k8s-policy-job.bats * k8s-policy-logs.bats Furthermore, appending a `; sleep 1` to the command line for those containers seemed to consistently get rid of the error. Investigating further, I've uncovered a race between the end of the container process and the setting up of the cgroup watchers (to report OOMs). If the process terminates first, the agent will try to watch cgroup paths that don't exist anymore, and it will fail to start the container. The added error context in notifier.rs confirms that the error comes from the missing cgroup: https://github.com/kata-containers/kata-containers/actions/runs/13450787436/job/37585901466#step:17:6536 The fix simply consists in creating the watchers before we start the container but still after we create it -- this is non-blocking, and IIUC the cgroup is guaranteed to already be present then. Fixes: #10872 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-02-21 17:52:11 -06:00
Fabiano Fidêncio	aaa7008cad	versions: Add a comment about "jammy" being 22.04 I missed that when I added the other comments, so, for the sake of consistency, let's just add it there as well. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-21 16:02:38 -06:00
Fabiano Fidêncio	a7d33cc0cb	build: Ensure MEASURED_ROOTFS is only used for images We never ever tested MEASURED_ROOTFS with initrd, and I sincerely do not know why we've been setting that to "yes" in the initrd cases. Let's drop it, as it may be causing issues with the jobs that rely on the rootfs-initrd-confidential. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-21 15:32:20 -06:00
Dan Mihai	b90c537f79	Merge pull request #10881 from mythi/build-fixes minor build fixes	2025-02-21 09:54:55 -08:00
Jeremi Piotrowski	304978ad47	Merge pull request #10784 from arvindskumar99/disable_nesting_checks Disabling Nesting Check for SNP upstream	2025-02-21 12:39:18 +01:00
Xuewei Niu	cdb29a4fd1	Merge pull request #10780 from RuoqingHe/setup-dragonball-workspace dragonball: Appease clippy, setup workspace and centralize RustVMM	2025-02-21 14:04:19 +08:00
Hyounggyu Choi	58647bb654	Merge pull request #10743 from zvonkok/iommufd-gpu-fix IOMMUFD GPU enhancement	2025-02-20 23:43:00 +01:00
Zvonko Kaiser	7cca2c4925	gpu: Use a dedicated VFIO group vs iommufd entry We do not want to abuse the sysfsentry; let's use a dedicated devfsentry. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-20 18:27:52 +00:00
Zvonko Kaiser	9add633258	qemu: Add command line for IOMMUFD For each IOMMUFD device create an object and assign it to the device, we need additional information that is populated now correctly to decide if we run the old VFIO or new VFIO backend. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-20 18:27:50 +00:00
Fabiano Fidêncio	19a7f27736	Merge pull request #10906 from BbolroC/remove-measured-rootfs-check-for-shimv2-on-s390x shim-v2: Remove MEASURED_ROOTFS assignment for s390x	2025-02-20 15:53:50 +01:00
arvindskumar99	c0a3ecb27b	config: Disabling nesting check for SNP Adding disable_nesting_checks to accommodate SNP on Azure Signed-off-by: arvindskumar99 <arvinkum@amd.com>	2025-02-20 12:24:08 +01:00
Hyounggyu Choi	1a9dabd433	shim-v2: Remove MEASURED_ROOTFS assignment for s390x As a follow-up for #10904, we do not need to set MEASURED_ROOTFS to no on s390x explicitly. The GHA workflow already exports this variable. This commit removes the redundant assignment. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-02-20 10:43:36 +01:00
Greg Kurz	f51d84b466	Merge pull request #10904 from BbolroC/turn-off-measured-rootfs-s390x-gha-workflows GHA: Turn off MEASURED_ROOTFS in build-kata-static-tarball-s390x	2025-02-20 10:24:23 +01:00
Aurélien Bombo	601c403603	Merge pull request #10818 from burgerdev/plumbing agent: clear log pipes if denied by policy	2025-02-19 16:28:58 -06:00
Aurélien Bombo	cb3467535c	tests: Add policy test for ReadStreamRequest This test verifies that, when ReadStreamRequest is blocked by the policy, the logs are empty and the container does not deadlock. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-02-19 14:03:41 -06:00
Hyounggyu Choi	ca40462a1c	Merge pull request #10903 from BbolroC/fixes-for-cri-containerd-on-ubuntu24 tests: Support systemd unit files in /usr/lib as well as /lib	2025-02-19 19:45:55 +01:00
Hyounggyu Choi	d973d41efb	GHA: Turn off MEASURED_ROOTFS in build-kata-static-tarball-s390x This is the first attempt to remove the following code: ``` if [ "${ARCH}" == "s390x" ]; then export MEASURED_ROOTFS=no fi ``` from install_shimv2() in kata-deploy-binaries.sh. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-02-19 18:19:19 +01:00
Zvonko Kaiser	238db32126	Merge pull request #10868 from zvonkok/qemu-tdx-experimental-workflow QEMU TDX experimental workflow	2025-02-19 10:09:27 -05:00
Zvonko Kaiser	f0eef73a89	gpu: Add no_patches.txt for TDX flavour As always, if we do not have any patches, create the no_patches.txt for the specific tag gpu_tdx_... Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-19 14:59:04 +00:00
Zvonko Kaiser	ca4d227562	gpu: Add qemu-tdx-experimental build We need to introduce again the qemu-tdx build for the GPU Depends-on: #10867 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-19 14:48:56 +00:00
Hyounggyu Choi	a8363c28ca	tests: Support systemd unit files in /usr/lib as well as /lib On Ubuntu 24.04, due to the /usr merge, system-provided unit files now reside in `/usr/lib/systemd/system/` instead of `/lib/systemd/system/`. For example, the command below now returns a different path: ``` $ systemctl show containerd.service -p FragmentPath /usr/lib/systemd/system/containerd.service ``` Previously, on Ubuntu 22.04 and earlier, it returned: ``` /lib/systemd/system/containerd.service ``` The current pattern `if [[ $unit_file == /lib* ]]` fails to match the new path. To ensure compatibility across versions, we update the pattern to match both `/lib` and `/usr/lib` like: ``` if [[ $unit_file =~ ^/(usr/)?lib/ ]] ``` Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-02-19 14:34:59 +01:00
Zvonko Kaiser	0d786577c6	Merge pull request #10867 from zvonkok/qemu-snp-tdx-experimental gpu: QEMU SNP+TDX experimental updates	2025-02-19 08:26:37 -05:00
Ruoqing He	a8a096b20c	dragonball: Centralize RustVMM crates Centralize all RustVMM crates to workspace.dependencies to prevent having multiple versions of each RustVMM crate, which is error-prone and inconsistent. With this setup, updates on RustVMM crates would be much easier. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-19 21:20:30 +08:00
Ruoqing He	b129972e12	dragonball: Setup workspace Setup workspace in dragonball, move `dbs` crates one level up to be managed as members of dragonball workspace. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-19 21:20:30 +08:00
Ruoqing He	a174e2be03	dragonball: Appease clippy introduced by 1.80.0 New clippy warnings show up after Rust Tool Chain bumped from 1.75.0 to 1.80.0, fix accordingly. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-19 21:20:30 +08:00
Ruoqing He	6bb193bbc0	spell: Update dictionary for dbs crates Add entries for dbs_* crates' README.md to pass `kata-spell-check.sh` spell checking. Changed British terms to American terms in README of `dbs_pci` to pass `hunspell` check. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-19 21:20:30 +08:00
Zvonko Kaiser	73b7a3478c	Merge pull request #10893 from RuoqingHe/fix-static-check ci: Fix spell_check and improve header_check	2025-02-19 08:08:40 -05:00
Mikko Ylinen	926119040c	packaging: make install_oras.sh to run curl without sudo sudo hides the environment variables that are sometimes useful with the builds (for example: proxy settings). While install_oras.sh could run completely without sudo in the container it's COPY'd to, make minimal changes to it to keep it functional outside the container too while still addressing the problem of 'sudo curl' not working with proxy env variables. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-02-19 09:34:13 +02:00
Mikko Ylinen	0d8242aee4	agent: rename cargo config To mitigate: warning: `.../kata-containers/src/agent/.cargo/config` is deprecated in favor of `config.toml` note: if you need to support cargo 1.38 or earlier, you can symlink `config` to `config.toml` Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-02-19 09:34:13 +02:00
Fabiano Fidêncio	c8db24468c	Merge pull request #10894 from BbolroC/use-multi-arch-for-qemu-sample example: Use multi-arch image for test-deploy-kata-qemu.yaml	2025-02-18 23:43:52 +01:00
Dan Mihai	672462e6b8	Merge pull request #10895 from katexochen/p/agent-deps agent: make policy feature optional again	2025-02-18 13:27:23 -08:00
Dan Mihai	6b389fdd4f	Merge pull request #10896 from katexochen/p/oci-client-genplicy genpolicy: bump oci-distribution to v0.12.0	2025-02-18 12:42:23 -08:00
Markus Rudy	67fbad5f37	genpolicy: bump oci-distribution to v0.12.0 This picks up a security fix for confidential pulling of unsigned images. The crate moved permanently to oci-client, which required a few import changes. Co-authored-by: Paul Meyer <katexochen0@gmail.com> Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-02-18 16:32:00 +01:00
Ruoqing He	d23284a0dc	header_check: Check header for changed text files We are running `header_check` for non-text files like binary files, symbolic link files, image files (pictures), etc., which does not make sense. Filter out non-text files and run `header_check` only for changed text files. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-18 22:39:53 +08:00
Paul Meyer	80af09aae9	agent: make policy feature optional again This was messed up a little when factoring out the policy crate. Removing the dependencies no longer used by the agent and making the import of kata-agent-policy optional again. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-02-18 15:28:06 +01:00
Hyounggyu Choi	4646058c0c	example: Use multi-arch image for test-deploy-kata-qemu.yaml An image `registry.k8s.io/hpa-example` only supports amd64. Let's use a multi-arch image `quay.io/prometheus/prometheus` for the QEMU example instead. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-02-18 14:23:09 +01:00
Ruoqing He	7e49e83779	spell: Add missing entries for kata-spell-check `kata-dictionary.dic` changes after running `kata-spell-check.sh make-dict`. This is because someone updated `kata-dictionary.dic` directly instead of first updating the entries in data and running `make-dict`. Add the missing entries to data and re-run `make-dict` to generate the correct `kata-dictionary.dic`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-18 19:06:34 +08:00
Lukáš Doktor	d0ef78d3a4	ci: Change the way we modify runtimeclass in webhook Previously we deployed the webhook and then modified the cm from our ci/openshift-ci/ script to the desired value, but sometimes the webhook pod starts before we modify the cm and keeps using the default value. Let's change the approach and modify the deployments in-place. The only con is that it leaves the git tree dirty, but since this script is only supposed to be used in ci it should be safe. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-02-18 11:39:22 +01:00
Anastassios Nanos	1e6cea24c8	Merge pull request #10890 from zvonkok/arm64-fix-release release: Remove artifacts for release	2025-02-17 22:29:23 +02:00
Zvonko Kaiser	1d9915147d	release: Remove artifacts for release We need to make sure the release does not have any residual binaries left for the release payload Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-17 20:16:48 +00:00
Anastassios Nanos	ae1be28ddd	Merge pull request #10880 from nubificus/3.14.0-release release: Bump version to 3.14.0	2025-02-17 20:25:30 +02:00
Zvonko Kaiser	72833cb00b	Merge pull request #10878 from zvonkok/agent_cdi_timeout gpu: agent cdi timeout	2025-02-17 12:49:51 -05:00
Zvonko Kaiser	fda095a4c9	Merge pull request #10786 from zvonkok/gpu-config-update gpu: Update config files	2025-02-17 12:45:54 -05:00
Anastassios Nanos	c7347cb76d	release: Bump version to 3.14.0 Bump VERSION and helm-chart versions Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2025-02-17 16:47:24 +00:00
Fabiano Fidêncio	639bc84329	Merge pull request #10787 from fidencio/topic/bump-kernel-to-6.12.11 version: Bump kernel to 6.12.13	2025-02-17 17:39:14 +01:00
Fabiano Fidêncio	7ae5fa463e	versions: Bump coco-guest-components So attestation-agent and others have a version including the ttrpc bump to v0.8.4, allowing us to use the latest LTS kernel. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-17 15:16:54 +01:00
Fabiano Fidêncio	1381cab6f0	build: Fix rootfs cache logic We've been appending to the wrong variable for quite some time, it seems, leading to not actually regenerating the rootfs when needed. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-17 13:55:36 +01:00
Fabiano Fidêncio	7fc7328bbc	versions: Bump kernel to 6.12.13 Let's try to keep up with the LTS patch releases. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-17 13:47:35 +01:00
Simon Kaegi	f5edbfd696	kernel: support loop device in v6.8+ kernels Set CONFIG_BLK_DEV_WRITE_MOUNTED=y to restore previous kernel behaviour. Kernel v6.8+ will by default block buffer writes to block devices mounted by filesystems. This unfortunately is what we need to use mounted loop devices needed by some teams to build OSIs and as an overlay backing store. More info on this config item [here](https://cateee.net/lkddb/web-lkddb/BLK_DEV_WRITE_MOUNTED.html) Fixes: #10808 Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>	2025-02-17 13:47:35 +01:00
Fabiano Fidêncio	d96e8375c4	Merge pull request #10885 from stevenhorsman/bump-agent-crates-to-resolve-CVEs agent: Bump agent crates to resolve CVEs	2025-02-17 12:11:43 +01:00
stevenhorsman	e5a284474d	deps: Update cookie-store & publicsuffix Run: ``` cargo update -p cookie-store cargo update -p publicsuffix ``` to update the version of idna and resolve CVE-2024-12224 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-14 17:30:03 +00:00
stevenhorsman	5656fc6139	deps: Bump reqwest Bump reqwest to 0.12.12 to pick up fixes Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-14 17:30:03 +00:00
stevenhorsman	3a3849efff	deps: Update quinn-proto Update quinn-proto to fix CVE-2024-45311 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-14 17:30:03 +00:00
Fabiano Fidêncio	64ceb0832a	Merge pull request #10851 from fidencio/topic/bump-image-rs-to-bring-in-ttrpc-0.8.4 agent: Bump image-rs to 514c561d93	2025-02-14 18:21:56 +01:00
Fabiano Fidêncio	d5878437a4	Merge pull request #10845 from DataDog/dind-subcgroup-fix Add process to init subcgroup when we're using dind with cgroups v2	2025-02-14 18:12:24 +01:00
Steve Horsman	469c651fc0	Merge pull request #10879 from nubificus/fix_version packaging(release): Properly handle version tag for the release bundle	2025-02-14 14:40:37 +00:00
Zvonko Kaiser	908aacfa78	gpu: Update the logging around CDI Removed a rogue printf and updated the logging to say that we're waiting for CDI spec(s) to be generated rather than saying there is an error; it only becomes an error once the timeout has expired. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-14 14:32:00 +00:00
Zvonko Kaiser	4bda16565b	gpu: Update timeouts With the create_container_timeout the dial_timeout is less important. Add the custom timeout for GPUs in create_container_timeout Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-14 14:29:18 +00:00
Zvonko Kaiser	66ccc25724	tdx: Update GPU config for the latest TDX stack We need extra kernel_params for TDX Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-14 14:29:18 +00:00
Zvonko Kaiser	d4dd87a974	gpu: Update config files With the recent changes to cgroupsv1 and AGENT_INIT=no we need to update the config files. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-14 14:29:18 +00:00
Anastassios Nanos	b13db29aaa	packaging(release): Properly handle version tag for the release bundle The tags created automatically for published Github releases are probably not annotated, so by simply running `git describe` we are not getting the correct tag. Use a `git describe --tags` to allow git to look at all tags, not just annotated ones. Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2025-02-14 12:41:08 +00:00
Zvonko Kaiser	2499d013bd	gpu: Update handle_cdi_devices AgentConfig now has the cdi_timeout from the kernel cmdline, update the proper function signature and use it in the for loop. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-13 20:11:48 +00:00
Zvonko Kaiser	d28410ed75	Merge pull request #10877 from AdithyaKrishnan/main CI: Deprecate SEV	2025-02-13 14:55:11 -05:00
Zvonko Kaiser	95aa21f018	gpu: Add CDI timeout via kernel config Some systems like a DGX where we have 8 H100 or 8 H800 GPUs need some extended time to be initialized. We need to make sure we can configure CDI timeout, to enable even systems with 16 GPUs. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-13 19:23:19 +00:00
Adithya Krishnan Kannan	6cc5b79507	CI: Deprecate SEV Phase 1 of Issue #10840 AMD has deprecated SEV support on Kata Containers, and going forward, SNP will be the only AMD feature supported. As a first step in this deprecation process, we are removing the SEV CI workflow from the test suite to unblock the CI. Will be adding future commits to remove redundant SEV code paths. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>	2025-02-13 12:20:21 -06:00
Steve Horsman	0a39f59a9b	Merge pull request #10874 from stevenhorsman/skip-consistently-failing-block-volume-test tests: Skip block volume test on fc, stratovirt	2025-02-13 15:39:45 +00:00
Zvonko Kaiser	a0766986e7	Merge pull request #10832 from RuoqingHe/update-yq ci: Update yq to v4.44.5 to support riscv64	2025-02-13 08:33:02 -05:00
stevenhorsman	56fb2a9482	tests: Skip block volume test on fc, stratovirt The block volume test has failed on 10/10 nightlies and all the PRs I've seen, so skip it until it can be assessed. See #10873 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-13 11:50:35 +00:00
stevenhorsman	2d266df846	test: Update expected error in signed image tests We are seeing a different error in the new version of image-rs, so update our tests to match. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-13 11:44:51 +00:00
stevenhorsman	d28a512d29	agent: Wait for network before init_image_service Based on the guidance from @Xynnn007 in #10851 > The new version of image-rs will do attestation once ClientBuilder.build().await() is called, while the old version will do so lazily the first image pull request comes. Looks like it's called in rpc::start() in kata-agent, when I'm afraid the network hasn't been initialized yet. > I am not sure if the guest network is prepared after the DNS is configured (in create_sandbox), if so we can move (the init_image_service) right after that. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-13 11:44:51 +00:00
Tobin Feldman-Fitzthum	a13d5a3f04	agent: Bump image-rs to 514c561d93 As this brings in the commit bumping ttrpc to 0.8.4, which fixes connection issues with kernel 6.12.9+. As image-rs has a new builder pattern and several of the values in the image client config have been renamed, let's change the agent to account for this. Signed-off-by: Tobin Feldman-Fitzthum <tobin@linux.ibm.com> Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-13 11:44:51 +00:00
Steve Horsman	8614e5efc4	Merge pull request #10869 from stevenhorsman/bump-kcli-ubuntu-version ci: k8s: Bump kcli image version	2025-02-13 09:59:20 +00:00
Antoine Gaillard	4b5b788918	agent: Use init subcgroup for process attachment in DinD cgroups v2 enforces stricter delegation rules, preventing operations on cgroups outside our ownership boundary. When running Docker-in-Docker (DinD), processes must be attached to an "init" subcgroup within the systemd unit. This fix detects and uses the init subcgroup when proxying process attachment. Fixes #10733 Signed-off-by: Antoine Gaillard <antoine.gaillard@datadoghq.com>	2025-02-13 10:44:51 +01:00
Dan Mihai	958cd8dd9f	Merge pull request #10613 from 3u13r/feat/policy/refactor-out-policy-crate-and-network-namespace policy: add policy crate and add network namespace check to policy	2025-02-12 18:28:09 -08:00
Alex Lyn	e1b780492f	Merge pull request #10839 from RuoqingHe/appease-clippy dragonball: Appease clippy	2025-02-13 09:12:15 +08:00
Zvonko Kaiser	acd2a933da	Merge pull request #10864 from fidencio/topic/packaging-move-to-ubuntu-22-04 packaging: Move builds to Ubuntu 22.04	2025-02-12 14:29:41 -05:00
Wainer Moschetta	62e239ceaa	Merge pull request #10810 from arvindskumar99/nydus_perm_install Skipping SNP and SEV from deploying and deleting Snapshotter	2025-02-12 14:38:56 -03:00
stevenhorsman	fd7bcd88d0	ci: k8s: Bump kcli image version When trying to deploy nydus on kcli locally we get the following failure: ``` root@sh-kata-ci1:~# kubectl get pods -n nydus-system NAMESPACE NAME READY STATUS RESTARTS AGE nydus-system nydus-snapshotter-5kdqs 0/1 CrashLoopBackOff 4 (84s ago) 7m29s ``` Digging into this I found that the nydus-snapshotter service is failing with: ``` ubuntu@kata-k8s-worker-0:~$ journalctl -u nydus-snapshotter.service -- Logs begin at Wed 2025-02-12 15:06:08 UTC, end at Wed 2025-02-12 15:20:27 UTC. -- Feb 12 15:10:39 kata-k8s-worker-0 systemd[1]: Started nydus snapshotter. Feb 12 15:10:39 kata-k8s-worker-0 containerd-nydus-grpc[6349]: /usr/local/bin/containerd-nydus-grpc: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required b> Feb 12 15:10:39 kata-k8s-worker-0 containerd-nydus-grpc[6349]: /usr/local/bin/containerd-nydus-grpc: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required b> Feb 12 15:10:39 kata-k8s-worker-0 systemd[1]: nydus-snapshotter.service: Main process exited, code=exited, status=1/FAILURE ``` I think this is because 20.04 has version: ``` ubuntu@kata-k8s-worker-0:~$ ldd --version ldd (Ubuntu GLIBC 2.31-0ubuntu9.16) 2.31 ``` so it's too old for the nydus snapshotter. Also 20.04 is EoL soon, so bumping is better. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-12 15:38:18 +00:00
Zvonko Kaiser	fbc8454d3d	Merge pull request #10866 from zvonkok/enable-cc-gpu-build gpu: enable confidential initrd build	2025-02-12 09:26:08 -05:00
Ruoqing He	897e2e2b6e	dragonball: Appease clippy Some problems hidden in the `dbs` crates were revealed after making these crates workspace components; fix them according to the `cargo clippy` suggestions. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-12 19:44:34 +08:00
Leonard Cohnen	ec0af6fbda	policy: check the linux network namespace Peer pods have a linux namespace of type network. We want to make sure that all containers in the same pod use the same namespace. Therefore, we add the first namespace path to the state and check all other requests against that. This commit also adds the corresponding integration test in the policy crate showcasing the benefit of having rust integration tests for the policy. Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2025-02-12 10:41:15 +01:00
Leonard Cohnen	7aca7a6671	policy: use agent policy crate in genpolicy test The generated rego policies for `CreateContainerRequest` are stateful and that state is handled in the policy crate. We use this policy crate in the genpolicy integration test to be able to test if those state changes are handled correctly without spinning up an agent or even a cluster. This also allows to easily test on a e.g., CreateContainerRequest level instead of relying on changing the yaml that is applied to a cluster. Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2025-02-12 10:41:15 +01:00
Leonard Cohnen	d03738a757	genpolicy: expose create as library This commit allows to programmatically invoke genpolicy. This allows for other rust tools that don't want to consume genpolicy as binary to generate policies. One such use-case is the policy integration test implemented in the following commits. Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2025-02-12 10:41:15 +01:00
Leonard Cohnen	cf54a1b0e1	agent: move policy module into separate crate The policy module augments the policy generated with genpolicy by keeping and providing state to each invocation. Therefore, it is no longer sufficient to test the passing of requests in the genpolicy crate. Since in Rust, integration tests cannot call functions that are not exposed publicly, this commit factors out the policy module of the agent into its own crate and exposes the necessary functions to be consumed by the agent and the integration tests. The integration test itself is implemented in the following commits. Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2025-02-12 10:41:15 +01:00
Fupan Li	ec7b2aa441	Merge pull request #10850 from teawater/direct Clean the config block_device_cache_direct of runtime-rs	2025-02-12 09:45:37 +08:00
Zvonko Kaiser	5431841a80	Merge pull request #10814 from kata-containers/shellcheck-gha gha: Add shellcheck	2025-02-11 18:30:41 -05:00
Zvonko Kaiser	2d8531cd20	gpu: Add TDX experimental target for GPUs We have custom branches on coco/qemu to support GPUs in TDX and SNP add experimental target. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-11 17:32:31 +00:00
Zvonko Kaiser	7ded74c068	gpu: Add version for QEMU+TDX+SNP SNP and TDX patches for GPU are not compatible hence we need an own build for TDX. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-11 17:32:31 +00:00
Zvonko Kaiser	e4679055c6	gpu: qemu-snp-experimental no patches The branch has all the needed cherry-picks Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-11 17:32:31 +00:00
Zvonko Kaiser	7a219b3f03	gpu: Add GPU+SNP QEMU build Since the CPU SNP is upstreamed and available via our default QEMU target we're repurposing the SNP-experimental for the GPU+SNP enablement. First step is to update the version we're basing it off. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-11 17:32:31 +00:00
Zvonko Kaiser	b231a795d7	gha: Add shellcheck We need to start fixing our scripts. Let's run shellcheck and see what needs to be reworked. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-11 16:00:34 +00:00
Zvonko Kaiser	befb2a7c33	gpu: Confidential Initrd Start building the confidential initrd Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-11 15:41:36 +00:00
Fupan Li	5b809ca440	CI: a workaround for containerd v2.x e2e test The latest containerd has an issue in its e2e test, so we apply the following fix to work around it. For more info about this issue, please see: https://github.com/containerd/containerd/pull/11240 Once that PR is merged and a new version is released, we can remove this workaround. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	a3fd3d90bc	ci: Add the sandbox api testcases A test case is added based on the integrated cri-containerd case. The difference between the cri-containerd integrated testcase and the sandbox api testcase is the "sandboxer" setting in the sandbox runtime handler. If the "sandboxer" is set to "" or "podsandbox", then containerd will use the legacy shimv2 api, and if the "sandboxer" is set to "shim", then it will use the sandbox api to launch the pod. In addition, add a containerd v2.0.0 version, because containerd officially supports the sandbox api from version 2.0.0. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	36bf080c1e	runtime-rs: register the sandbox api service Add and register the sandbox api service, so that runtime-rs can handle the sandbox api rpc calls from containerd. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	8332f427d2	runtime-rs: add the wait and status method for sandbox api Add the sandbox wait and sandbox status method for sandbox api. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	2d6b1e6b13	runtime-rs: add the sandbox api support For Kata-Containers, we add SandboxService for these new calls alongside the existing TaskService, including processing requests and replies, and properly calling VirtSandbox's interfaces. By splitting the start logic of the sandbox, virt_container is compatible with calls from the SandboxService and TaskService. In addition, we modify the processing of resource configuration to solve the problem that SandboxService does not have a spec file when creating a pod. The sandbox api can be supported from containerd 1.7, but enabling it differs from containerd 2.0. To enable it from 2.0, you can support the sandbox api for a specific runtime by adding: sandboxer = "shim", take kata runtime as an example: [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata] runtime_type = "io.containerd.kata.v2" sandboxer = "shim" privileged_without_host_devices = true pod_annotations = ["io.katacontainers.*"] For containerd version 1.7, you can enable it by: 1: add env ENABLE_CRI_SANDBOXES=true 2: add sandbox_mode = "shim" to runtime config. Acknowledgement This work was based on @wllenyj's POC code: (`f5b62a2d7c`) Signed-off-by: Fupan Li <fupan.lfp@antgroup.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2025-02-11 15:21:53 +01:00
Fupan Li	65e908a584	runtime-rs: add the sandbox init for sandbox api For the processing of init sandbox, the init of task api has some more special processing procedures than the init of sandbox api, so these two types of init are separated here. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	be40646d04	runtime-rs: move the sandbox start from sandbox init function Split the sandbox start from the sandbox init process, and call them separately. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	438f81b108	runtime-rs: only get the containerd id when start container When starting the sandbox, the sandbox id is passed from the shim command line, so it only needs to get the containerd id from the oci spec when starting the pod container instead of the pod sandbox. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	9492c45d06	runtime-rs: load the cgroup path correctly When the sandbox api was enabled, the pause container would be removed and sandbox start api only pass an empty bundle directory, which means there's no oci spec file under it, thus the cgroup config couldn't get the cgroup path from pause container's oci spec. So we should set a default cgroup path for sandbox api case. In the future, we can promote containerd to pass the cgroup path during the sandbox start phase. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	78b96a6e2e	runtime-rs: fix the issue of missing create sandbox dir We need to make sure the sandbox storage path exists before returning it. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	97785b1f3f	runtime-rs: rustfmt against lib.rs It seems some files were never run through rustfmt. This commit runs rustfmt on them. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	33555037c0	protocols: Add the cri api protos Add the cri api protos to support the sandbox api. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Hui Zhu	27cff15015	runtime-rs: Remove block_device_cache_direct from config of fc Remove block_device_cache_direct from config of fc in runtime-rs because fc doesn't support this config. Fixes: #10849 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-02-11 14:04:11 +08:00
Hui Zhu	70d9afbd1f	runtime-rs: Add block_device_cache_direct to config of ch and dragonball Add block_device_cache_direct to config of ch and dragonball in runtime-rs because they support this config. Fixes: #10849 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-02-11 14:04:11 +08:00
Hui Zhu	db04c7ec93	runtime-rs: Add block_device_cache_direct config to ch and qemu Add block_device_cache_direct config to ch and qemu in runtime-rs. Fixes: #10849 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-02-11 14:04:11 +08:00
Hui Zhu	e4cbc6abce	runtime-rs: CloudHypervisorInner: Change config type This commit change config in CloudHypervisorInner to normal HypervisorConfig to decrease the change of its type. Fixes: #10849 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-02-11 14:04:11 +08:00
Fabiano Fidêncio	75ac09baba	packaging: Move builds to Ubuntu 22.04 As Ubuntu 20.04 will reach its EOL in April. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-10 21:25:43 +01:00
Fabiano Fidêncio	c9f5966f56	Merge pull request #10860 from kata-containers/topic/debug-ci workflows: build: Do not store unnecessary content on the tarball	2025-02-10 20:01:37 +01:00
Fabiano Fidêncio	ec290853e9	workflows: build: Do not store unnecessary content on the tarball Otherwise we may end up simply unpacking kata-containers specific binaries into the same location that system ones are needed, leading to a broken system (most likely what happened with the metrics CI, and also what's happening with the GHA runners). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-10 18:57:29 +01:00
Steve Horsman	fb341f8ebb	Merge pull request #10857 from fidencio/topic/ci-tdx-only-use-one-machine-for-testing ci: Only use the Ubuntu TDX machine in the CI	2025-02-10 15:25:06 +00:00
Fabiano Fidêncio	23cb5bb6c2	ci: Only use the Ubuntu TDX machine in the CI We've been hitting issues with the CentOS 9 Stream machine, which Intel doesn't have cycles to debug. After raising this up in the Confidential Containers community meeting we got the green light from Red Hat (Ariel Adam) to just disable the CI based on CentOS 9 Stream for now. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-10 12:50:16 +01:00
Zvonko Kaiser	eb1cf792de	Merge pull request #10791 from kata-containers/gpu_ci_cd gpu: Add first target and fix extratarballs	2025-02-06 15:47:27 -05:00
Zvonko Kaiser	62a975603e	Merge pull request #10806 from stevenhorsman/rust-1.80.0-bump Rust 1.80.0 bump	2025-02-06 14:49:23 -05:00
Dan Mihai	fdf3088be0	Merge pull request #10842 from microsoft/danmihai1/disable-job-policy-test tests: disable k8s-policy-job.bats on coco-dev	2025-02-06 09:09:49 -08:00
Hyounggyu Choi	48c5b1fb55	Merge pull request #10841 from BbolroC/make-measured-rootfs-configurable local-build: Do not build measured rootfs on s390x	2025-02-06 16:07:15 +01:00
Hyounggyu Choi	1bdb34e880	tests: Skip trusted storage tests for IBM SE Let's skip all tests for trusted storage until #10838 is resolved. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-02-06 12:09:14 +01:00
Hyounggyu Choi	27ce3eef12	local-build: Do not use measured rootfs on s390x IBM SE ensures that the initrd is measured by genprotimg and verified by the ultravisor. Let's not build the measured rootfs on s390x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-02-06 10:12:55 +01:00
stevenhorsman	fce49d4206	dragonball: Skip unsafe tests Skip tests that use unsafe uses of file descriptor which causes ``` fatal runtime error: IO Safety violation: owned file descriptor already closed ``` See #10821 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-06 08:54:17 +00:00
Fabiano Fidêncio	2ceb7a35fc	versions: Bump rust to 1.80.0 (matching coco-guest-components) This is needed in order to avoid agent build issues, such as: ``` error[E0658]: use of unstable library feature 'lazy_cell' --> /home/ansible/.cargo/git/checkouts/guest-components-1e54b222ad8d9630/514c561/ocicrypt-rs/src/lib.rs:10:5 \| 10 \| use std::sync::LazyLock; \| ^^^^^^^^^^^^^^^^^^^ \| = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information ``` Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-06 08:53:51 +00:00
Fabiano Fidêncio	76df852f33	packaging: agent: Add rust version to the builder image name As we want to make sure a new builder image is generated if the rust version is bumped. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-06 08:53:51 +00:00
stevenhorsman	d3e0ecc394	kata-ctl: Allow empty const Due to the way that multi-arch support is done, on various platforms we will get a clippy error: ``` error: this expression always evaluates to false ``` which might not be true on those other platforms, so allow this code pattern to suppress the clippy error Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-06 08:53:51 +00:00
Fabiano Fidêncio	6de8e59109	Merge pull request #10824 from stevenhorsman/updates-in-prep-of-rust-1.80-bump Updates in prep of rust 1.80 bump	2025-02-06 09:05:23 +01:00
Dan Mihai	47ce5dad9d	tests: disable k8s-policy-job.bats on coco-dev k8s-policy-job is modeled after the older k8s-job, and it appears that both of them fail occasionally on coco-dev. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-02-05 23:06:16 +00:00
Arvind Kumar	47534c1c3e	nydus: Skipping SNP and SEV from deploying and deleting Snapshotter Preparing to install nydus permanently on the AMD node, so disabling deploy and delete command for SNP and SEV. Signed-off-by: Arvind Kumar <arvinkum@amd.com>	2025-02-05 12:26:53 -06:00
Zvonko Kaiser	45bd451fa0	ci: add arm64 attestation Do the very same thing that we do on amd64 and add attestation Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-05 16:30:20 +00:00
Zvonko Kaiser	9a7dff9c40	gpu: Add arm64 targets We want to make sure we deliver arm64 GPU targets as well Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-05 16:30:20 +00:00
Zvonko Kaiser	968318180d	ci: Add extratarballs steps We introduced extratarballs with a make target. The CI currently only uploads tarballs that are listed in the matrix. The NV kernel builds a headers package which needs to be uploaded as well. The get-artifacts has a glob to download all artifacts hence we should be good. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-05 16:30:20 +00:00
Zvonko Kaiser	b04bdf54a5	gpu: Add rootfs target amd64/arm64 Adding the initrd build first to get the rootfs on amd64. With that we can start to add tests. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-05 16:30:20 +00:00
stevenhorsman	7831caf1e7	libs/safe-path: Fix doc formatting Clippy fails with ``` error: doc list item missing indentation ``` so indent further to avoid this.	2025-02-05 15:16:47 +00:00
stevenhorsman	17b1e94f1a	cargo: Update time crate So it avoids us hitting ``` error[E0282]: type annotations needed for `Box<_>` --> /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/time-0.3.31/src/format_description/parse/mod.rs:83:9 \| 83 \| let items = format_items \| ^^^^^ ... 86 \| Ok(items.into()) \| ---- type must be known at this point \| help: consider giving `items` an explicit type, where the placeholders `_` are specified \| 83 \| let items: Box<_> = format_items \| ++++++++ ``` Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 15:16:47 +00:00
stevenhorsman	e9393827e8	agent: Workaround ppc formatting On powerpc64le platform the ip neigh command has a trailing space after the state, so the test is failing e.g. ``` assertion `left == right` failed left: "169.254.1.1 lladdr 6a:92:3a:59:70:aa PERMANENT \n" right: "169.254.1.1 lladdr 6a:92:3a:59:70:aa PERMANENT\n" ``` Trim the whitespace to make the test pass on all platforms Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 15:16:47 +00:00
stevenhorsman	1ac0e67245	kata-ctl: Add stub of missing method for ppc `host_is_vmcontainer_capable` is required, but wasn't implemented for powerpc64, so copy the aarch64 approach @Amulyam24 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 15:16:47 +00:00
stevenhorsman	bd3c93713f	kata-sys-util: Complete code move In #7236 the guest protection code was moved to kata-sys-util, but some of it was left behind, and the adjustment to the new location wasn't completed, so the powerpc64 code doesn't build now that we've fixed the cfg to test it. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 15:16:47 +00:00
stevenhorsman	9f865f5bad	kata-ctl: Allow dead_code Some of the Kernel structs have `#[allow(dead_code)]` but not all and this results in the clippy error: ``` error: fields `name` and `value` are never read ``` so complete the job started before to remove the error. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	61a252094e	dragonball: Fix feature typo Replace `legacy_irq` with `legacy-irq` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	add785f677	dragonball: Remove unused fields `metrics` is never used, so remove this code Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	dde34bb7b8	runtime-rs: Remove un-used code The `r#type` method is never used, so neither are the log type constants Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	71fffb8736	runtime-rs: Allow dead code Clippy errors with: ``` error: field `driver` is never read --> crates/resource/src/network/utils/link/driver_info.rs:77:9 \| 76 \| pub struct DriverInfo { \| ---------- field in this struct 77 \| pub driver: String, \| ^^^^^^ ``` We set this, but never read it, so clippy is correct, but I'm not sure if it's useful for logging, or other purposes, so I'll allow it for now. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	d75a0ccbd1	dragonball: Allow test-mock feature Clippy fails with: ``` warning: unexpected `cfg` condition value: `test-mock` --> /root/go/src/github.com/kata-containers/kata-containers/src/dragonball/src/dbs_pci/src/vfio.rs:1929:17 \| 1929 \| #[cfg(all(test, feature = "test-mock"))] \| ^^^^^^^^^^^^^^^^^^^^^ help: remove the condition \| = note: no expected values for `feature` = help: consider adding `test-mock` as a feature in `Cargo.toml` ``` So add it as an expected cfg in the linter to skip this Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	bddaea6df1	runtime-rs: Allow enable-vendor feature Clippy fails with: ``` error: unexpected `cfg` condition value: `enable-vendor` --> crates/hypervisor/src/device/driver/vfio.rs:180:11 \| 180 \| #[cfg(feature = "enable-vendor")] \| ^^^^^^^^^^^^^^^^^^^^^^^^^ \| = note: expected values for `feature` are: `ch-config`, `cloud-hypervisor`, `default`, and `dragonball` = help: consider adding `enable-vendor` as a feature in `Cargo.toml` ``` So add it as an expected cfg in the linter to skip this Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	bed128164a	runtime-rs: Allow unexpected config Clippy fails with: ``` error: unexpected `cfg` condition value: `enable-vendor` --> crates/hypervisor/src/device/driver/vfio.rs:180:11 \| 180 \| #[cfg(feature = "enable-vendor")] \| ^^^^^^^^^^^^^^^^^^^^^^^^^ \| = note: expected values for `feature` are: `ch-config`, `cloud-hypervisor`, `default`, and `dragonball` = help: consider adding `enable-vendor` as a feature in `Cargo.toml` ``` allow this until we can check this behaviour with @Apokleos Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	53bcb0b108	runtime-rs: Fix for-loops-over-fallibles Clippy complains about: ``` error: for loop over a `&Result`. This is more readably written as an `if let` statement --> crates/hypervisor/src/firecracker/fc_api.rs:99:22 \| 99 \| for param in &kernel_params.to_string() { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^ ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	c332a91ef8	runtime-rs: Fix doc list item missing indentation Add the extra space to format the list correctly Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	fe98d49a29	runtime-rs: Remove direct implementation of ToString Fix clippy error: ``` direct implementation of `ToString` ``` by switching to implement Display instead Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	730c56af2a	runtime-rs: Fix clippy::unnecessary-get-then-check Clippy errors with: ``` error: unnecessary use of `get(&id).is_none()` --> crates/hypervisor/src/device/device_manager.rs:494:29 \| 494 \| if self.devices.get(&id).is_none() { \| -------------^^^^^^^^^^^^^^^^^^ \| \| \| help: replace it with: `!self.devices.contains_key(&id)` ``` so fix this as suggested Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	a9358b59b7	runtime-rs: Allow unused enum field Clippy errors with: ``` error: field `0` is never read --> crates/hypervisor/src/qemu/cmdline_generator.rs:375:25 \| 375 \| DeviceAlreadyExists(String), // Error when trying to add an existing device \| ------------------- ^^^^^^ ``` but this is used when creating the error later, so add an allow to ignore this warning Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	1d9efeb92b	runtime-rs: Remove use of legacy constants Fix clippy error ``` error: usage of a legacy numeric constant ``` by swapping `std::u8::MAX` for `u8::MAX` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	225c7fc026	kata-ctl: Allow unused enum field Clippy errors with: ``` error: field `0` is never read ``` but the field is required for the `map_err`, so ignore this error for now to avoid too much disruption Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	f1d3450d1f	runtime-rs: Remove unused config `gdb` is only activated by a feature `guest_debug` that doesn't exist, so remove this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	1e90fc38de	dragonball: Fix incorrect reference There were references to `config_manager::DeviceInfoGroup` which doesn't exist, so I guess it means `DeviceConfigInfo` instead, so update them Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	f389b05f20	dragonball: Fix doc formatting issue Clippy errors with: ``` error: doc list item missing indentation ``` which I think is because the Return is between two list items, so add a blank line to separate this into a separate paragraph Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	8bea57326a	dragonball: Fix thread_local initializer error Clippy errors with: ``` error: initializer for `thread_local` value can be made `const` ``` so update as suggested Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	7257ee0397	agent: Remove implementation of ToString Fix clippy error: ``` direct implementation of `ToString` ``` by switching to implement Display instead Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	ca87aca1a6	agent: Remove use of legacy constants Fix clippy error ``` error: usage of a legacy numeric constant ``` by swapping `std::i32::<MIN/MAX>` for `i32::<MIN/MAX>` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	6008fd56a1	agent: Fix clippy error ``` error: file opened with `create`, but `truncate` behavior not defined ``` `truncate(true)` ensures the file is entirely overwritten with new data which I believe is the behaviour we want Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	a640bb86ec	agent: cdh: Remove unnecessary borrows Fix clippy error: ``` error: the borrowed expression implements the required traits ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	a131eec5c1	agent: config: Remove supports_seccomp supports_seccomp is never used, so throws a clippy error Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	0bd36a63d9	agent: Fix clippy error ``` error: bound is defined in more than one place ``` Move Sized into the later definition of `R` & `W` rather than defining them in two places Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	7709198c3b	rustjail: Fix clippy error ``` error: file opened with `create`, but `truncate` behavior not defined ``` `truncate(true)` ensures the file is entirely overwritten with new data which I believe is the behaviour we want Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
Fabiano Fidêncio	b4de302cb2	genpolicy: Adjust to build with rust 1.80.0 ``` error: field `image` is never read --> src/registry.rs:35:9 \| 34 \| pub struct Container { \| --------- field in this struct 35 \| pub image: String, \| ^^^^^ \| = note: `Container` has derived impls for the traits `Debug` and `Clone`, but these are intentionally ignored during dead code analysis = note: `-D dead-code` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(dead_code)]` error: field `use_cache` is never read --> src/utils.rs:106:9 \| 105 \| pub struct Config { \| ------ field in this struct 106 \| pub use_cache: bool, \| ^^^^^^^^^ \| = note: `Config` has derived impls for the traits `Debug` and `Clone`, but these are intentionally ignored during dead code analysis error: could not compile `genpolicy` (bin "genpolicy") due to 2 previous errors ``` Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	099b241702	powerpc64: Add target_endian = "little" Based on comments from @Amulyam24 we need to use the `target_endian = "little"` as well as target_arch = "powerpc64" to ensure we are working on powerpc64le. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	4c006c707a	build: Fix powerpc64le target_arch Starting with version 1.80, the Rust linter does not accept an invalid value for `target_arch` in configuration checks: ``` Compiling kata-sys-util v0.1.0 (/home/ddd/Work/kata/kata-containers/src/libs/kata-sys-util) error: unexpected `cfg` condition value: `powerpc64le` --> /home/ddd/Work/kata/kata-containers/src/libs/kata-sys-util/src/protection.rs:17:34 \| 17 \| #[cfg(any(target_arch = "s390x", target_arch = "powerpc64le"))] \| ^^^^^^^^^^^^^^------------- \| \| \| help: there is a expected value with a similar name: `"powerpc64"` \| = note: expected values for `target_arch` are: `aarch64`, `arm`, `arm64ec`, `avr`, `bpf`, `csky`, `hexagon`, `loongarch64`, `m68k`, `mips`, `mips32r6`, `mips64`, `mips64r6`, `msp430`, `nvptx64`, `powerpc`, `powerpc64`, `riscv32`, `riscv64`, `s390x`, `sparc`, `sparc64`, `wasm32`, `wasm64`, `x86`, and `x86_64` = note: see <https://doc.rust-lang.org/nightly/rustc/check-cfg/cargo-specifics.html> for more information about checking conditional configuration = note: `-D unexpected-cfgs` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(unexpected_cfgs)]` ``` According [to GitHub user @Urgau][explain], this is a new warning introduced in Rust 1.80, but the problem existed before. The correct architecture name should be `powerpc64`, and the differentiation between `powerpc64le` and `powerpc64` should use the `target_endian = "little"` check. [explain]: #10072 (comment) Fixes: #10067 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com> [emlima: fix some more occurrences and typos] Signed-off-by: Emanuel Lima <emlima@redhat.com> [stevenhorsman: fix some more occurrences and typos] Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:20:47 +00:00
Zvonko Kaiser	429b2654f4	Merge pull request #10812 from zvonkok/fix-arch-build-gpu gpu: Fix arm64 build	2025-02-04 17:03:37 -05:00
Dan Mihai	3fc170788d	Merge pull request #10811 from microsoft/cameronbaird/hyp-loglevel-upstream CLH: config: add hypervisor_loglevel	2025-02-04 11:59:21 -08:00
Zvonko Kaiser	eeacd8fd74	gpu: Adapt rootfs build for multi-arch Add aarch64 and x86_64 handling. Especially build the Rust dependency with the correct rust musl target. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-04 16:44:21 +00:00
Steve Horsman	9060904c4f	Merge pull request #10826 from kata-containers/topic/crio-test-timeouts workflows: Add delete kata-deploy timeouts for crio tests	2025-02-04 13:09:49 +00:00
Markus Rudy	937fd90779	agent: clear log pipes if denied by policy Container logs are forwarded to the agent through a unix pipe. These pipes have limited capacity and block the writer when full. If reading logs is blocked by policy, a common setup for confidential containers, the pipes fill up and eventually block the container. This commit changes the implementation of ReadStream such that it returns empty log messages instead of a policy failure (in case reading log messages is forbidden by policy). As long as the runtime does not encounter a failure, it keeps pulling logs periodically. In turn, this triggers the agent to flush the pipes. Fixes: #10680 Co-Authored-By: Aurélien Bombo <abombo@microsoft.com> Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-02-04 13:17:29 +01:00
Ruoqing He	8e073a6715	ci: Update yq to v4.44.5 to support riscv64 In v4.44.5 of `yq`, artifacts for riscv64 are released. Update the version used for `yq` and enable `install_yq.sh` to work on riscv64. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-04 19:36:34 +08:00
Zvonko Kaiser	95c63f4982	Merge pull request #10827 from stevenhorsman/bump-golang-1.22.11 versions: Bump golang version	2025-02-03 16:06:56 -05:00
Zvonko Kaiser	7dc8060051	Merge pull request #10828 from stevenhorsman/fix-versions-comments versions: Fix formatting	2025-02-03 16:06:37 -05:00
stevenhorsman	546e3ae9ea	versions: Fix formatting The static_checks_versions test uses yamllint which fails with: ``` [comments] too few spaces before comment ``` many times and so makes code reviews more annoying with all these extra messages. Although it's probably not the worst issue, I checked the [yaml spec](https://yaml.org/spec/1.2.2/#66-comments) and it does say > Comments must be separated from other tokens by white space characters so it's easiest to fix it and move on. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-03 17:08:25 +00:00
Zvonko Kaiser	122ad95da6	Merge pull request #10751 from ryansavino/snp-upstream-host-kernel-support snp: update kata to use latest upstream packages for snp	2025-02-03 11:20:59 -05:00
stevenhorsman	d9eb1b0e06	versions: Bump golang version Bump golang versions so we are more up-to-date and have the extra security fixes Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-03 15:28:53 +00:00
stevenhorsman	5203158195	workflows: Add delete kata-deploy timeouts for crio tests I've also seen cases (the qemu, crio, k0s tests) where Delete kata-deploy is still running for this test after 2 hours, and had to be manually cancelled, so let's try adding a 5m timeout to the kata-deploy delete to stop CI jobs hanging. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-03 11:45:43 +00:00
Greg Kurz	a806d74ce3	Merge pull request #10807 from kata-containers/dependabot/go_modules/src/tools/csi-kata-directvolume/go_modules-8d4d0c168c build(deps): bump github.com/golang/glog from 1.2.0 to 1.2.4 in /src/tools/csi-kata-directvolume in the go_modules group across 1 directory	2025-02-01 08:29:44 +01:00
Cameron Baird	b6b0addd5e	config: add hypervisor_loglevel Implement HypervisorLoglevel config option for clh. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-01-31 18:37:03 +00:00
Steve Horsman	41f23f1d2a	Merge pull request #10823 from stevenhorsman/fix-virtiofsd-build-error packaging: virtiofsd: Allow building a specific commit	2025-01-31 16:18:02 +00:00
stevenhorsman	1cf1a332a5	packaging: virtiofsd: Allow building a specific commit #10714 added support for building a specific commit, but due to the clone only having `--depth=1`, we can only reset to a commit if it's the latest on the `main` branch, otherwise we will get: ``` + git clone --depth 1 --branch main https://gitlab.com/virtio-fs/virtiofsd virtiofsd Cloning into 'virtiofsd'... warning: redirecting to https://gitlab.com/virtio-fs/virtiofsd.git/ + pushd virtiofsd + git reset --hard cecc61bca981ab42aae6ec490dfd59965e79025e ... fatal: Could not parse object 'cecc61bca981ab42aae6ec490dfd59965e79025e'. ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-31 11:24:23 +00:00
Greg Kurz	0215d958da	Merge pull request #10805 from balintTobik/egrep_removal egrep/fgrep removal	2025-01-30 18:26:59 +01:00
Hyounggyu Choi	530fedd188	Merge pull request #10767 from BbolroC/enable-coldplug-vfio-ap-s390x Enable VFIO-AP coldplug for s390x	2025-01-30 12:11:00 +01:00
Balint Tobik	1943a1c96d	tests: replace egrep with grep -E to avoid deprecation warning https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Balint Tobik <btobik@redhat.com>	2025-01-29 11:26:27 +01:00
Balint Tobik	47140357c4	docs: replace egrep/fgrep with grep -E/-F to avoid deprecation warning https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Balint Tobik <btobik@redhat.com>	2025-01-29 11:25:54 +01:00
Ryan Savino	90e2b7d1bc	docs: updated build and host setup instructions for SNP Referenced AMD developer page for latest SEV firmware. Instructions to point to upstream 6.11 kernel or later. Referenced sev-utils and AMDESE fork for kernel setup. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-01-28 18:09:40 -06:00
Ryan Savino	c1ca49a66c	snp: set snp to use upstream qemu in config use upstream qemu in snp and nvidia snp configs. load ovmf with bios flag on qemu cmdline instead of file. Fixes: #10750 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-01-28 18:09:40 -06:00
Ryan Savino	af235fc576	Revert "builds: ovmf: Workaround Zeex repo becoming private" This reverts commit `aff3d98ddd`.	2025-01-28 18:09:40 -06:00
Ryan Savino	bb7ca954c7	ovmf: upgrade standard and sev ovmf ovmf upgraded to latest tag for standard and sev. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-01-28 18:09:40 -06:00
Ryan Savino	e87231edc7	snp: remove snp certs on qemu cmdline snp standard attestation with the upstream kernel and qemu does not support extended attestation with certs. Fixes: #10750 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-01-28 18:09:40 -06:00
Zvonko Kaiser	f9bbe4e439	Merge pull request #10785 from zvonkok/agent-cgv2-activate agent: Add proper activation param handling to activate cgroupV2	2025-01-28 14:21:15 -05:00
dependabot[bot]	df5eafd2a1	build(deps): bump github.com/golang/glog Bumps the go_modules group with 1 update in the /src/tools/csi-kata-directvolume directory: [github.com/golang/glog](https://github.com/golang/glog). Updates `github.com/golang/glog` from 1.2.0 to 1.2.4 - [Release notes](https://github.com/golang/glog/releases) - [Commits](https://github.com/golang/glog/compare/v1.2.0...v1.2.4) --- updated-dependencies: - dependency-name: github.com/golang/glog dependency-type: direct:production dependency-group: go_modules ... Signed-off-by: dependabot[bot] <support@github.com>	2025-01-28 17:38:14 +00:00
Fabiano Fidêncio	5e00a24145	Merge pull request #10749 from zvonkok/pass-through-stack gpu: Add driver version selection	2025-01-28 16:24:16 +01:00
Hyounggyu Choi	dde627cef4	test: Run full set of zcrypttest for VFIO-AP coldplug Previously, the test for VFIO-AP coldplug only checked whether a passthrough device was attached to the VM guest. This commit expands the test to include a full set of zcrypttest to verify that the device functions properly within a container. Additionally, since containerd has been upgraded to v1.7.25 on the test machine, it is no longer necessary to run the test via crictl. The commit removes all related code and files. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-28 10:53:00 +01:00
Hyounggyu Choi	47db9b3773	agent: Run check_ap_device() for VFIO-AP coldplug This commit updates the device handler to call check_ap_device() instead of wait_for_ap_device() for VFIO-AP coldplug. The handler now returns a SpecUpdate for passthrough devices if the device is online (e.g., `/sys/devices/ap/card05/05.001f/online` is set to 1). Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-28 10:53:00 +01:00
Hyounggyu Choi	200cbfd0b0	kata-types: Introduce new type `vfio-ap-cold` for VFIO-AP coldplug This newly introduced type will be used by the VFIO-AP device handler on the agent. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-28 10:53:00 +01:00
Hyounggyu Choi	4a6ba534f1	runtime: Introduce new gRPC device type for VFIO-AP coldplug This commit introduces a new gRPC device type, `vfio-ap-cold`, to support VFIO-AP coldplug. This enables the VM guest to handle passthrough devices differently from VFIO-AP hotplug. With this new type, the guest no longer needs to wait for events (e.g., device addition) because the device already exists at the time the device type is checked. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-28 10:53:00 +01:00
Hyounggyu Choi	419b5ed715	runtime: Add DeviceInfo to Container for VFIO coldplug configuration Even though ociSpec.Linux.Devices is preserved when vfio_mode is VFIO, it has not been updated correctly for coldplug scenarios. This happens because the device info passed to the agent via CreateContainerRequest is dropped by the Kata runtime. This commit ensures that the device info is added to the sandbox's device manager when vfio_mode is VFIO and coldPlugVFIO is true (e.g., vfio-ap-cold), allowing ociSpec.Linux.Devices to be properly updated with the device information before the container is created on the guest. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-28 10:53:00 +01:00
Balint Tobik	233d15452b	runtime: replace egrep with grep -E to avoid deprecation warning https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Balint Tobik <btobik@redhat.com>	2025-01-28 10:46:44 +01:00
Balint Tobik	e657f58cf9	ci: replace egrep with grep -E to avoid deprecation warning https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Balint Tobik <btobik@redhat.com>	2025-01-28 10:46:44 +01:00
Zvonko Kaiser	9f2799ba4f	Merge pull request #10790 from JakubLedworowski/add-xattr-to-confidential-kernel kernel: Add CONFIG_TMPFS_XATTR to tdx.conf	2025-01-27 13:47:08 -05:00
Zvonko Kaiser	d2528ef84f	gpu: Initialize unbound variables rootfs.sh Since we're importing some build scripts for nvidia and we're setting set -u, we have some unbound variables in rootfs.sh; add initialization for those. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 18:37:21 +00:00
Zvonko Kaiser	9162103f85	agent: Update macro for e.g. String type Stack-only types are handled properly with the parse_cmdline_param macro, but advanced types like String couldn't be guarded by a guard function since it passed the variable by value rather than by reference. Now we can have guard functions for the String type: parse_cmdline_param!( param, CGROUP_NO_V1, config.cgroup_no_v1, get_string_value, \| no_v1 \| no_v1 == "all" ); Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 17:56:43 +00:00
Zvonko Kaiser	aab9d36e47	agent: Add tests for cgroup_no_v1 The only valid value is "all", ignore all other Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 17:56:43 +00:00
Zvonko Kaiser	e1596f7abf	agent: Add option to parse cgroup_no_v1 For AGENT_INIT=yes we do not run systemd and hence systemd.unified_... does not mean anything to other init systems. Providing cgroup_no_v1=all is enough to signal other init systems to use cgroupV2. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 17:56:43 +00:00
Zvonko Kaiser	cd7001612a	gpu: rootfs adjust for AGENT_INIT=no Since we're defaulting to AGENT_INIT=no for all the initrd/images adapt the NV build to properly get kata-agent installed. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 17:56:21 +00:00
Zvonko Kaiser	10974b7bec	gpu: AGENT_INIT=no We're setting globally for each initrd and image AGENT_INIT=no Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 17:56:21 +00:00
Zvonko Kaiser	98e0dc1676	gpu: Add set -u to scripts Make the scripts more robust by failing on unset variables Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 17:56:21 +00:00
Zvonko Kaiser	f153229865	gpu: Add driver version selection Besides latest and lts options add an option to specify the exact driver version. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 17:56:21 +00:00
Steve Horsman	311c3638c6	Merge pull request #10794 from fidencio/topic/bump-ubuntu-version-for-the-confidential-rootfs-and-initrd versions: Bump Ubuntu base image & initrd	2025-01-27 15:55:16 +00:00
Fabiano Fidêncio	84b0ca1b18	versions: Bump Ubuntu rootfs / initrd versions While I wish we could be bumping to the very same version everywhere, it's not possible and it's been quite a ride to get a combination of things that work. Let me try to describe my approach here: * Do NOT stay on 20.04 * This version will be EOL'ed by April * This version has a very old version of systemd that causes a bug when trying to online the cpusets for guests using systemd as init, which then breaks the qemu-coco-non-tee and TDX non-attestation sets of tests * Bump to 22.04 when possible * This was possible for the majority of the cases, except for the confidential initrd & confidential images for x86_64, because of failures on AMD SEV CI (which I didn't debug) and a kernel panic on the CentOS 9 Stream TDX machine * 22.04 is being used instead of 24.04 as multistrap is simply broken on Ubuntu 24.04, and I'd prefer to stay on an LTS release whenever it's possible * Bump to 24.10 for x86_64 image confidential * This was done as we got everything working with 24.10 in the CI. * This requires using libtdx-attest from noble (Ubuntu 24.04), as Intel only releases their sgx stuff for LTS releases. * Stick to 20.04 for x86_64 initrd confidential * 24.10 caused a panic on their CI * This is only being used by AMD so far, so they can decide when to bump, after doing the proper testing & debugging to confirm that the bump will work as expected for them Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 15:08:20 +01:00
Carlos Segarra	b6e0effc06	tdx: bump version of libtdx-attest in rootfs-builder Bump libtdx-attest to its 1.22 release. Signed-off-by: Carlos Segarra <carlos@carlossegarra.com>	2025-01-27 15:08:20 +01:00
Fabiano Fidêncio	2b5dbfacb8	osbuilder: ubuntu: Try to install pyinstaller using --break-system-packages We first try without passing the `--break-system-packages` argument, as that's not supported on Ubuntu 22.04 or older, but that's required on Ubuntu 24.04 or newer. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 15:08:20 +01:00
Fabiano Fidêncio	c54f78bc6b	local-build: cache: Consider os name & version for image/initrd Otherwise a bump in the os name and / or os version would lead to the CI using a cached artefact. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 15:08:20 +01:00
Fabiano Fidêncio	4a66acc6f5	osbuilder: ubuntu: Abort if multistrap fails (but not on 20.04) We have gotten Ubuntu 20.04 working pretty much "by luck", as multistrap fails the deployment, and then a hacky function was introduced to add the proper dbus links. However, this does not scale at all, and we should: * Fail if multistrap fails * I won't do this for Ubuntu 20.04 as it's working for now and soon enough it'll be EOL * Add better logging to ensure someone can know when multistrap fails Below you can find the failure that we're hitting on Ubuntu 20.04: ```sh Errors were encountered while processing: dbus ERR: dpkg configure reported an error. Native mode configuration reported an error! I: Tidying up apt cache and list data. Multistrap system reported 1 error in /rootfs/. I: Tidying up apt cache and list data. ``` Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 15:08:16 +01:00
Fabiano Fidêncio	585f82f730	osbuilder: ubuntu: Ensure OS_VERSION is passed & used Right now we're hitting an interesting situation with osbuilder, where regardless of what's being passed Ubuntu 20.04 (focal) is being used when building the rootfs-image, as shown in the snippets of the logs below: ``` ffidenci@tatu:~/src/upstream/kata-containers/kata-containers$ make rootfs-image-confidential-tarball /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build//kata-deploy-copy-libseccomp-installer.sh "agent" make agent-tarball-build ... make pause-image-tarball-build ... make coco-guest-components-tarball-build ... make kernel-confidential-tarball-build ... make rootfs-image-confidential-tarball-build make[1]: Entering directory '/home/ffidenci/src/upstream/kata-containers/kata-containers' /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build//kata-deploy-binaries-in-docker.sh --build=rootfs-image-confidential sha256:f16c57890b0e85f6e1bbe1957926822495063bc6082a83e6ab7f7f13cabeeb93 Build kata version 3.13.0: rootfs-image-confidential INFO: DESTDIR /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/rootfs-image-confidential/destdir INFO: Create image build image ~/src/upstream/kata-containers/kata-containers/tools/osbuilder ~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/rootfs-image-confidential/builddir INFO: Build image INFO: image os: ubuntu INFO: image os version: latest Creating rootfs for ubuntu /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/osbuilder/rootfs-builder/rootfs.sh -o 3.13.0-13f0807e9f5687d8e5e9a0f4a0a8bb57ca50d00c-dirty -r /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/rootfs-image-confidential/builddir/rootfs-image/ubuntu_rootfs ubuntu INFO: rootfs_lib.sh file found. 
Loading content ~/src/upstream/kata-containers/kata-containers/tools/osbuilder/rootfs-builder/ubuntu ~/src/upstream/kata-containers/kata-containers/tools/osbuilder ~/src/upstream/kata-containers/kata-containers/tools/osbuilder INFO: rootfs_lib.sh file found. Loading content INFO: build directly WARNING: apt does not have a stable CLI interface. Use with caution in scripts. Get:1 http://security.ubuntu.com/ubuntu focal-security InRelease [128 kB] Get:2 http://archive.ubuntu.com/ubuntu focal InRelease [265 kB] Get:3 http://archive.ubuntu.com/ubuntu focal-updates InRelease [128 kB] Get:4 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 Packages [4276 kB] Get:5 http://archive.ubuntu.com/ubuntu focal-backports InRelease [128 kB] Get:6 http://archive.ubuntu.com/ubuntu focal/universe amd64 Packages [11.3 MB] Get:7 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [1297 kB] Get:8 http://security.ubuntu.com/ubuntu focal-security/multiverse amd64 Packages [30.9 kB] Get:9 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages [4187 kB] Get:10 http://archive.ubuntu.com/ubuntu focal/restricted amd64 Packages [33.4 kB] Get:11 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages [1275 kB] Get:12 http://archive.ubuntu.com/ubuntu focal/multiverse amd64 Packages [177 kB] Get:13 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [4663 kB] Get:14 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [1589 kB] Get:15 http://archive.ubuntu.com/ubuntu focal-updates/multiverse amd64 Packages [34.6 kB] Get:16 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages [4463 kB] Get:17 http://archive.ubuntu.com/ubuntu focal-backports/main amd64 Packages [55.2 kB] Get:18 http://archive.ubuntu.com/ubuntu focal-backports/universe amd64 Packages [28.6 kB] Fetched 34.1 MB in 5s (6284 kB/s) ... ``` The reason this is happening is due to a few issues in different places: 1. 
IMG_OS_VERSION, passed to osbuilder, is not used anywhere and OS_VERSION should be used instead. And we should break if OS_VERSION is not properly passed down 2. Using UBUNTU_CODENAME is simply wrong, as it'll use whatever comes as the base container from kata-deploy's local-build scripts, and it has just been working by luck Note that while this commit fixes the wrong behaviour, it would also break the rootfs builds as they are, thus we need to set versions.yaml to use 20.04 where it was already using 20.04 even without us knowing. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 14:19:42 +01:00
Fabiano Fidêncio	02a18c1359	versions: Clarify which release matches a codename It'll make the life of the developers not so familiar with Ubuntu easier. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 14:19:42 +01:00
Fabiano Fidêncio	ca96a6ac76	versions: Use Ubuntu codename instead of versions As this is required as part of the osbuilder tool to be able to properly set the repositories used when building the rootfs. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 14:19:39 +01:00
Fabiano Fidêncio	353ceb948e	versions: Don't use the yaml variable definitions While having variables is nice, they are more extensive to write down, and actually confusing for tired developer eyes to read, plus we're mixing the use of the yaml variables here and there together with not using them for some architectures. With the best "all or nothing" spirit, let's just make it easier for our developers to read the versions.yaml and easily understand what's being used. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 14:19:08 +01:00
Jakub Ledworowski	42531cf6c4	kernel: Add CONFIG_TMPFS_XATTR to confidential kernel During pull inside the guest, overlayfs expects xattrs. Fixes: [guest-components#876](https://github.com/confidential-containers/guest-components/issues/876) Signed-off-by: Jakub Ledworowski <jakub.ledworowski@intel.com>	2025-01-27 07:07:54 +01:00
Zvonko Kaiser	b4c710576e	Merge pull request #10782 from stevenhorsman/clh-metrics-write-update metrics: Increase minval range for blogbench test	2025-01-24 10:21:20 -05:00
Steve Horsman	54e7e1fdc3	Merge pull request #10768 from kata-containers/dependabot/go_modules/src/runtime/go_modules-28d0d344dd build(deps): bump the go_modules group across 3 directories with 1 update	2025-01-24 12:04:56 +00:00
Greg Kurz	17f3eb0579	Merge pull request #10766 from balintTobik/remove_shebang Remove shebang in non-executable completion script	2025-01-24 12:29:03 +01:00
Alex Lyn	ee635293c6	Merge pull request #10740 from RuoqingHe/virtiofsd-riscv64 virtiofsd: Enable build for RISC-V	2025-01-24 15:43:56 +08:00
Zvonko Kaiser	f5c509d58e	Merge pull request #10779 from kata-containers/topic/arm64-static-build-runner workflows: Move arm static checks runner	2025-01-23 22:29:16 -05:00
Fabiano Fidêncio	4bc978416c	Merge pull request #10720 from fidencio/topic/test-cgroupsv2-on-guest kernel: Ensure no cgroupsv1 is used	2025-01-23 21:26:49 +01:00
Aurélien Bombo	66d292bdb4	Merge pull request #10732 from microsoft/danmihai/minor-systemd-cleanup rootfs: minor systemd file deletion cleanup	2025-01-23 11:29:25 -06:00
Fabiano Fidêncio	b47cc6fffe	cri-containerd: Skip TestDeviceCgroup till it's adapted to cgroupsv2 As the devices controller works in a different way in cgroupsv2, the "/sys/fs/cgroup/devices/devices.list" file simply doesn't exist. For now, let's skip the test till the test maintainer decides to re-enable it for cgroupsv2. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 17:25:56 +01:00
Fabiano Fidêncio	0626d7182a	tests: k8s-cpu-ns: Adapt to cgroupsv2 The changes done are: * cpu/cpu.shares was replaced by cpu.weight * The weight, according to our reference[0], is calculated by: weight = (1 + ((request - 2) * 9999) / 262142) * cpu/cpu.cfs_quota_us & cpu/cpu.cfs_period_us were replaced by cpu.max, where quota and period are written together (in this order) [0]: https://github.com/containers/crun/blob/main/crun.1.md#cgroup-v2 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 17:25:56 +01:00
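The cgroup v1 to v2 conversions described in the commit above can be sketched as follows. This is a minimal illustration, not the test code itself: the function names `shares_to_weight` and `cfs_to_cpu_max` are hypothetical, integer division is an assumption, and writing `max` for an unlimited quota follows the cgroup v2 `cpu.max` convention rather than anything stated in the commit.

```python
def shares_to_weight(shares: int) -> int:
    """Map a cgroup v1 cpu.shares value to a cgroup v2 cpu.weight value.

    Uses the formula quoted in the commit message:
        weight = 1 + ((request - 2) * 9999) / 262142
    Integer division is assumed here.
    """
    return 1 + ((shares - 2) * 9999) // 262142


def cfs_to_cpu_max(quota_us: int, period_us: int) -> str:
    """Build the single-line content of cpu.max: quota first, then period.

    A negative v1 quota (unlimited) is written as "max" (assumption based
    on the cgroup v2 interface, not on the commit text).
    """
    quota = "max" if quota_us < 0 else str(quota_us)
    return f"{quota} {period_us}"


if __name__ == "__main__":
    print(shares_to_weight(1024))
    print(cfs_to_cpu_max(50000, 100000))
```

With this mapping the v1 range 2..262144 lands on the v2 range 1..10000.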
Fabiano Fidêncio	4307f0c998	Revert "ci: mariner: Ensure kernel_params can be set" This reverts commit `091ad2a1b2`, in order to ensure tests would be running with cgroupsv2 on the guest. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 17:25:56 +01:00
Fabiano Fidêncio	c653719270	kernel: Ensure no cgroupsv1 is used Let's ensure that we're fully running the guest on cgroupsv2. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 17:25:56 +01:00
stevenhorsman	d031e479ab	metrics: Increase minval range for blogbench test In the last couple of days I've seen the blogbench metrics write latency test on clh fail a few times because the latency was too low, so adjust the minimum range to tolerate quicker finishes. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-23 15:58:31 +00:00
Fabiano Fidêncio	66d881a5da	Merge pull request #10755 from fidencio/topic/ensure-systemd-is-used-as-init-for-coco-cases rootfs-confidential: Ensure systemd is used as init	2025-01-23 15:25:24 +01:00
stevenhorsman	3acce82c91	ci: Update gatekeeper tests for static workflow The static-checks targets are `pull_request`, so they can run the PR workflow version, so we want to update the required-tests.yaml so that static-check workflow changes do trigger static checks in order to test them properly. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-23 14:23:09 +00:00
stevenhorsman	d625f20d18	workflows: Move arm static checks runner Now we have the build-assets running on the gh-hosted runners, try the same approach for the static-checks Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-23 14:23:09 +00:00
Zvonko Kaiser	a23d6a1241	Merge pull request #10777 from zvonkok/arm64-nvidia-gpu-kernel gpu: Fix arm64 kernel build	2025-01-23 07:14:30 -05:00
Christophe de Dinechin	9a92a4bacf	cli: Remove shebang in non-executable completion script Raised during package review [1] by rpmlint [1] https://bugzilla.redhat.com/show_bug.cgi?id=1590425#c8 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com> Signed-off-by: Balint Tobik <btobik@redhat.com>	2025-01-23 13:11:25 +01:00
Fabiano Fidêncio	734ef71cf7	tests: k8s: confidential: Cleanup $HOME/.ssh/known_hosts I've noticed the following error when running the tests with SEV: ``` 2025-01-21T17:10:28.7999896Z # @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 2025-01-21T17:10:28.8000614Z # @ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @ 2025-01-21T17:10:28.8001217Z # @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 2025-01-21T17:10:28.8001857Z # IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY! 2025-01-21T17:10:28.8003009Z # Someone could be eavesdropping on you right now (man-in-the-middle attack)! 2025-01-21T17:10:28.8003348Z # It is also possible that a host key has just been changed. 2025-01-21T17:10:28.8004422Z # The fingerprint for the ED25519 key sent by the remote host is 2025-01-21T17:10:28.8005019Z # SHA256:x7wF8zI+LLyiwphzmUhqY12lrGY4gs5qNCD81f1Cn1E. 2025-01-21T17:10:28.8005459Z # Please contact your system administrator. 2025-01-21T17:10:28.8006734Z # Add correct host key in /home/kata/.ssh/known_hosts to get rid of this message. 2025-01-21T17:10:28.8007031Z # Offending ED25519 key in /home/kata/.ssh/known_hosts:178 2025-01-21T17:10:28.8007254Z # remove with: 2025-01-21T17:10:28.8008172Z # ssh-keygen -f "/home/kata/.ssh/known_hosts" -R "10.244.0.71" ``` And this was causing a failure to ssh into the confidential pod. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 12:04:13 +01:00
Fabiano Fidêncio	18137b1583	tests: k8s: confidential: Increase log_buf_len to 4M Relying on dmesg is really not ideal, as we may lose important info, mainly those which happen very early in the boot, depending on the size of kernel ring buffer. So, for this specific test, let's increase the kernel ring buffer, by default, to 4M. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 12:04:13 +01:00
Fabiano Fidêncio	d5f907dcf1	rootfs-confidential: Ensure systemd is used as init Let's make sure that we don't use Kata Containers' agent as init for the Confidential related rootfses, as we don't want to increase the agent's complexity for no reason ... mainly when we can rely on a proper init system. : - images already used systemd as init - initrds are now using systemd as init Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 12:04:13 +01:00
dependabot[bot]	d2cb14cdbc	build(deps): bump the go_modules group across 3 directories with 1 update Bumps the go_modules group with 1 update in the /src/runtime directory: [golang.org/x/net](https://github.com/golang/net). Bumps the go_modules group with 1 update in the /src/tools/csi-kata-directvolume directory: [golang.org/x/net](https://github.com/golang/net). Bumps the go_modules group with 1 update in the /tools/testing/kata-webhook directory: [golang.org/x/net](https://github.com/golang/net). Updates `golang.org/x/net` from 0.25.0 to 0.33.0 - [Commits](https://github.com/golang/net/compare/v0.25.0...v0.33.0) Updates `golang.org/x/net` from 0.23.0 to 0.33.0 - [Commits](https://github.com/golang/net/compare/v0.25.0...v0.33.0) Updates `golang.org/x/net` from 0.23.0 to 0.33.0 - [Commits](https://github.com/golang/net/compare/v0.25.0...v0.33.0) --- updated-dependencies: - dependency-name: golang.org/x/net dependency-type: indirect dependency-group: go_modules - dependency-name: golang.org/x/net dependency-type: direct:production dependency-group: go_modules - dependency-name: golang.org/x/net dependency-type: indirect dependency-group: go_modules ... Signed-off-by: dependabot[bot] <support@github.com>	2025-01-23 10:18:22 +00:00
Fupan Li	944eb2cf3f	Merge pull request #10762 from teawater/remove_enable_swap libs/kata-types: Remove config enable_swap	2025-01-23 14:03:42 +08:00
Fupan Li	ebd8ec227b	Merge pull request #10778 from zvonkok/kata-agent-cgroupsV2 agent: Ensure proper cgroupsV2 handling with init_mode=true	2025-01-23 14:00:13 +08:00
Zvonko Kaiser	afd286f6d6	agent: Ensure proper cgroupsV2 with init_mode=yes When the agent is run as the init process, cgroupfs is being set up. In the case of cgroupsV1 we needed to enable the memory hierarchy; this is now enabled by default in cgroupsV2. Additionally the file /sys/fs/cgroup/memory/memory.use_hierarchy isn't even available with V2. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-23 03:54:51 +00:00
Fabiano Fidêncio	3f8abb4da7	Merge pull request #10776 from kata-containers/topic/arm64-runners workflows: Switch to github-hosted arm runners	2025-01-22 23:14:28 +01:00
Zvonko Kaiser	91c6d524f8	gpu: Fix arm64 kernel build CONFIG_IOASID (not configurable) in newer kernels. Removing it. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-22 18:15:57 +00:00
Fabiano Fidêncio	6baa60d77d	Merge pull request #10775 from fidencio/topic/update-ttrpc-crate agent: Update ttrpc to include the fix for connectivity issues	2025-01-22 17:45:38 +01:00
stevenhorsman	ab27e11d31	workflows: Switch to github-hosted arm runner Now that GitHub has hosted arm runners (https://github.blog/changelog/2025-01-16-linux-arm64-hosted-runners-now-available-for-free-in-public-repositories-public-preview/), we should try to switch our arm64 builder jobs to run on these. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-22 16:27:17 +00:00
Greg Kurz	90b6d5725b	Merge pull request #10773 from RuoqingHe/retry-on-aks-throttle ci: Retry on failure of Create AKS cluster	2025-01-22 15:30:57 +01:00
Ruoqing He	373a388844	ci: Retry on failure of Create AKS cluster The `Create AKS cluster` step in `run-k8s-tests-on-aks.yaml` is likely to fail since we are trying to issue `PUT` to `aks` at a relatively high frequency, while the `aks` end has its limits on `bucket-size` and `refill-rate`, documented here [1]. Use `nick-fields/retry@v3` to retry 10 seconds after a request fails, based on observations that AKS requested 7 or 8 second delays before retry as part of its 429 responses. [1] https://learn.microsoft.com/en-us/azure/aks/quotas-skus-regions#throttling-limits-on-aks-resource-provider-apis Fixes: #10772 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-22 13:24:51 +00:00
Fabiano Fidêncio	a8678a7794	deps: Update ttrpc to v0.8.4 Update the ttrpc crate to include the fix from Moritz Sanft, which solves the connectivity issues with 6.12.x kernels* *: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v6.12.9&id=3257813a3ae7462ac5cde04e120806f0c0776850 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-22 13:05:43 +01:00
Fabiano Fidêncio	e71bc1f068	Merge pull request #10770 from zvonkok/gpu_kernel_dep gpu: Add kernel dep for the non coco use-case	2025-01-22 12:53:39 +01:00
Greg Kurz	17d053f4bb	Merge pull request #10711 from teawater/balloon Add reclaim_guest_freed_memory config to qemu and cloud-hypervisor	2025-01-22 10:57:13 +01:00
Hui Zhu	c148b70da7	libs/kata-types: Remove config enable_swap Remove config enable_swap because there is no code use it. Fixes: #10761 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-01-22 11:08:45 +08:00
Aurélien Bombo	4e9d1363b3	Merge pull request #10754 from sprt/sprt/ci-gh-pr-number-coco ci: Unify on `$GH_PR_NUMBER` environment variable	2025-01-21 15:07:24 -06:00
Zvonko Kaiser	4621f53e4a	gpu: Add kernel dep for the non coco use-case Add the kernel dependency to the non coco use-case so that a rootfs build can be executed via GHA. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-21 16:18:14 +00:00
Zvonko Kaiser	61c282c725	Merge pull request #10769 from kata-containers/revert-10764-gpu_ci_cd Revert "gpu: Add rootfs target amd64/arm64"	2025-01-21 11:09:52 -05:00
Zvonko Kaiser	9fd430e46b	Revert "gpu: Add rootfs target amd64/arm64" Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-21 16:08:30 +00:00
Zvonko Kaiser	ef1639b6bf	Merge pull request #10764 from zvonkok/gpu_ci_cd gpu: Add rootfs target amd64/arm64	2025-01-21 09:51:20 -05:00
Ruoqing He	7e76ef587a	virtiofsd: Enable build for RISC-V With this change, `virtiofsd` (gnu target) could be built and then to be used with other components. Depends: #10741 Fixes: #10739 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-01-21 18:05:37 +08:00
Hui Zhu	185b94b7fa	runtime-rs: Add reclaim_guest_freed_memory cloud-hypervisor support Add reclaim_guest_freed_memory config to cloud-hypervisor in runtime-rs. Fixes: #10710 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-01-21 10:34:21 +08:00
Hui Zhu	487171d992	runtime-rs: Add reclaim_guest_freed_memory qemu support Add reclaim_guest_freed_memory config to qemu in runtime-rs. Fixes: #10710 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-01-21 10:34:18 +08:00
Hui Zhu	8f550de88a	runtime-rs: db: Change config enable_balloon_f_reporting Change config enable_balloon_f_reporting of db to reclaim_guest_freed_memory. Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-01-21 10:34:08 +08:00
Hui Zhu	42f5ef9ff1	kernel: config: Add CONFIG_VIRTIO_BALLOON to virtio.conf Add CONFIG_VIRTIO_BALLOON to virtio.conf to open virtio-balloon. Fixes: #10710 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-01-21 10:34:04 +08:00
Zvonko Kaiser	8b097244e7	gpu: Add rootfs initrd build for arm64 We need the arm64 builds as well for GH and GB systems. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-20 19:03:52 +00:00
Zvonko Kaiser	f525631522	gpu: Add rootfs target amd64 Adding the initrd build first to get the rootfs on amd64. With that we can start to add tests. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-20 19:01:42 +00:00
Zvonko Kaiser	d7059e9024	Merge pull request #10736 from zvonkok/gpu-rootfs-fix gpu: Fix rootfs build	2025-01-17 14:44:41 -05:00
Aurélien Bombo	0d70dc31c1	ci: Unify on $GH_PR_NUMBER environment variable While working on #10559, I realized that some parts of the codebase use $GH_PR_NUMBER, while other parts use $PR_NUMBER. Notably, in that PR, since I used $GH_PR_NUMBER for CoCo non-TEE tests without realizing that TEE tests use $PR_NUMBER, the tests on that PR fail on TEEs: https://github.com/kata-containers/kata-containers/actions/runs/12818127344/job/35744760351?pr=10559#step:10:45 ... 44 error: error parsing STDIN: error converting YAML to JSON: yaml: line 90: mapping values are not allowed in this context ... 135 image: ghcr.io/kata-containers/csi-kata-directvolume: ... So let's unify on $GH_PR_NUMBER so that this issue doesn't repro in the future: I replaced all instances of PR_NUMBER with GH_PR_NUMBER. Note that since some test scripts also refer to that variable, the CI for this PR will fail (would have also happened with the converse substitution), hence I'm not adding the ok-to-test label and we should force-merge this after review. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-01-17 10:53:08 -06:00
Fabiano Fidêncio	c018a1cc61	Merge pull request #10741 from RuoqingHe/update-virtiofsd-build-image virtiofsd: Update ubuntu to 22.04 for gnu target	2025-01-16 20:51:10 +01:00
Zvonko Kaiser	2777b13db7	Merge pull request #10742 from zvonkok/3.13.0-release release: Bump version to 3.13.0	2025-01-16 10:05:48 -05:00
Ruoqing He	c70195d629	virtiofsd: Update ubuntu to 22.04 for gnu target With ubuntu 20.04 image, virtiofsd gnu target couldn't be built due to "unsupported ISA subset z" reported by "cc". Updating to ubuntu 22.04 image addresses this problem. Relates: #10739 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-01-16 17:27:38 +08:00
Zvonko Kaiser	e82fdee20f	runtime: Add proper IOMMUFD parsing With newer kernels we have a new backend for VFIO called IOMMUFD. This is a departure from VFIO IOMMU Groups, since it has only one device associated with an IOMMUFD entry. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-15 23:39:33 +00:00
Zvonko Kaiser	f0bd83b073	gpu: Fix rootfs build The pyinstaller is located by default under /usr/local/bin; some prior versions were installing it to ${HOME}. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-15 20:37:51 +00:00
Aurélien Bombo	0d93f59f5b	Merge pull request #10738 from microsoft/danmihai1/empty-pty-lines runtime: skip empty Guest console output lines	2025-01-15 10:33:24 -06:00
Zvonko Kaiser	0b04f43ac6	release: Bump version to 3.13.0 Bump VERSION and helm-chart versions Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-15 16:13:22 +00:00
Zvonko Kaiser	365def9b4a	Merge pull request #10735 from BbolroC/kubectl-create-retry-trusted-storage tests: Introduce retry_kubectl_apply() for trusted storage	2025-01-14 21:59:45 -05:00
Dan Mihai	2e21f51375	runtime: skip empty Guest console output lines Skip logging empty lines of text from the Guest console output, if there are any such lines. Without this change, the Guest console log from CLH + /dev/pts/0 has twice as many lines of text. Half of these lines are empty. Fixes: #10737 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-01-15 00:28:26 +00:00
Hyounggyu Choi	f7816e9206	tests: Introduce retry_kubectl_apply() for trusted storage On s390x, some tests for trusted storage occasionally failed due to: ```bash etcdserver: request timed out ``` or ```bash Internal error occurred: resource quota evaluation timed out ``` These timeouts were not observed previously on k3s but occur sporadically on kubeadm. Importantly, they appear to be temporary and transient, which means they can be ignored in most cases. To address this, we introduced a new wrapper function, `retry_kubectl_apply()`, for `kubectl create`. This function retries applying a given manifest up to 5 times if it fails due to a timeout. However, it will still catch and handle any other errors during pod creation. Fixes: #10651 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-14 21:15:44 +01:00
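The retry behaviour described in the commit above can be sketched as follows. The real `retry_kubectl_apply()` is a shell helper in the test suite; this is only an illustration of the logic, with a hypothetical `retry_apply` function and an injected `run` callable standing in for the `kubectl create` invocation. Only the two timeout messages and the limit of 5 attempts come from the commit message; the delay parameter is an assumption.

```python
import time
from typing import Callable, Tuple

# Transient error messages quoted in the commit message.
TRANSIENT_ERRORS = (
    "etcdserver: request timed out",
    "resource quota evaluation timed out",
)


def retry_apply(run: Callable[[], Tuple[int, str]],
                max_tries: int = 5,
                delay_s: float = 0.0) -> int:
    """Call run() until it succeeds, retrying only on transient timeouts.

    run() is a hypothetical stand-in that returns (exit_code, output).
    Returns the number of attempts used; raises RuntimeError on any
    non-timeout failure or when the attempts are exhausted.
    """
    for attempt in range(1, max_tries + 1):
        exit_code, output = run()
        if exit_code == 0:
            return attempt
        transient = any(msg in output for msg in TRANSIENT_ERRORS)
        if not transient or attempt == max_tries:
            raise RuntimeError(output)
        time.sleep(delay_s)
    raise RuntimeError("max_tries must be at least 1")
```

Errors other than the listed timeouts are raised on the first attempt, so genuine pod creation failures still surface immediately.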
Fabiano Fidêncio	121ac0c5c0	Merge pull request #10727 from microsoft/danmihai1/mariner3-guest image: bump mariner guest version to 3.0	2025-01-14 19:06:28 +01:00
Fabiano Fidêncio	3658ea2320	Merge pull request #10731 from microsoft/danmihai1/quiet-rootfs-build rootfs: reduced console output by default	2025-01-14 19:02:42 +01:00
Chengyu Zhu	7d34ca4420	Merge pull request #10674 from bpradipt/fix-10398 agent: alternative implementation for sealed_secret as volume	2025-01-14 18:55:45 +08:00
Fabiano Fidêncio	4578969c5d	Merge pull request #10730 from BbolroC/bump-coco-trustee versions: Bump trustee to latest	2025-01-14 08:56:11 +01:00
Dan Mihai	c4da296326	rootfs: delete links to deleted files Delete symbolic links to files being deleted. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-01-13 21:28:44 +00:00
Dan Mihai	5b8471ffce	rootfs: print the path to files being deleted Show the list of files being deleted. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-01-13 21:28:34 +00:00
Dan Mihai	a49d0fb343	rootfs: delete systemd units/files from rootfs.sh Move the deletion of unnecessary systemd units and files from image_builder.sh into rootfs.sh. The files being deleted can be applicable to other image file formats too, not just to the rootfs-image format created by image_builder.sh. Also, image_builder.sh was deleting these files after it calculated the size of the rootfs files, thus missing out on the opportunity to possibly create a smaller image file. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-01-13 21:28:23 +00:00
Dan Mihai	0f522c09d9	rootfs: reduced console output by default Use "set -x" only when the user specified DEBUG=1. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-01-13 19:34:05 +00:00
Pradipta Banerjee	36580bb642	tests: Update sealed secret CI value to base64url The existing encoding was base64 and it fails due to `874948638a` Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>	2025-01-13 09:37:05 -05:00
Hyounggyu Choi	2cdb549a75	versions: Bump trustee to latest This update addresses an issue with token verification for SE and SNP introduced in the last update by #10541. Bumping the project to the latest commit resolves the issue. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-13 15:07:33 +01:00
Pradipta Banerjee	5218345e34	agent: alternative implementation for sealed_secret as volume The earlier implementation relied on using a specific mount-path prefix - `/sealed` to determine that the referenced secret is a sealed secret. However that was restrictive for certain use cases as it forced the user to always use a specific mountpath naming convention. This commit introduces an alternative implementation to relax the restriction. A sealed secret can be mounted in any mount-path. However it comes with a potential performance penalty. The implementation loops through all volume mounts and reads the file to determine if it's a sealed secret or not. Fixes: #10398 Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>	2025-01-11 12:36:44 -05:00
Dan Mihai	4707883b40	image: bump mariner guest version to 3.0 Use Mariner 3.0 (a.k.a., Azure Linux 3.0) as the Guest CI image. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-01-11 17:36:19 +00:00
Fabiano Fidêncio	2d9baf899a	Merge pull request #10719 from msanft/msanft/runtime/fix-boolean-opts runtime: use actual booleans for QMP `device_add` boolean options	2025-01-11 16:38:06 +01:00
Zvonko Kaiser	f08a9eac11	Merge pull request #10721 from stevenhorsman/more-metrics-latency-minimum-range-fixes metrics: Increase latency test range	2025-01-10 21:59:39 -05:00
Moritz Sanft	e5735b221c	runtime: use actual booleans for QMP `device_add` boolean options Since `be93fd5372`, which is included in QEMU since version 9.2.0, the options for the `device_add` QMP command need to be typed correctly. This makes it so that instead of `"on"`, the value is set to `true`, matching QEMU's expectations. This has been tested on QEMU 9.2.0 and QEMU 9.1.2, so before and after the change. The compatibility with incorrectly typed options for the `device_add` command is deprecated since version 6.2.0 [^1]. [^1]: https://qemu-project.gitlab.io/qemu/about/deprecated.html#incorrectly-typed-device-add-arguments-since-6-2 Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>	2025-01-10 11:53:56 +01:00
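The difference between the old and the correctly typed `device_add` arguments can be illustrated with a small sketch. The helper `device_add_command` is hypothetical, and the driver, id, and `share-rw` property are example values chosen for illustration, not taken from the commit; only the `"on"` versus `true` distinction comes from the commit message.

```python
import json


def device_add_command(driver: str, dev_id: str, **props) -> str:
    """Serialize a QMP device_add command (hypothetical helper).

    Keyword names use underscores and are converted to the dashed
    property names QEMU uses.
    """
    arguments = {"driver": driver, "id": dev_id}
    for name, value in props.items():
        arguments[name.replace("_", "-")] = value
    return json.dumps({"execute": "device_add", "arguments": arguments})


# Deprecated style: the boolean option is sent as the string "on".
legacy = device_add_command("virtio-blk-pci", "disk0", share_rw="on")

# Correctly typed style: the option is sent as a JSON boolean.
typed = device_add_command("virtio-blk-pci", "disk0", share_rw=True)

if __name__ == "__main__":
    print(legacy)
    print(typed)
```

Python's `True` serializes to the JSON literal `true`, which is the form the commit switches to.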
Wainer Moschetta	5fae2a9f91	Merge pull request #9871 from wainersm/fix-print_cluster_name tests/gha-run-k8s-common: shorten AKS cluster name	2025-01-09 14:35:02 -03:00
stevenhorsman	aaae5b6d0f	metrics: clh: Increase network-iperf3 range We hit a failure with: ``` time="2025-01-09T09:55:58Z" level=warning msg="Failed Minval (0.017600 > 0.015000) for [network-iperf3]" ``` The range is very big, but in the last 3 test runs I reviewed we have got a minimum value of 0.015s and a max value of 0.052, so there is a ~350% difference possible so I think we need to have a wide range to make this stable. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-09 11:25:57 +00:00
stevenhorsman	e946d9d5d3	metrics: qemu: Increase latency test range After the kernel version bump, in the latest nightly run https://github.com/kata-containers/kata-containers/actions/runs/12681309963/job/35345228400 The sequential read throughput result was 79.7% of the expected (so failed) and the sequential write was 84% of the expected, so was fairly close, so increase their minimum ranges to make them more robust. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-09 11:25:50 +00:00
Wainer dos Santos Moschetta	badc208e9a	tests/gha-run-k8s-common: shorten AKS cluster name Because az client restricts the name to be less than 64 characters. In some cases (e.g. KATA_HYPERVISOR=qemu-runtime-rs) the generated name will exceed the limit. This changed the function to shorten the name: * SHA1 is computed from metadata then compound the cluster's name * metadata as plain-text are passed as --tags Fixes: #9850 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-01-08 16:39:07 -03:00
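The name-shortening approach described in the commit above can be sketched as follows. The actual change lives in a shell script; this is only an illustration under stated assumptions: the function `short_cluster_name`, the `prefix` argument, the metadata keys, and the 12-character digest length are all hypothetical. Only the use of SHA1 over the metadata, the limit of fewer than 64 characters, and passing the plain-text metadata as tags come from the commit message.

```python
import hashlib
from typing import Dict, Tuple

# The commit states the name must be less than 64 characters.
AZ_NAME_LIMIT = 64


def short_cluster_name(prefix: str,
                       metadata: Dict[str, str],
                       digest_len: int = 12) -> Tuple[str, Dict[str, str]]:
    """Return (cluster_name, tags) with a SHA1-derived suffix.

    The metadata is hashed in a stable key order so the same inputs
    always yield the same name; the plain-text metadata is returned
    unchanged so it can be attached as tags.
    """
    plain = "-".join(f"{key}={value}" for key, value in sorted(metadata.items()))
    digest = hashlib.sha1(plain.encode("utf-8")).hexdigest()[:digest_len]
    name = f"{prefix}-{digest}"
    if len(name) >= AZ_NAME_LIMIT:
        raise ValueError(f"cluster name too long: {len(name)} characters")
    return name, dict(metadata)


if __name__ == "__main__":
    name, tags = short_cluster_name(
        "kata",
        {"hypervisor": "qemu-runtime-rs", "pr": "1234", "host_os": "ubuntu"},
    )
    print(name, tags)
```

Hashing keeps the name length fixed regardless of how long the individual metadata values are, while the tags keep the cluster identifiable.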
Fabiano Fidêncio	8f8988fcd1	Merge pull request #10714 from fidencio/topic/update-virtiofsd virtiofsd: Update to its v1.13.0 ( + one patch) release :-)	2025-01-08 17:59:29 +01:00
Fabiano Fidêncio	7e5e109255	Merge pull request #10541 from fitzthum/bump-trustee-010 Update Trustee and Guest Components	2025-01-08 17:44:13 +01:00
Fabiano Fidêncio	eb3fe0d27c	Merge pull request #10717 from fidencio/topic/re-enable-oom-test-for-mariner tests: Re-enable oom tests for mariner	2025-01-08 17:43:56 +01:00
Fabiano Fidêncio	65e267294b	Merge pull request #10718 from stevenhorsman/metrics-blogbench-latency-minimal-range-increase metrics: Increase latency minimum range	2025-01-08 17:09:36 +01:00
stevenhorsman	dc069d83b5	metrics: Increase latency test range The bump to kernel 6.12 seems to have reduced the latency in the metrics test, so increase the ranges for the minimal value, to account for this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-08 15:11:49 +00:00
Fabiano Fidêncio	967d5afb42	Revert "tests: k8s: Skip one of the empty-dir tests" This reverts commit `9aea7456fb`. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-08 14:07:34 +01:00
Fabiano Fidêncio	7ae2ca4c31	virtiofsd: Update to its v1.13.0 + one patch release Together with the bump, let's also bump the rust version needed to build the package, with the caveat that virtiofsd doesn't actually use a pinned version as part of their CI, so we're bumping to whatever is the version on `alpine:rust` (which is used in their CI). It's important to note that we're using a version which brings in one extra patch apart from the release, as the next virtiofsd release will happen at the end of February, 2025. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-08 14:07:34 +01:00
Fabiano Fidêncio	0af3536328	packaging: virtiofsd: Allow building a specific commit Right now we've been only building releases from virtiofsd, but we'll need to pin a specific commit till v1.14.0 is out, thus let's add the needed machinery to do so. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-08 14:07:34 +01:00
Tobin Feldman-Fitzthum	41c7f076fa	packaging: updating guest components build script The guest-components directory has been re-arranged slightly. Adjust the installation path of the LUKS helper script to account for this. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-01-07 16:59:10 -06:00
Tobin Feldman-Fitzthum	cafc7d6819	versions: update trustee and guest components Trustee has some new features including a plugin backend, support for PKCS11 resources, improvements to token verification, and adjustments to logging, and more. Also update guest-components to pickup improvements and keep the KBS protocol in sync. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-01-07 16:59:10 -06:00
Fabiano Fidêncio	53ac0f00c5	tests: Re-enable oom tests for mariner Since we bumped to the 6.12.x LTS kernel, we've also adjusted the aggressivity of the OOM test, which may be enough to allow us to re-enable it for mariner. Fixes: #8821 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-07 18:33:17 +01:00
Fabiano Fidêncio	f4a39e8c40	Merge pull request #10468 from fidencio/topic/early-tests-on-next-lts-kernel versions: Move kernel to the latest 6.12 release (the current LTS)	2025-01-07 18:02:04 +01:00
Fupan Li	bd56891f84	Merge pull request #10702 from lifupan/fix_containerdname CI: change the containerd tarball name from cri-containerd-cni to containerd	2025-01-07 18:56:15 +08:00
Fupan Li	b19db40343	CI: change the containerd tarball name to containerd Since https://github.com/containerd/containerd/pull/9096, containerd has removed the cri-containerd-*.tar.gz release bundles, thus we'd better change the tarball name to "containerd". BTW, the containerd tarball contains the following files: bin/ bin/containerd-shim bin/ctr bin/containerd-shim-runc-v1 bin/containerd-stress bin/containerd bin/containerd-shim-runc-v2 thus we should untar containerd into the /usr/local directory instead of "/" to stay aligned with cri-containerd. In addition, there's no containerd.service file, runc binary, or cni-plugin included, thus we should add a specific containerd.service file and install the runc binary and cni-plugin separately. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-01-07 17:39:05 +08:00
Fabiano Fidêncio	9aea7456fb	tests: k8s: Skip one of the empty-dir tests An issue has been created for this, and we should fix the issue before the next release. However, for now, let's unblock the kernel bump and have the test skipped. Reference: https://github.com/kata-containers/kata-containers/issues/10706 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-06 21:48:20 +01:00
Fabiano Fidêncio	44ff602c64	tests: k8s: Be more aggressive to get OOM Let's increase the amount of bytes allocated per VM worker, so we can hit the OOM sooner. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-06 21:48:20 +01:00
Fabiano Fidêncio	f563f0d3fc	versions: Update kernel to v6.12.8 There are lots of configs removed from latest kernel. Update them here for convenience of next kernel upgrade. Remove CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE [1] Remove CONFIG_IP_NF_TARGET_CLUSTERIP [2] Remove CONFIG_NET_SCH_CBQ [3] Remove CONFIG_AUTOFS4_FS [4] Remove CONFIG_EMBEDDED [5] Remove CONFIG_ARCH_RANDOM & CONFIG_RANDOM_TRUST_CPU [6] [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=a7e4676e8e2cb158a4d24123de778087955e1b36 [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=9db5d918e2c07fa09fab18bc7addf3408da0c76f [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=051d442098421c28c7951625652f61b1e15c4bd5 [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=1f2190d6b7112d22d3f8dfeca16a2f6a2f51444e [5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=ef815d2cba782e96b9aad9483523d474ed41c62a [6] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.2&id=b9b01a5625b5a9e9d96d14d4a813a54e8a124f4b Apart from the removals, CONFIG_CPU_MITIGATIONS is now a dependency for CONFIG_RETPOLINE (which has been renamed to CONFIG_MITIGATION_RETPOLINE) and CONFIG_PAGE_TABLE_ISOLATION (which has been renamed to CONFIG_MITIGATION_PAGE_TABLE_ISOLATION). I've added that to the whitelist because we still build older versions of the kernel that do not have that dependency. Fixes: #8408 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-06 21:48:20 +01:00
Xuewei Niu	71b14d40f2	Merge pull request #10696 from teawater/kt kata-ctl: direct-volume: Auto create KATA_DIRECT_VOLUME_ROOT_PATH	2025-01-02 14:04:37 +08:00
Hui Zhu	d15a7baedd	kata-ctl: direct-volume: Auto create KATA_DIRECT_VOLUME_ROOT_PATH Got following issue: kata-ctl direct-volume add /kubelet/kata-direct-vol-002/directvol002 "{\"device\": \"/home/t4/teawater/coco/t.img\", \"volume-type\": \"directvol\", \"fstype\": \"\", \"metadata\":"{}", \"options\": []}" subsystem: kata-ctl_main Dec 30 09:43:41.150 ERRO Os { code: 2, kind: NotFound, message: "No such file or directory", } The reason is that KATA_DIRECT_VOLUME_ROOT_PATH does not exist. This commit calls create_dir_all on KATA_DIRECT_VOLUME_ROOT_PATH before join_path to handle this issue. Fixes: #10695 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-30 17:55:49 +08:00
Xuewei Niu	6400295940	Merge pull request #10683 from justxuewei/nxw/remove-mut	2024-12-29 00:49:38 +08:00
Fupan Li	2068801b80	Merge pull request #10626 from teawater/ma Add mem-agent to kata	2024-12-24 14:11:36 +08:00
Steve Horsman	2322f6df94	Merge pull request #10686 from stevenhorsman/ppc64le-all-prepare-steps-timeout workflows: Add more ppc64le timeouts	2024-12-20 19:08:48 +00:00
stevenhorsman	9b6fce9e96	workflows: Add more ppc64le timeouts Unsurprisingly, now that we've got past the containerd test hangs on ppc64le, we are hitting others in the "Prepare the self-hosted runner" stage, so add timeouts to all of them to avoid CI blockages. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-20 17:31:24 +00:00
Steve Horsman	162e2af4f5	Merge pull request #10685 from stevenhorsman/ppc64le-containerd-test-timeout workflows: Add timeout to some ppc64le steps	2024-12-20 16:55:40 +00:00
stevenhorsman	d9d8d53bea	workflows: Add timeout to some ppc64le steps In some runs e.g. https://github.com/kata-containers/kata-containers/actions/runs/12426384186/job/34697095588 and https://github.com/kata-containers/kata-containers/actions/runs/12422958889/job/34697016842 we've seen the Prepare the self-hosted runner and Install dependencies steps get stuck for 5+ hours. If they are working then it should take a few minutes, so let's add timeouts and not hold up the whole CI if they are stuck. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-20 16:37:36 +00:00
Steve Horsman	99f239bc44	Merge pull request #10380 from stevenhorsman/required-tests-guidance doc: Add required jobs info	2024-12-20 16:24:42 +00:00
stevenhorsman	d1d4bc43a4	static-checks: Add words to dictionary devmapper and snapshotters are being marked as spelling errors, so add them to the kata dictionary Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-20 14:16:52 +00:00
stevenhorsman	7612839640	doc: Add required jobs info Add information about what required jobs are and our initial guidelines for how jobs are eligible for being made required, or non-required Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-20 14:12:13 +00:00
Xuewei Niu	ecf98e4db8	runtime-rs: Remove unneeded `mut` from `new_hypervisor()` `set_hypervisor_config()` and `set_passfd_listener_port()` acquire inner lock, so that `mut` for `hypervisor` is unneeded. Fixes: #10682 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-12-20 17:08:10 +08:00
Steve Horsman	2c6126d3ab	Merge pull request #10676 from stevenhorsman/fix-qemu-coco-dev-skip tests: Fix qemu-coc-dev skip	2024-12-20 08:56:54 +00:00
Xuewei Niu	ea60613be9	Merge pull request #9387 from deagon/fix-broken-usage packaging: fix the broken usage help	2024-12-20 15:20:37 +08:00
Guoqiang Ding	75baf75726	packaging: fix the broken usage help Using the plain usage text instead of the bad variable reference. Fixes: #9386 Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>	2024-12-20 13:58:40 +08:00
stevenhorsman	dd02b6699e	tests: Fix qemu-coc-dev skip Fix the logic to make the test skipped on qemu-coco-dev, rather than the opposite, and update the syntax to make it clearer, as it was incorrectly written and reviewed by three different people in its prior form. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-19 19:50:46 +00:00
Steve Horsman	79495379e2	Merge pull request #10668 from stevenhorsman/update-release-process-post-3.12 doc: Update the release process	2024-12-19 14:16:30 +00:00
Steve Horsman	99b9ef4e5a	Merge pull request #10675 from stevenhorsman/release-repeat-abort release: Abort if release version exists	2024-12-19 11:55:44 +00:00
stevenhorsman	c3f13265e4	doc: Update the release process Add a step to wait for the payload publish to complete before running the release action. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-19 09:52:39 +00:00
Zvonko Kaiser	f2d72874a1	Merge pull request #10620 from kata-containers/topic/fix-remove-artifact-ordering workflows: Remove potential timing issues with artifacts	2024-12-18 13:22:12 -05:00
Zvonko Kaiser	fc2c77f3b6	Merge pull request #10669 from zvonkok/qemu-aarch64-fix qemu: Fix aarch64 build	2024-12-18 08:26:55 -05:00
stevenhorsman	e2669d4acc	release: Abort if release version exists In order to check that we don't accidentally overwrite release artifacts, we should add a check if the release name already exists and bail if it does. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-18 11:04:19 +00:00
Zvonko Kaiser	07d2b00863	qemu: Fix aarch64 build Building static binaries for aarch64 requires disabling PIE. We get a GOT overflow, and the OS libraries are only built with fpic and not with fPIC, which enables unlimited-size GOT tables. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-18 03:26:14 +00:00
Zvonko Kaiser	39bf10875b	Merge pull request #10663 from zvonkok/3.12.0-relase release: Bump version to 3.12.0	2024-12-17 10:00:42 -05:00
Zvonko Kaiser	28b57627bd	release: Bump version to 3.12.0 Bump VERSION and helm-chart versions Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-16 18:41:51 +00:00
Xuewei Niu	02b5fa15ac	Merge pull request #10655 from liubogithub/patch-1 kata-ctl: fix outdated comments	2024-12-16 13:11:25 +08:00
Hyounggyu Choi	cfbc425041	Merge pull request #10660 from BbolroC/fix-leading-zero-issue-for-vfio-ap vfio-ap: Assign default string "0" for empty APID and APQI	2024-12-13 17:40:29 +01:00
Hyounggyu Choi	341e5ca58e	vfio-ap: Assign default string "0" for empty APID and APQI The current script logic assigns an empty string to APID and APQI when APQN consists entirely of zeros (e.g., "00.0000"). However, this behavior is incorrect, as "00" and "0000" are valid values and should be represented as "0". This commit ensures that the script assigns the default string “0” to APID and APQI if their computed values are empty. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-12-13 14:39:03 +01:00
Liu Bo	95fc585103	kata-ctl: fix outdated comments MgmnClient can also tolerate short sandbox id. Signed-off-by: Liu Bo <liub.liubo@gmail.com>	2024-12-12 21:59:54 -08:00
stevenhorsman	cf8b82794a	workflows: Only remove artifacts in release builds The agent-api tests require the agent to be deployed in the CI by the tarball, so in the short term let's only do this on the release stage, so that both kata-manager works with the release and the agent-api tests work with the other CI builds. In the longer term we need to re-evaluate what is in our tarballs (issue #10619), but want to unblock the tests in the short term. Fixes: #10630 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-12 17:38:27 +00:00
stevenhorsman	e1f6aca9de	workflows: Remove potential timing issues with artifacts With the code I originally did I think there is potentially a case where we can get a failure due to timing of steps. Before this change the `build-asset-shim-v2` job could start the `get-artifacts` step and concurrently `remove-rootfs-binary-artifacts` could run and delete the artifact during the download and result in the error. In this commit, I try to resolve this by making sure that the shim build waits for the artifact deletes to complete before starting. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-12 16:52:54 +00:00
Fabiano Fidêncio	7b0c1d0a8c	Merge pull request #10492 from zvonkok/upgrade-qemu-9.1.0 qemu: Upgrade qemu 9.1.2	2024-12-12 08:15:39 +01:00
Fupan Li	07fe7325c2	Merge pull request #10643 from justxuewei/fix-bind-vol runtime-rs & agent: Fix the issues with bind volumes	2024-12-12 11:34:52 +08:00
Fupan Li	372346baed	Merge pull request #10641 from justxuewei/fix-build-type runtime-rs: Ignore BUILD_TYPE if it is not release	2024-12-12 11:32:49 +08:00
Xuewei Niu	5f1b1d8932	Merge pull request #10638 from justxuewei/fix-stderr-fifo runtime-rs: Fix the issues with stderr fifo	2024-12-12 10:03:46 +08:00
Fabiano Fidêncio	a5c863a907	Merge pull request #10581 from ryansavino/snp-enable-skipped Revert "ci: Skip the failing tests in SNP"	2024-12-11 18:22:17 +01:00
Zvonko Kaiser	cc9ecedaea	qemu: Bump version, new options, add no_patches We want to have the latest QEMU version available which is as of this writing v9.1.2 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> qemu: Add new options for 9.1.2 We need to fence specific options depending on the version and disable ones that are not needed anymore Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> qemu: Add no_patches.txt Since we do not have any patches for this version let's create the appropriate files. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:32:39 +00:00
Zvonko Kaiser	69ed4bc3b7	qemu: Add depedency The new QEMU build needs python-tomli, now that we bumped Ubuntu we can include the needed tomli package Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:32:20 +00:00
Zvonko Kaiser	c82db45eaa	qemu: Disable pmem We're disabling pmem support; it is heavily broken with Ubuntu's static build of QEMU and not needed. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:32:19 +00:00
Zvonko Kaiser	a88174e977	qemu: Replace from source build with package In jammy we have the liburing package available, hence remove the source build and include the package. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:22:54 +00:00
Zvonko Kaiser	c15f77737a	qemu: Bump Ubuntu version in Dockerfile We need jammy for a new package that is not available in focal Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:22:54 +00:00
Zvonko Kaiser	eef2795226	qemu: Use proper QEMU builder Do not use hardcoded abs path. Use the deduced rel path. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:22:54 +00:00
Zvonko Kaiser	e604e51b3d	qemu: Build as user We moved all other artifacts to be built as a user; QEMU should not be the exception. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:22:54 +00:00
Zvonko Kaiser	1d56fd0308	qemu: Remove abs path We want to stick with the other build scripts and only use relative paths. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:22:54 +00:00
Ryan Savino	7d45382f54	Revert "ci: Skip the failing tests in SNP" This reverts commit `2242aee099`.	2024-12-10 16:20:31 -06:00
Xuewei Niu	3fb91dd631	agent: Fix the issues with bind volumes The mount type should be considered as empty if the value is `Some("none")`. Fixes: #10642 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-12-11 00:51:32 +08:00
Xuewei Niu	59ed19e8b2	runtime-rs: Fix the issues with bind volumes This patch fixes the logic of getting the type of volume: when the type of OCI mount is Some("none") and the options have "bind" or "rbind", the type will be considered as "bind". Fixes: #10642 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-12-11 00:50:36 +08:00
Xuewei Niu	2424c1a562	runtime-rs: Ignore BUILD_TYPE if it is not release This patch fixes that by adding `--release` only if `BUILD_TYPE=release`. Fixes: #10640 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-12-11 00:27:28 +08:00
Xuewei Niu	b4695f6303	runtime-rs: Fix the issues with stderr fifo When tty is enabled, stderr fifo should never be opened. Fixes: #10637 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-12-10 21:48:52 +08:00
Aurélien Bombo	037281d699	Merge pull request #10593 from microsoft/saulparedes/improve_namespace_validation policy: improve pod namespace validation	2024-12-09 11:55:09 -06:00
Steve Horsman	9b7fb31ce6	Merge pull request #10631 from stevenhorsman/action-lint-workflow Action lint workflow	2024-12-09 09:33:07 +00:00
Fabiano Fidêncio	bec1de7bd7	Merge pull request #10548 from Sumynwa/sumsharma/clh_tweak_vm_configs runtime: Set memory config shared=false when shared_fs=None in CLH.	2024-12-06 23:15:29 +01:00
Sumedh Alok Sharma	ac4f986e3e	runtime: Set memory config shared=false when shared_fs=None in CLH. This commit sets the memory config `shared` to false in Cloud Hypervisor when creating a VM with shared_fs=None && hugePages = false. Currently, in runtime/virtcontainers/clh.go, the memory config `shared` is set to true by default. As per the CLH memory document, (a) shared=true is needed in cases such as virtio_fs, since the virtiofs daemon runs as a separate process from clh. (b) For shared_fs=none + hugepages=false, shared=false can be set to use private anonymous memory for the guest (with no file backing). (c) Another memory config, thp (use transparent huge pages), is always enabled by default. As per the documentation, (b) + (c) can be used in combination. However, with the current CLH implementation, the above combination cannot be used since shared=true is always set. Fixes #10547 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-12-06 21:22:51 +05:30
stevenhorsman	b4b3471bcb	workflows: linting: Fix shellcheck SC1001 > This \/ will be a regular '/' in this context Remove ignored escape Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	491210ed22	workflows: linting: Fix shellcheck SC2006 > Use $(...) notation instead of legacy backticks `...` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	5d7c5bdfa4	workflows: linting: Fix shellcheck SC2015 > A && B \|\| C is not if-then-else. C may run when A is true Refactor the echo so that we can't get into a situation where the retry of workspace delete happens if the original one was successful Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	c2ba15c111	workflows: linting: Fix shellcheck SC2206 > Quote to prevent word splitting/globbing Double quote variables expanded in an array Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	007514154c	workflows: linting: Fix shellcheck SC2068 > Double quote array expansions to avoid re-splitting elements Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	4ef05c6176	workflows: linting: Fix shellcheck SC2116 > Useless echo? Instead of 'cmd $(echo foo)', just use 'cmd foo' Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	f02d540799	workflows: Bump outdated action versions Bump some actions that are significantly out-of-date and out of sync with the versions used in other workflows Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	935327b5aa	workflows: linting: Fix shellcheck SC2046 > Quote this to prevent word splitting. Quote around subshell Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	e93ed6c20e	workflows: linting: Add tdx labels The tdx runners got split into two different runners, so we need to update the known self-hosted runner labels Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	d4bd314d52	workflows: linting: Fix incorrect properties These properties are currently invalid, so either fix, or remove them Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	9113606d45	workflows: linting: Fix shellcheck SC2086 > Double quote to prevent globbing and word splitting. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	42cd2ce6e4	workflows: Add actionlint workflows On PRs that update anything in the workflows directory, add an actionlint run to validate our workflow files for errors and hopefully catch issues earlier. Fixes: #9646 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 11:36:08 +00:00
Fabiano Fidêncio	a93ff57c7d	Merge pull request #10627 from kata-containers/topic/release-helm-charm-tarball release: helm: Add the chart as part of the release	2024-12-06 11:22:43 +01:00
Fabiano Fidêncio	300a827d03	release: helm: Add the chart as part of the release So users can simply download the chart and use it accordingly without the need to download the full repo. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-12-06 11:19:34 +01:00
Fabiano Fidêncio	652662ae09	Merge pull request #10551 from fidencio/topic/kata-deploy-allow-multi-deployment kata-deploy: Add support to multi-installation	2024-12-06 11:16:20 +01:00
Hui Zhu	d3a6bcdaa5	runtime-rs: configuration-dragonball.toml.in: Add config for mem-agent Add config for mem-agent to configuration-dragonball.toml.in. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:28 +08:00
Hui Zhu	2b6caf26e0	agent-ctl: Add mem-agent API support Add sub commands MemAgentMemcgSet and MemAgentCompactSet to agent-ctl to configure the mem-agent inside the running kata-containers. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:24 +08:00
Hui Zhu	cb86d700a6	config: Add config of mem-agent Add config of mem-agent to configure the mem-agent. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:20 +08:00
Hui Zhu	692ded8f96	agent: add support for MemAgentMemcgSet and MemAgentCompactSet Add MemAgentMemcgSet and MemAgentCompactSet to agent API to set the config of mem-agent memcg and compact. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:16 +08:00
Hui Zhu	f84ad54d97	agent: Start mem-agent in start_sandbox mem-agent will run with kata-agent. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:13 +08:00
Hui Zhu	74a17f96f4	protocols/protos/agent.proto: Add mem-agent support Add MemAgentMemcgConfig and MemAgentCompactConfig to AgentService. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:09 +08:00
Hui Zhu	ffc8390a60	agent: Add mem-agent to Cargo.toml Add mem-agent to Cargo.toml of agent. mem-agent will be integrated into kata-agent. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:05 +08:00
Hui Zhu	4407f6e098	mem-agent: Add to src mem-agent is a component designed for managing memory in Linux environments. Sub-feature memcg: Utilizes the MgLRU feature to monitor each cgroup's memory usage and periodically reclaim cold memory. Sub-feature compact: Periodically compacts memory to facilitate the kernel's free page reporting feature, enabling the release of more idle memory from guests. During memory reclamation and compaction, mem-agent monitors system pressure using Pressure Stall Information (PSI). If the system pressure becomes too high, memory reclamation or compaction will automatically stop. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:02 +08:00
Hui Zhu	f9c63d20a4	kernel/configs: Add mglru, debugfs and psi to dragonball-experimental Add mglru, debugfs and psi to dragonball-experimental/mem_agent.conf to support mem_agent function. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 09:59:59 +08:00
Fabiano Fidêncio	111082db07	kata-deploy: Add support to multi-installation This is super useful for development / debugging scenarios, mainly when dealing with limited hardware availability, as this change allows multiple people to develop into one single machine, while still using kata-deploy. Fixes: #10546 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-12-05 17:42:53 +01:00
Fabiano Fidêncio	0033a0c23a	kata-deploy: Adjust paths for qemu-coco-dev as well I missed that when working on the INSTALL_PREFIX feature, so adding it now. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-12-05 17:42:53 +01:00
Fabiano Fidêncio	62b3a07e2f	kata-deploy: helm: Add overlooked INSTALLATION_PREFIX env var At the same time that INSTALLATION_PREFIX was added, I was working on the helm changes to properly do the cleanup / deletion when it's removed. However, I missed adding the INSTALLATION_PREFIX env var there, which I'm doing now. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-12-05 17:42:53 +01:00
Steve Horsman	5d96734831	Merge pull request #10572 from ldoktor/gk-stalled-results ci.gatekeeper: Update existing results	2024-12-04 19:02:14 +00:00
Wainer Moschetta	a94982d8b8	Merge pull request #10617 from stevenhorsman/skip-k8s-job-test-on-non-tee tests: Skip k8s job test on qemu-coco-dev	2024-12-04 15:47:33 -03:00
Saul Paredes	84a411dac4	policy: improve pod namespace validation - Remove default_namespace from settings - Ensure container namespaces in a pod match each other in case no namespace is specified in the YAML Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-12-04 10:17:54 -08:00
Steve Horsman	c86f76d324	Merge pull request #10588 from stevenhorsman/metrics-clh-min-range-relaxation metrics: Increase minval range for failing tests	2024-12-04 16:10:26 +00:00
stevenhorsman	a8ccd9a2ac	tests: Skip k8s job test on qemu-coco-dev The test is unstable on this platform, so skip it for now to prevent the regular known failures covering up other issues. See #10616 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-04 16:00:05 +00:00
Steve Horsman	9e609dd34f	Merge pull request #10615 from kata-containers/topic/update-remove-artifact-filter workflows: Fix remove artifact name filter	2024-12-04 15:02:35 +00:00
Fabiano Fidêncio	531a29137e	Merge pull request #10607 from microsoft/danmihai1/less-logging runtime: skip logging some of the dial errors	2024-12-04 15:01:45 +01:00
stevenhorsman	14a3adf4d6	workflows: Fix remove artifact name filter - Fix copy-paste errors in artifact filters for arm64 and ppc64le - Remove the trailing wildcard filter that falsely ends up removing agent-ctl and replace with the tarball-suffix, which should exactly match the artifacts Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-04 13:34:42 +00:00
Alex Lyn	5f9cc86b5a	Merge pull request #10604 from 3u13r/euler/fix/genpolicy-rego-state-getter genpolicy: align state path getter and setter	2024-12-04 13:57:34 +08:00
Alex Lyn	c7064027f4	Merge pull request #10574 from BbolroC/add-ccw-subchannel-qemu-runtime-rs Add subchannel support to qemu-runtime-rs for s390x	2024-12-04 09:17:45 +08:00
Aurélien Bombo	57d893b5dc	Merge pull request #10563 from sprt/csi-deploy coco: ci: Fully implement compilation of CSI driver and require it for CoCo tests [2/x]	2024-12-03 18:58:14 -06:00
Aurélien Bombo	4aa7d4e358	ci: Require CSI driver for CoCo tests With the building/publishing step for the CSI driver validated, we can set that as a requirement for the CoCo tests. Depends on: #10561 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-12-03 14:43:36 -06:00
Aurélien Bombo	fe55b29ef0	csi-kata-directvolume: Remove go version check The driver build recipe has a script to check the current Go version against the go.mod version. However, the script is broken ($expected is unbound) and I don't believe we do this for other components. On top of this, Go should be backward-compatible. Let's keep things simple for now and we can evaluate restoring this script in the future if need be. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-12-03 14:43:36 -06:00
Aurélien Bombo	fb87bf221f	ci: Implement build step for CSI driver This fully implements the compilation step for csi-kata-directvolume. This component can now be built by the CI running: $ cd tools/packaging/kata-deploy/local-build $ make csi-kata-directvolume-tarball A couple notes: * When installing the binary, we rename it from directvolplugin to csi-kata-directvolume on the fly to make it more readable. * We add go to the tools builder Dockerfile to support building this tool. * I've noticed the file install_libseccomp.sh gets created by the build process so I've added it to a .gitignore. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-12-03 14:43:36 -06:00
Aurélien Bombo	0f6113a743	Merge pull request #10612 from kata-containers/sprt/fix-csi-publish2 ci: Fix Docker publishing for CSI driver, 2nd try	2024-12-03 14:43:28 -06:00
Aurélien Bombo	a23ceac913	ci: Fix Docker publishing for CSI driver, 2nd try Follow-up to #10609 as it seems GHA doesn't allow hard links: https://github.com/kata-containers/kata-containers/actions/runs/12144941404/job/33868901896?pr=10563#step:6:8 Note that I also updated the `needs` directive as we don't need the Kata payload container, just the tarball artifact. Part of: #10560 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-12-03 13:04:46 -06:00
Dan Mihai	2a67038836	Merge pull request #10608 from microsoft/saulparedes/policy_metadatata_uid policy: ignore optional metadata uid field	2024-12-03 10:19:12 -08:00
Dan Mihai	25e6f4b2a5	Merge pull request #10592 from microsoft/saulparedes/add_constants_to_rules policy: add constants to rules.rego	2024-12-03 10:17:10 -08:00
Aurélien Bombo	5e1fc5a63f	Merge pull request #10609 from kata-containers/sprt/fix-publish-csi ci: Fix Docker publishing for CSI driver	2024-12-03 11:21:55 -06:00
Hyounggyu Choi	8b998e5f0c	runtime-rs: Introduce get_devno_ccw() for deduplication The devno assignment logic is repeated in 5 different places during device addition. To improve code maintainability and readability, this commit introduces a standalone function, `get_devno_ccw()`, to handle the deduplication. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-12-03 15:35:03 +01:00
Leonard Cohnen	9b614a4615	genpolicy: align state path getter and setter Before this patch there was a mismatch between the JSON path under which the state of the rule evaluation is set in comparison to under which it is retrieved. This resulted in the behavior that each time the policy was evaluated, it thought it was the _first_ time the policy was evaluated. This also means that the consistency check for the `sandbox_name` was ineffective. Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2024-12-03 13:25:24 +01:00
Aurélien Bombo	85d3bcd713	ci: Fix Docker publishing for CSI driver The compilation succeeds, however Docker can't find the binary because we specify an absolute path. In Docker world, an absolute path is absolute to the Docker build context (here: src/tools/csi-kata-directvolume). To fix this, we link the binary into the build context, where the Dockerfile expects it. Failure mode: https://github.com/kata-containers/kata-containers/actions/runs/12068202642/job/33693101962?pr=10563#step:8:213 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-12-02 15:50:01 -06:00
Saul Paredes	711d12e5db	policy: support optional metadata uid field This prevents a deserialization error when uid is specified Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-12-02 11:24:58 -08:00
Dan Mihai	efd492d562	runtime: skip logging some of the dial errors With full debug logging enabled there might be around 1,500 redials so log just ~15 of these redials to avoid flooding the log. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-12-02 19:11:32 +00:00
Hyounggyu Choi	9c19d7674a	Merge pull request #10590 from zvonkok/fix-ci ci: Fix variant for confidential targets	2024-12-02 18:39:52 +01:00
Saul Paredes	9105c1fa0c	policy: add constants to rules.rego Reuse constants where applicable Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-12-02 08:28:58 -08:00
Hyounggyu Choi	6f4f94a9f0	Merge pull request #10595 from BbolroC/add-zvsi-devmapper-to-gatekeeper-required-jobs gatekeeper: add run-k8s-tests-on-zvsi(devmapper) to required jobs	2024-12-02 15:28:14 +01:00
Zvonko Kaiser	20442c0eae	ci: Fix variant for confidential targets The default initrd confidential target will have a variant=confidential; we need to accommodate this and make sure we also accommodate aaa-xxx-confidential targets. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-02 14:21:03 +00:00
stevenhorsman	b87b4b6756	metrics: Increase ranges range for qemu failing tests We've also seen the qemu metrics tests failing due to the results being slightly outside the max range for network-iperf3 parallel and the minimum for network-iperf3 jitter tests on PRs that have no code changes, so we've increased the bounds to not see false negatives. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-29 10:52:16 +00:00
stevenhorsman	4011071526	metrics: Increase minval range for failing tests We've seen a couple of instances recently where the metrics tests are failing due to the results being below the minimum value by ~2%. For tests like latency I'm not sure why values being too low would be an issue, but I've updated the minpercent range of the failing tests to try and get them passing. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-29 10:50:02 +00:00
Hyounggyu Choi	de3452f8e1	gatekeeper: add run-k8s-tests-on-zvsi(devmapper) to required jobs As the following CI job has been marked as required: - kata-containers-ci-on-push / run-k8s-tests-on-zvsi / run-k8s-tests (devmapper, qemu, kubeadm) we need to add it to the gatekeeper's required job list. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-11-28 12:46:47 +01:00
Fabiano Fidêncio	bdf10e651a	Merge pull request #10597 from kata-containers/topic/unbreak-ci-3rd-time-s-a-charm Unbreak the CI, 3rd attempt	2024-11-28 12:36:09 +01:00
Fabiano Fidêncio	92b8091f62	Revert "ci: unbreak: Reallow no-op builds" This reverts commit `559018554b`. As we've noticed that this is causing issues with initrd builds in the CI. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-28 12:02:40 +01:00
Fabiano Fidêncio	ca2098f828	build: Allow dummy builds (for when adding a new target) This will help us to simply allow a new dummy build whenever a new component is added. As long as the format `$(call DUMMY,$@)` is followed, we should be good to go without taking the risk of breaking the CI. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-28 11:13:24 +01:00
Fabiano Fidêncio	f9930971a2	Merge pull request #10594 from sprt/sprt/unbreak-ci-noop-build ci: unbreak: Reallow no-op builds	2024-11-28 07:38:25 +01:00
Aurélien Bombo	559018554b	ci: unbreak: Reallow no-op builds #9838 previously modified the static build so as not to repeatedly copy the same assets on each matrix iteration: https://github.com/kata-containers/kata-containers/pull/9838#issuecomment-2169299202 However, that implementation breaks specifying no-op/WIP build targets such as done in `e43c59a`. Such no-op builds have been a historical requirement of the project because of a GHA limitation. The breakage is due to no-op builds not generating a tar file corresponding to the asset: https://github.com/kata-containers/kata-containers/actions/runs/12059743390/job/33628926474?pr=10592 To address this breakage, we revert to the `cp -r` implementation and add the `--no-clobber` flag to still preserve the current behavior. Note that `-r` will also create the destination directory if it doesn't exist. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-27 18:40:29 -06:00
Fabiano Fidêncio	9699c7ed06	Merge pull request #10589 from kata-containers/sprt/fix-csi-publish gha: Unbreak CI and work around workflow limit	2024-11-27 23:52:55 +01:00
Aurélien Bombo	eac197d3b7	Merge pull request #10564 from microsoft/danmihai1/clh-endpoint-type runtime: clh: addNet() logging clean-up	2024-11-27 14:44:14 -06:00
Aurélien Bombo	7f659f3d63	gha: Unbreak CI and work around workflow limit #10561 inadvertently broke the CI by going over the limit of 20 reusable workflows: https://github.com/kata-containers/kata-containers/actions/runs/12054648658/workflow This commit fixes that by inlining the job. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-27 12:23:15 -06:00
Aurélien Bombo	16a91fccbe	Merge pull request #10561 from sprt/csi-driver-ci coco: ci: Lay groundwork for compiling and publishing CSI driver image [1/x]	2024-11-27 10:26:45 -06:00
Fabiano Fidêncio	175fe8bc66	Merge pull request #10585 from fidencio/topic/kata-deploy-use-drop-in-containerd-config-whenever-it-is-possible kata-deploy: Use drop-in files whenever it's possible	2024-11-27 16:36:18 +01:00
Steve Horsman	6bb00d9a1d	Merge pull request #10583 from squarti/agent-startup-cdh-client agent: fix startup when guest_components_procs is set to none	2024-11-27 11:43:07 +00:00
Fabiano Fidêncio	500508a592	kata-deploy: Use drop-in files whenever it's possible This will make our lives considerably easier when it comes to cleaning up content added, while it's also a groundwork needed for having multiple installations running in parallel. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-27 12:27:08 +01:00
Steve Horsman	3240f8a4b8	Merge pull request #10586 from stevenhorsman/delete-rootfs-binary-assets-after-rootfs-build workflows: Remove rootfs binary artifacts	2024-11-27 10:03:20 +00:00
Fabiano Fidêncio	c472fe1924	Merge pull request #10584 from fidencio/topic/kata-deploy-prepare-for-containerd-config-version-3 kata-deploy: Support containerd configuration version 3	2024-11-26 18:44:56 +01:00
stevenhorsman	3e5d360185	workflows: Remove rootfs binary artifacts We need to publish certain artifacts for the rootfs, like the agent, guest-components, pause bundle etc., as they are consumed in the `build-asset-rootfs` step. However, after this point they aren't needed and probably shouldn't be included in the overall kata tarball, so delete them once they aren't needed any more to avoid them being included. Fixes: #10575 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-26 15:24:20 +00:00
Fabiano Fidêncio	6f70ab9169	kata-deploy: Adapt how the containerd version is checked for k0s Let's actually mount the whole /etc/k0s as /etc/containerd, so we can easily access the containerd configuration file which has the version in it, allowing us to parse it instead of just making a guess based on kubernetes distro being used. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-26 16:15:11 +01:00
Silenio Quarti	1230bc77f2	agent: fix startup when guest_components_procs is set to none This PR ensures that OCICRYPT_CONFIG_PATH file is initialized only when CDH socket exists. This prevents startup error if attestation binaries are not installed in PodVM. Fixes: https://github.com/kata-containers/kata-containers/issues/10568 Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-11-26 09:57:04 -05:00
Fabiano Fidêncio	f5a9aaa100	kata-deploy: Support containerd config version 3 On Ubuntu 24.04, with the distro default containerd, we're already getting: ``` $ containerd config default \| grep "version = " version = 3 ``` With that in mind, let's make sure that we're ready to support this from the next release. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-26 14:01:50 +01:00
Fupan Li	28166c8a32	Merge pull request #10577 from Apokleos/fix-vfiodev-name runtime-rs: fix vfio device name combination issue	2024-11-26 09:35:45 +08:00
Dan Mihai	d93900c128	Merge pull request #10543 from microsoft/danmihai1/regorus-warning genpolicy: avoid regorus warning	2024-11-25 16:47:33 -08:00
Zvonko Kaiser	1b10e82559	Merge pull request #10516 from zvonkok/kata-agent-cdi ci: Fix error on self-hosted machines	2024-11-25 18:49:37 -05:00
Ryan Savino	e46d24184a	Merge pull request #10386 from kimullaa/fix-build-error-when-using-sev-snp docs: Fix several build failures when I tried the procedures in "Kata Containers with AMD SEV-SNP VMs"	2024-11-25 16:58:52 -06:00
Dan Mihai	f340b31c41	genpolicy: avoid regorus warning Avoid adding to the Guest console warnings about "agent_policy:10:8". "import input" is unnecessary. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-11-25 21:19:01 +00:00
Zvonko Kaiser	c3d1b3c5e3	Merge pull request #10464 from zvonkok/nvidia-gpu-rootfs gpu: NVIDIA GPU initrd/image build	2024-11-25 16:16:42 -05:00
Fabiano Fidêncio	8763a9bc90	Merge pull request #10520 from fidencio/topic/drop-clear-linux-rootfs osbuilder: Drop Clear Linux	2024-11-25 21:16:03 +01:00
Dan Mihai	78cbf33f1d	runtime: clh: addNet() logging clean-up Avoid logging the same endpoint fields twice from addNet(). Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-11-25 19:58:54 +00:00
alex.lyn	5dba680afb	runtime-rs: fix vfio device name combination issue Fixes #10576 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-11-25 14:01:43 +08:00
Hyounggyu Choi	48e2df53f7	runtime-rs: Add devno to DeviceVirtioScsi A new attribute named `devno` is added to DeviceVirtioScsi. It will be used to specify a device number for a CCW bus type. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-11-23 13:45:36 +01:00
Hyounggyu Choi	2cc48f7822	runtime-rs: Add devno to DeviceVhostUserFs A new attribute named `devno` is added to DeviceVhostUserFs. It will be used to specify a device number for a CCW bus type. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-11-23 13:45:36 +01:00
Hyounggyu Choi	920484918c	runtime-rs: Add devno to VhostVsock A new attribute named `devno` is added to VhostVsock. It will be used to specify a device number for a CCW bus type. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-11-23 13:45:36 +01:00
Hyounggyu Choi	9486790089	runtime-rs: Add devno to DeviceVirtioSerial A new attribute named `devno` is added to DeviceVirtioSerial. It will be used to specify a device number for a CCW bus type. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-11-23 13:45:36 +01:00
Hyounggyu Choi	516daecc50	runtime-rs: Add devno to DeviceVirtioBlk A new attribute named `devno` is added to DeviceVirtioBlk. It will be used to specify a device number for a CCW bus type. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-11-23 13:45:36 +01:00
Hyounggyu Choi	30a64092a7	runtime-rs: Add CcwSubChannel to provide devno for CCW devices To explicitly specify a device number on the QEMU command line for the following devices using the CCW transport on s390x: - SerialDevice - BlockDevice - VhostUserDevice - SCSIController - VSOCKDevice this commit introduces a new structure CcwSubChannel and implements the following methods: - add_device() - remove_device() - address_format_ccw() - set_addr() You can see the detailed explanation for each method in the comment. This resolves the 1st part of #10573. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-11-23 13:45:36 +01:00
Steve Horsman	322073bea1	Merge pull request #10447 from ldoktor/required-jobs ci: Required jobs	2024-11-22 09:15:11 +00:00
Lukáš Doktor	e69635b376	ci.gatekeeper: Remove unused variable this is a left-over from previous way of iterating over jobs. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-11-22 09:27:11 +01:00
Lukáš Doktor	fa7bca4179	ci.gatekeeper: Print the older job id Let's also print the existing result's id when printing the information about ignoring older result id to simplify debugging. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-11-22 09:27:11 +01:00
Lukáš Doktor	6c19a067a0	ci.gatekeeper: Update existing results The matching run_id means we're dealing with the same job but with updated results and not with an older job. Update the results in such case. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-11-22 09:27:09 +01:00
Aurélien Bombo	5e4990bcf5	coco: ci: Add no-op steps to deploy CSI driver This adds no-op steps that'll be used to deploy and clean up the CSI driver used for testing. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-21 16:08:06 -06:00
Aurélien Bombo	893f6a4ca0	ci: Introduce job to publish CSI driver image This adds a new job to build and publish the CSI driver Docker image. Of course this job will fail after we merge this PR because the CSI driver compilation job hasn't been implemented yet. However that will be implemented directly after in #10561. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-21 16:07:59 -06:00
Aurélien Bombo	e43c59a2c6	ci: Add no-op step to compile CSI driver This adds a no-op build step to compile the CSI driver. The actual compilation will be implemented in an ulterior PR, so as to ensure we don't break the CI. Addresses: #10560 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-21 16:06:55 -06:00
Zvonko Kaiser	0debf77770	gpu: NVIDIA gpu initrd/image build With each release make sure we ship a GPU enabled rootfs/initrd Fixes: #6554 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-21 18:57:23 +00:00
Steve Horsman	b4da4b5e3b	Merge pull request #10377 from coolljt0725/fix_build osbuilder: Fix build dependency of ubuntu rootfs with Docker	2024-11-21 08:45:59 +00:00
Jitang Lei	ed4c727c12	osbuilder: Fix build dependency of ubuntu rootfs with Docker Build ubuntu rootfs with Docker failed with error: `Unable to find libclang` Fix this error by adding libclang-dev to the dependency. Signed-off-by: Jitang Lei <leijitang@outlook.com>	2024-11-21 10:49:27 +08:00
Zvonko Kaiser	e9f36f8187	ci: Fixing simple typo change evn to env Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-20 18:40:14 +00:00
Zvonko Kaiser	a5733877a4	ci: Fix error on self-hosted machines We need to clean-up any created files/dirs otherwise we cause problems on self-hosted runners. Using tempdir which will be removed automatically. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-20 18:40:13 +00:00
Lukáš Doktor	62e8815a5a	ci: Add documentation to cover mapping format to help people with adding new entries. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-11-20 17:25:59 +01:00
Lukáš Doktor	64306dc888	ci: Set required-tests according to GH required tests this should record the current list of required tests from GH. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-11-20 17:25:57 +01:00
Steve Horsman	358ebf5134	Merge pull request #10558 from AdithyaKrishnan/main ci: Re-enable SNP CI	2024-11-20 10:27:41 +00:00
Steve Horsman	30bad4ee43	Merge pull request #10562 from stevenhorsman/remove-release-artifactor-skips workflows: Remove skipping of artifact uploads	2024-11-20 08:45:37 +00:00
Adithya Krishnan Kannan	2242aee099	ci: Skip the failing tests in SNP Per [Issue#10549](https://github.com/kata-containers/kata-containers/issues/10549), the following tests are failing on SNP. 1. k8s-guest-pull-image-encrypted.bats 2. k8s-guest-pull-image-authenticated.bats 3. k8s-guest-pull-image-signature.bats 4. k8s-confidential-attestation.bats Per @fidencio 's comment on [PR#10558](https://github.com/kata-containers/kata-containers/pull/10558), I am skipping the same. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>	2024-11-19 10:41:43 -06:00
stevenhorsman	da5f6b77c7	workflows: Remove skipping of artifact uploads Now we are downloading artifacts to create the rootfs we need to ensure they are uploaded always, even on releases Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-19 13:28:02 +00:00
Steve Horsman	817438d1f6	Merge pull request #10552 from stevenhorsman/3.11.0-release release: Bump version to 3.11.0	2024-11-19 09:44:35 +00:00
Saul Paredes	eab48c9884	Merge pull request #10545 from microsoft/cameronbaird/sync-clh-logging runtime: fix comment to accurately reflect clh behavior	2024-11-18 11:25:58 -08:00
Adithya Krishnan Kannan	ef367d81f2	ci: Re-enable SNP CI We've debugged the SNP Node and we wish to test the fixes on GHA. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>	2024-11-18 11:11:27 -06:00
stevenhorsman	7a8ba14959	release: Bump version to 3.11.0 Bump `VERSION` and helm-chart versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-18 11:13:15 +00:00
Steve Horsman	0ce3f5fc6f	Merge pull request #10514 from squarti/pause_command agent: overwrite OCI process spec when overwriting pause image	2024-11-15 18:03:58 +00:00
Fabiano Fidêncio	92f7526550	Merge pull request #10542 from Crypt0s/topic/enable-CONFIG_KEYS kernel: add CONFIG_KEYS=y to enable kernel keyring	2024-11-15 12:15:25 +01:00
Crypt0s	563a6887e2	kernel: add CONFIG_KEYS=y to enable kernel keyring KinD checks for the presence of this (and other) kernel configuration via scripts like https://blog.hypriot.com/post/verify-kernel-container-compatibility/ or attempts to directly use /proc/sys/kernel/keys/ without checking to see if it exists, causing an exit when it does not see it. Docker/its consumers apparently expect to be able to use the kernel keyring and its associated syscalls from/for containers. There aren't any known downsides to enabling this except that it would by definition enable additional syscalls defined in https://man7.org/linux/man-pages/man7/keyrings.7.html which are reachable from userspace. This minimally increases the attack surface of the Kata Kernel, but this attack surface is minimal (especially since the kernel is most likely being executed by some kind of hypervisor) and highly restricted compared to the utility of enabling this feature to get further containerization compatibility. Signed-off-by: Crypt0s <BryanHalf@gmail.com>	2024-11-15 09:30:06 +01:00
Shunsuke Kimura	706e8bce89	docs: change from OVMF.fd to AmdSev.fd change the build method to generate OVMF for AmdSev. This commit adds `ovmf_build=sev` env parameter. <`638c2c4164`> Fixes #10378 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2024-11-15 11:24:45 +09:00
Shunsuke Kimura	d7f6fabe65	docs: fix build-kernel.sh option `build-kernel.sh` no longer takes an argument for the -x option. <`6c3338271b`> Fixes #10378 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2024-11-15 11:24:45 +09:00
Cameron Baird	65881ceb8a	runtime: fix comment to accurately reflect clh behavior Fix the CLH log levels description Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2024-11-14 23:16:11 +00:00
Silenio Quarti	42b6203493	agent: overwrite OCI process spec when overwriting pause image The PR replaces the OCI process spec of the pause container with the spec of the guest provided pause bundle. Fixes: https://github.com/kata-containers/kata-containers/issues/10537 Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-11-14 13:05:16 -05:00
Fabiano Fidêncio	6a9266124b	Merge pull request #10501 from kata-containers/topic/ci-split-tests ci: tdx: Split jobs to run in 2 different machines	2024-11-14 17:24:50 +01:00
Fabiano Fidêncio	9b3fe0c747	ci: tdx: Adjust workflows to use different machines This will be helpful in order to increase the OS coverage (we'll be using both Ubuntu 24.04 and CentOS 9 Stream), while also reducing the amount spent on the tests (as one machine will only run attestation related tests, and the other the tests that do not require attestation). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-14 15:52:00 +01:00
Fabiano Fidêncio	9b1a5f2ac2	tests: Add a way to run only tests which rely on attestation We're doing this as, at Intel, we have two different kind of machines we can plug into our CI. Without going much into details, only one of those two kinds of machines will work for the attestation tests we perform with ITA, thus in order to speed up the CI and improve test coverage (OS wise), we're going to run different tests in different machines. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-14 15:51:57 +01:00
Steve Horsman	915695f5ef	Merge pull request #9407 from mrIncompetent/root-fs-clang rootfs: Install missing clang in Ubuntu docker image	2024-11-14 10:35:06 +00:00
Henrik Schmidt	57a4dbedeb	rootfs: Install missing libclang-dev in Ubuntu docker image Fixes #9444 Signed-off-by: Henrik Schmidt <mrIncompetent@users.noreply.github.com>	2024-11-14 08:48:24 +00:00
Hyounggyu Choi	5869046d04	Merge pull request #9195 from UiPath/fix/vcpus-for-static-mgmt runtime: Set maxvcpus equal to vcpus for the static resources case	2024-11-14 09:38:20 +01:00
Dan Mihai	d9977b3e75	Merge pull request #10431 from microsoft/saulparedes/add-policy-state genpolicy: add state to policy	2024-11-13 11:48:46 -08:00
Aurélien Bombo	7bc2fe90f9	Merge pull request #10521 from ncppd/osbuilder-cleanup osbuilder: remove redundant env variable	2024-11-13 12:17:09 -06:00
Steve Horsman	a947d2bc40	Merge pull request #10539 from AdithyaKrishnan/main ci: Temporarily skip SNP CI	2024-11-13 17:58:32 +00:00
Adithya Krishnan Kannan	439a1336b5	ci: Temporarily skip SNP CI As discussed in the CI working group, we are temporarily skipping the SNP CI to unblock the remaining workflow. Will revert after fixing the SNP runner. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>	2024-11-13 11:44:16 -06:00
Fabiano Fidêncio	02d4c3efbf	Merge pull request #10519 from fidencio/topic/relax-restriction-for-qemu-tdx Reapply "runtime: confidential: Do not set the max_vcpu to cpu"	2024-11-13 16:09:06 +01:00
Saul Paredes	c207312260	genpolicy: validate container sandbox names Make sure all container sandbox names match the sandbox name of the first container. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-11-12 15:17:01 -08:00
Saul Paredes	52d1aea1f7	genpolicy: Add state Use regorus engine's add_data method to add state to the policy. This data can later be accessed inside rego context through the data namespace. Support state modifications (json-patches) that may be returned as a result from policy evaluation. Also initialize a policy engine data slice "pstate" dedicated for storing state. Fixes #10087 Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-11-12 15:16:53 -08:00
Alexandru Matei	e83f8f8a04	runtime: Set maxvcpus equal to vcpus for the static resources case Fixes: #9194 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2024-11-12 16:36:42 +02:00
GabyCT	06fe459e52	Merge pull request #10508 from GabyCT/topic/installartsta gha: Get artifacts when installing kata tools in stability workflow	2024-11-11 15:59:06 -06:00
Nikos Ch. Papadopoulos	ab80cf8f48	osbuilder: remove redundant env variable Remove second declaration of GO_HOME in rootfs-build ubuntu script. Signed-off-by: Nikos Ch. Papadopoulos <ncpapad@cslab.ece.ntua.gr>	2024-11-11 19:49:28 +02:00
Fabiano Fidêncio	780b36f477	osbuilder: Drop Clear Linux The Clear Linux rootfs is not being tested anywhere, and it seems Intel doesn't have the capacity to review the PRs related to this (combined with the lack of interest from the rest of the community on reviewing PRs that are specific to this untested rootfs). With this in mind, I'm suggesting we drop Clear Linux support and focus on what we can actually maintain. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-11 15:22:55 +01:00
Fabiano Fidêncio	5618180e63	Merge pull request #10515 from kata-containers/sprt/ubuntu-latest-fix gha: Hardcode ubuntu-22.04 instead of latest	2024-11-10 09:54:39 +01:00
Fabiano Fidêncio	2281342fb8	Merge pull request #10513 from fidencio/topic/ci-adjust-proxy-nightmare-for-tdx ci: tdx: kbs: Ensure https_proxy is taken in consideration	2024-11-10 00:17:10 +01:00
Fabiano Fidêncio	0d8c4ce251	Merge pull request #10517 from microsoft/saulparedes/remove_manifest_v1_test tests: remove manifest v1 test	2024-11-09 23:40:51 +01:00
Fabiano Fidêncio	56812c852f	Reapply "runtime: confidential: Do not set the max_vcpu to cpu" This reverts commit `f15e16b692`, as we don't have to do this since we're relying on the `static_sandbox_resource_mgmt` feature, which gives us the correct amount of memory and CPUs to be allocated. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-09 23:20:17 +01:00
Saul Paredes	461efc0dd5	tests: remove manifest v1 test This test was meant to show support for pulling images with v1 manifest schema versions. The nginxhttps image has been modified in https://hub.docker.com/r/ymqytw/nginxhttps/tags such that we are no longer able to pull it: $ docker pull ymqytw/nginxhttps:1.5 Error response from daemon: missing signature key We may remove this test since schema version 1 manifests are deprecated per https://docs.docker.com/engine/deprecated/#pushing-and-pulling-with-image-manifest-v2-schema-1 : "These legacy formats should no longer be used, and users are recommended to update images to use current formats, or to upgrade to more current images". This schema version was used by old docker versions. Further OCI spec https://github.com/opencontainers/image-spec/blob/main/manifest.md#image-manifest-property-descriptions only supports schema version 2. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-11-08 13:38:51 -08:00
Aurélien Bombo	19e972151f	gha: Hardcode ubuntu-22.04 instead of latest GHA is migrating ubuntu-latest to Ubuntu 24 so let's hardcode the current 22.04 LTS. https://github.blog/changelog/2024-11-05-notice-of-breaking-changes-for-github-actions/ Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-08 11:00:15 -06:00
Greg Kurz	2bd8fde44a	Merge pull request #10511 from ldoktor/fedora-python ci.ocp: Use the official python:3 container for sanity	2024-11-08 16:31:40 +01:00
Fabiano Fidêncio	baf88bb72d	ci: tdx: kbs: Ensure https_proxy is taken in consideration Trustee's deployment must set the correct https_proxy as env var on the container that will talk to the ITA / ITTS server, otherwise the kbs service won't be able to start, causing then issues in our CI. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: Krzysztof Sandowicz <krzysztof.sandowicz@intel.com>	2024-11-08 16:06:16 +01:00
Steve Horsman	1f728eb906	Merge pull request #10498 from stevenhorsman/update-create-container-timeout-log tests: k8s: Update image pull timeout error	2024-11-08 10:47:39 +00:00
Steve Horsman	6112bf85c3	Merge pull request #10506 from stevenhorsman/skip-runk-ci workflow: Remove/skip runk CI	2024-11-08 09:54:06 +00:00
Steve Horsman	a5acbc9e80	Merge pull request #10505 from stevenhorsman/remove-stratovirt-metrics-tests metrics: Skip metrics on stratovirt	2024-11-08 08:53:05 +00:00
Lukáš Doktor	2f7d34417a	ci.ocp: Use the official python:3 container for sanity Fedora F40 removed python3 from the base container, to avoid such issues let's rely on the latest and greatest official python container. Fixes: #10497 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-11-08 07:16:30 +01:00
Zvonko Kaiser	183bd2aeed	Merge pull request #9584 from zvonkok/kata-agent-cdi kata-agent: Add CDI support	2024-11-07 14:18:32 -05:00
Zvonko Kaiser	aa2e1a57bd	agent: Added test-case for handle_cdi_devices We are generating a simple CDI spec with device and global containerEdits to test the CDI crate. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-07 17:03:18 +00:00
Gabriela Cervantes	4274198664	gha: Get artifacts when installing kata tools in stability workflow This PR adds the get artifacts which are needed when installing kata tools in stability workflow to avoid failures saying that artifacts are missing. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-11-07 16:20:41 +00:00
stevenhorsman	a5f1a5a0ee	workflow: Remove/skip runk CI As discussed in the AC meeting, we don't have a maintainer, (or users?) of runk, and the CI is unstable, so given we can't support it, we shouldn't waste CI cycles on it. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-07 14:16:30 +00:00
stevenhorsman	0efe9f4e76	metrics: Skip metrics on stratovirt As discussed on the AC call, we are lacking maintainers for the metrics tests. As a starting point for potentially phasing them out, we discussed starting with removing the test for stratovirt as a non-core hypervisor and a job that is problematic in leaving behind resources that need cleaning up. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-07 14:06:57 +00:00
Fabiano Fidêncio	c332e953f9	Merge pull request #10500 from squarti/fix-10499 runtime: Files are not synced between host and guest VMs	2024-11-07 08:28:53 +01:00
Silenio Quarti	be3ea2675c	runtime: Files are not synced between host and guest VMs This PR makes the root dir absolute after resolving the default root dir symlink. Fixes: https://github.com/kata-containers/kata-containers/issues/10499 Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-11-06 17:31:12 -05:00
GabyCT	47cea6f3c6	Merge pull request #10493 from GabyCT/topic/katatoolsta gha: Add install kata tools as part of the stability workflow	2024-11-06 14:16:48 -06:00
Gabriela Cervantes	13e27331ef	gha: Add install kata tools as part of the stability workflow This PR adds the install kata tools step as part of the k8s stability workflow, to avoid failures saying that certain kata components are not installed. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-11-06 20:07:06 +00:00
Fabiano Fidêncio	71c4c2a514	Merge pull request #10486 from kata-containers/topic/enable-AUTO_GENERATE_POLICY-for-qemu-coco-dev workflows: Use AUTO_GENERATE_POLICY for qemu-coco-dev	2024-11-06 21:04:45 +01:00
Zvonko Kaiser	3995fe71f9	kata-agent: Add CDI support For proper device handling add CDI support Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-06 17:50:20 +00:00
stevenhorsman	85554257f8	tests: k8s: Update image pull timeout error Currently the error we are checking for is `CreateContainerRequest timed out`, but this message doesn't always seem to be printed to our pod log. Try using a more general message that should be present more reliably. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-06 17:00:26 +00:00
Fabiano Fidêncio	a3c72e59b1	Merge pull request #10495 from littlejawa/ci/skip_nginx_connectivity_for_crio ci: skip nginx connectivity test with qemu/crio	2024-11-06 13:43:19 +01:00
Julien Ropé	da5e0c3f53	ci: skip nginx connectivity test with crio We have an error with service name resolution with this test when using crio. This error could not be reproduced outside of the CI for now. Skipping it to keep the CI job running until we find a solution. See: #10414 Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-11-06 12:07:02 +01:00
Greg Kurz	5af614b1a4	Merge pull request #10496 from littlejawa/ci/expose_container_runtime ci: export CONTAINER_RUNTIME to the test scripts	2024-11-06 12:05:36 +01:00
Julien Ropé	6d0cb1e9a8	ci: export CONTAINER_RUNTIME to the test scripts This variable will allow tests to adapt their behaviour to the runtime (containerd/crio). Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-11-06 11:29:11 +01:00
Fabiano Fidêncio	72979d7f30	workflows: Use AUTO_GENERATE_POLICY for qemu-coco-dev By the moment we're testing it also with qemu-coco-dev, it becomes easier for a developer without access to TEE to also test it locally. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-06 10:47:08 +01:00
Fabiano Fidêncio	7d3f2f7200	runtime: Match TEEs for the static_sandbox_resource_mgmt option The qemu-coco-dev runtime class should be as close as possible to what the TEEs runtime classes are doing, and this was one of the options that ended up overlooked till now. Shout out to Dan Mihai for noticing that! Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-06 10:47:08 +01:00
Fabiano Fidêncio	ea8114833c	Merge pull request #10491 from fidencio/topic/fix-typo-in-the-ephemeral-handler agent: fix typo on getting EphemeralHandler size option	2024-11-06 10:31:48 +01:00
Fabiano Fidêncio	7e6779f3ad	Merge pull request #10488 from fidencio/topic/teach-our-machinery-to-deal-with-rc-kernels build: kernel: Teach our machinery to deal with -rc kernels	2024-11-05 16:19:57 +01:00
Zvonko Kaiser	a4725034b2	Merge pull request #9480 from zvonkok/build-image-suffix image: Add suffix to image or initrd depending on the NVIDIA driver version	2024-11-05 09:43:56 -05:00
Fabiano Fidêncio	77c87a0990	agent: fix typo on getting EphemeralHandler size option Most likely this was overlooked during the development / review, but we're actually interested in the size rather than in the pagesize of the hugepages. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 15:15:17 +01:00
Fabiano Fidêncio	2b16160ff1	versions: kernel-dragonball: Fix URL SSIA Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:55:34 +01:00
Fabiano Fidêncio	f7b31ccd6c	kernel: bump kata_config_version Due to the changes done in the previous commits. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:26:57 +01:00
Fabiano Fidêncio	a52ea32b05	build: kernel: Learn how to deal with release candidates So far we were not prepared to deal with release candidates as those: * Do not have a sha256sum in the sha256sums provided by the kernel cdn * Come from a different URL (directly from Linus) * Have a different suffix (.tar.gz, instead of .tar.xz) Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:26:02 +01:00
Fabiano Fidêncio	9f2d4b2956	build: kernel: Always pass the url to the builder This doesn't change much on how we're doing things today, but it simplifies a lot of cases that may be added later on (and will be) like building -rc kernels. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:26:02 +01:00
Fabiano Fidêncio	ee1a17cffc	build: kernel: Take kernel_url into consideration Let's make sure the kernel_url is actually used whenever it's passed to the function. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:26:02 +01:00
Fabiano Fidêncio	9a0b501042	build: kernel: Remove tee specific function As, thankfully, we're relying on upstream kernels for TEEs. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:26:02 +01:00
Fabiano Fidêncio	cc4006297a	build: kernel: Pass the yaml base path instead of the version path By doing this we can ensure this can be re-used, if needed (and it'll be needed), for also getting the URL. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:26:02 +01:00
Fabiano Fidêncio	7057ff1cd5	build: kernel: Always pass -f to the kernel builder -f forces the (re)generation of the config when doing the setup, which helps a lot on local development whilst not causing any harm in the CI builds. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:26:02 +01:00
Fabiano Fidêncio	910defc4cf	Merge pull request #10490 from fidencio/topic/fix-ovmf-build builds: ovmf: Workaround Zeex repo becoming private	2024-11-05 12:25:00 +01:00
Fabiano Fidêncio	aff3d98ddd	builds: ovmf: Workaround Zeex repo becoming private Let's just do a simple `sed` and not use the repo that became private. This is not a backport of https://github.com/tianocore/edk2/pull/6402, but it's a similar approach that allows us to proceed without the need to pick up a newer version of edk2. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 11:25:54 +01:00
Dan Mihai	03bf4433d7	Merge pull request #10459 from stevenhorsman/update-bats tests: k8s: Update bats	2024-11-04 12:26:58 -08:00
Aurélien Bombo	f639d3e87c	Merge pull request #10395 from Sumynwa/sumsharma/create_container agent-ctl: Add support to test kata-agent's container creation APIs.	2024-11-04 14:09:12 -06:00
GabyCT	7f066be04e	Merge pull request #10485 from GabyCT/topic/fixghast gha: Fix source for gha stability run script	2024-11-04 12:09:28 -06:00
Steve Horsman	a2b9527be3	Merge pull request #10481 from mkulke/mkulke/init-cdh-client-on-gcprocs-none agent: perform attestation init w/o process launch	2024-11-04 17:27:45 +00:00
Gabriela Cervantes	fd4d0dd1ce	gha: Fix source for gha stability run script This PR fixes the source to avoid duplication, especially in the common.sh script, and to avoid failures saying that a certain script is not in the directory. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-11-04 16:16:13 +00:00
Magnus Kulke	bf769851f8	agent: perform attestation init w/o process launch This change is motivated by a problem in peerpod's podvms. In this setup the lifecycle of guest components is managed by systemd. The current code skips over init steps like setting the ocicrypt-rs env and initialization of a CDH client in this case. To address this the launch of the processes has been isolated into its own fn. Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>	2024-11-04 13:31:07 +01:00
Steve Horsman	4fd9df84e4	Merge pull request #10482 from GabyCT/topic/fixvirtdoc docs: Update virtualization document	2024-11-04 11:51:09 +00:00
stevenhorsman	175ebfec7c	Revert "k8s:kbs: Add trap statement to clean up tmp files" This reverts commit `973b8a1d8f`. As @danmihai1 points out https://github.com/bats-core/bats-core/issues/364 states that using traps in bats is error prone, so this could be the cause of the confidential test instability we've been seeing, like it was in the static checks, so let's try and revert this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-04 09:59:37 +00:00
stevenhorsman	75cb1f46b8	tests/k8s: Add skip if setup_common fails At @danmihai1's suggestion add a die message in case the call to setup_common fails, so we can see it in the test output. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-04 09:59:33 +00:00
stevenhorsman	3f5bf9828b	tests: k8s: Update bats We've seen some issues with tests not being run in some of the Coco CI jobs (Issue #10451) and in the environments that are more stable we noticed that they had a newer version of bats installed. Try updating the version to 1.10+ and print out the version for debug purposes Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-04 09:59:33 +00:00
Steve Horsman	06d2cc7239	Merge pull request #10453 from bpradipt/remote-annotation runtime: Add GPU annotations for remote hypervisor	2024-11-04 09:10:06 +00:00
Zvonko Kaiser	3781526c94	gpu: Add VARIANT to the initrd and image build We need to know if we're building a nvidia initrd or image Additionally if we build a regular or confidential VARIANT Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-01 18:34:13 +00:00
Zvonko Kaiser	95b69c5732	build: initrd make it coherent to the image build Add -f for moving the initrd to the correct file path Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-01 18:34:13 +00:00
Zvonko Kaiser	3c29c1707d	image: Add suffix to image or initrd depending on the NVIDIA driver version Fixes: #9478 We want to keep track of the driver versions build during initrd/image build so update the artifact_name after the fact. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-01 18:34:13 +00:00
Sumedh Alok Sharma	4b7aba5c57	agent-ctl: Add support to test kata-agent's container creation APIs. This commit introduces changes to enable testing kata-agent's container APIs of CreateContainer/StartContainer/RemoveContainer. The changeset includes: - using confidential-containers image-rs crate to pull/unpack/mount a container image. Currently supports only unauthenticated registry pull - re-factor api handlers to reduce cmdline complexity and handle request generation logic in tool - introduce an OCI config template for container creation - add test case Fixes #9707 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-11-01 22:18:54 +05:30
Fabiano Fidêncio	2efcb442f4	Merge pull request #10442 from Sumynwa/sumsharma/tools_use_ubuntu_static_build ci: Use ubuntu for static building of kata tools.	2024-11-01 16:04:31 +01:00
Gabriela Cervantes	1ca83f9d41	docs: Update virtualization document This PR updates the virtualization document by removing a url link which is no longer valid. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-31 17:28:02 +00:00
GabyCT	a3d594d526	Merge pull request #10480 from GabyCT/topic/fixstabilityrun gha: Add missing steps in Kata stability workflow	2024-10-31 09:57:33 -06:00
Fabiano Fidêncio	e058b92350	Merge pull request #10425 from burgerdev/darwin genpolicy: support darwin target	2024-10-31 12:16:44 +01:00
Markus Rudy	df5e6e65b5	protocols: only build RLimit impls on Linux The current version of the oci-spec crate compiles RLimit structs only for Linux and Solaris. Until this is fixed upstream, add compilation conditions to the type converters for the affected structs. Fixes: #10071 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-10-31 09:50:36 +01:00
Markus Rudy	091a410b96	kata-sys-util: move json parsing to protocols crate The parse_json_string function is specific to parsing capability strings out of ttRPC proto definitions and does not benefit from being available to other crates. Moving it into the protocols crate allows removing kata-sys-util as a dependency, which in turn enables compiling the library on darwin. Fixes: #10071 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-10-31 09:41:07 +01:00
Markus Rudy	8ab4bd2bfc	kata-sys-util: remove obsolete cgroups dependency The cgroups.rs source file was removed in `234d7bca04`. With cgroups support handled in runtime-rs, the cgroups dependency on kata-sys-util can be removed. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-10-31 09:41:07 +01:00
Sumedh Alok Sharma	0adf7a66c3	ci: Use ubuntu for static building of kata tools. This commit introduces changes to use ubuntu for statically building kata tools. In the existing CI setup, the tools currently build only for x86_64 architecture. It also fixes the build error seen for agent-ctl PR#10395. Fixes #10441 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-10-31 13:19:18 +05:30
Gabriela Cervantes	c4089df9d2	gha: Add missing steps in Kata stability workflow This PR adds missing steps in the gha run script for the kata stability workflow. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-30 19:13:15 +00:00
Xuewei Niu	1a216fecdf	Merge pull request #10225 from Chasing1020/main runtime-rs: Add basic boilerplate for remote hypervisor	2024-10-30 17:02:50 +08:00
Hyounggyu Choi	dca69296ae	Merge pull request #10476 from BbolroC/switch-to-kubeadm-s390x gha: Switch KUBERNETES from k3s to kubeadm on s390x	2024-10-30 09:52:06 +01:00
GabyCT	9293931414	Merge pull request #10474 from GabyCT/topic/removeunvarb packaging: Remove kernel config repo variable as it is unused	2024-10-29 12:52:07 -06:00
Gabriela Cervantes	69ee287e50	packaging: Remove kernel config repo variable as it is unused This PR removes the kernel config repo variable at the build kernel script as it is not used. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-29 17:09:52 +00:00
GabyCT	8539cd361a	Merge pull request #10462 from GabyCT/topic/increstress tests: Increase time to run stressng k8s tests	2024-10-29 11:08:47 -06:00
Chasing1020	425f6ad4e6	runtime-rs: add oci spec for prepare_vm method The cloud-api-adaptor needs to support different types of pod VM instances. We need to pass some annotations like machine_type, default_vcpus and default_memory to prepare the VMs. Signed-off-by: Chasing1020 <643601464@qq.com>	2024-10-30 01:01:28 +08:00
Chasing1020	f1167645f3	runtime-rs: support for remote hypervisors type This patch adds the support of the remote hypervisor type for runtime-rs. The cloud-api-adaptor needs the annotations and network namespace path to create the VMs. The remote hypervisor opens a UNIX domain socket specified in the config file, and sends ttrpc requests to an external process to control sandbox VMs. Fixes: #10350 Signed-off-by: Chasing1020 <643601464@qq.com>	2024-10-30 00:54:17 +08:00
Pradipta Banerjee	6f1ba007ed	runtime: Add GPU annotations for remote hypervisor Add GPU annotations for remote hypervisor to help with the right instance selection based on number of GPUs and model Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>	2024-10-29 10:28:21 -04:00
Steve Horsman	68225b53ca	Merge pull request #10475 from stevenhorsman/revert-10452 Revert "tests: Add trap statement in kata doc script"	2024-10-29 13:58:00 +00:00
Hyounggyu Choi	aeef28eec2	gha: Switch to kubeadm for run-k8s-tests-on-zvsi Last November, SUSE discontinued support for s390x, leaving k3s on this platform stuck at k8s version 1.28, while upstream k8s has since reached 1.31. Fortunately, kubeadm allows us to create a 1.30 Kubernetes cluster on s390x. This commit switches the KUBERNETES option from k3s to kubeadm for s390x and removes a dedicated cluster creation step. Now, cluster setup and teardown occur in ACTIONS_RUNNER_HOOK_JOB_{STARTED,COMPLETED}. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-10-29 14:27:32 +01:00
Hyounggyu Choi	238f67005f	tests: Add `kubeadm` option for KUBERNETES in gha-run.sh When creating a k8s cluster via kubeadm, the devmapper setup for containerd requires a different configuration. This commit introduces a new `kubeadm` option for the KUBERNETES variable and adjusts the path to the containerd config file for devmapper setup. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-10-29 14:19:42 +01:00
stevenhorsman	b1cffb4b09	Revert "tests: Add trap statement in kata doc script" This reverts commit `093a6fd542`, as it is breaking the static checks Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-10-29 09:57:18 +00:00
Aurélien Bombo	eb04caaf8f	Merge pull request #10074 from koct9i/log-vm-start-error runtime: log vm start error before cleanup	2024-10-28 14:39:00 -05:00
Fabiano Fidêncio	e675e233be	Merge pull request #10473 from fidencio/topic/build-cache-fix-shim-v2-root_hash.txt-location build: cache: Ensure shim-v2-root_hash.txt is in "${workdir}"	2024-10-28 16:53:06 +01:00
Fabiano Fidêncio	f19c8cbd02	build: cache: Ensure shim-v2-root_hash.txt is in "${workdir}" All the oras push logic happens from inside `${workdir}`, while the root_hash.txt extraction and renaming was not taking this into consideration. This was not caught during the manually triggered runs as those do not perform the oras push. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 15:17:16 +01:00
Steve Horsman	51bc71b8d9	Merge pull request #10466 from kata-containers/topic/ensure-shim-v2-sets-the-measured-rootfs-parameters-to-the-config re-enable measured rootfs build & tests	2024-10-28 13:11:50 +00:00
Fabiano Fidêncio	b70d7c1aac	tests: Enable measured rootfs tests for qemu-coco-dev Then it's on par with what's being tested with TEEs using a rootfs image. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:54 +01:00
Fabiano Fidêncio	d23d057ac7	runtime: Enable measured rootfs for qemu-coco-dev Let's make sure we are prepared to test this with non-TEE environments as well. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	7d202fc173	tests: Re-enable measured_rootfs test for TDX As we're now building everything needed to test TDX with measured rootfs support, let's bring this test back in (for TDX only, at least for now). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	d537932e66	build: shim-v2: Ensure MEASURED_ROOTFS is exported The approach taken for now is to export MEASURED_ROOTFS=yes on the workflow files for the architectures using confidential stuff, and leave the "normal" build without having it set (to avoid any change of expectation on the current behaviour). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	9c8b20b2bf	build: shim-v2: Rebuild if root_hashes do not match Let's make sure we take the root_hashes into consideration to decide whether the shim-v2 should or should not be used from the cached artefacts. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	9c84998de9	build: cache: Cache root_hash.txt used by the shim-v2 Let's cache the root_hash.txt from the confidential image so we can use them later on to decide whether there was a rootfs change that would require shim-v2 to be rebuilt. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	d2d9792720	build: Don't leave cached component behind if it can't be used Let's ensure we remove the component and any extra tarball provided by ORAS in case the cached component cannot be used. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	ef29824db9	runtime: Don't do measured rootfs for "vanilla" kernel We may decide to add this later on, but for now this is only targetting TEEs and the confidential image / initrd. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	a65946bcb0	workflows: build: Ensure rootfs is present for shim-v2 build Let's ensure that we get the already built rootfs tarball from previous steps of the action at the time we're building the shim-v2. The reason we do that is because the rootfs binary tarball has a root_hash.txt file that contains the information needed by the shim-v2 build scripts to add the measured rootfs arguments to the shim-v2 configuration files. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	6ea0369878	workflows: build: Ensure rootfs is built before shim-v2 As the rootfs will have what we need to add as part of the shim-v2 configuration files for measured rootfs, we must ensure this is built before shim-v2. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	13ea082531	workflows: Build rootfs after its deps are built By doing this we can just re-use the dependencies already built, saving us a reasonable amount of time. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	eb07a809ce	tests: Add a helper script to use prebuild components This is a helper script that does basically what's already being done by the s390x CI, which is: * Move a folder with the components that were stored / downloaded during the GHA execution to the expected `build` location * Get rid of the dependencies for a specific asset, as the dependencies are already pulled in from previous GHA steps For now this script is only being added but not yet executed anywhere, and that will come as the next step in this series. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:52 +01:00
Fabiano Fidêncio	c2b18f9660	workflows: Store rootfs dependencies So far we haven't been storing the rootfs dependencies as part of our workflows, but we better do it to re-use them as part of the rootfs build. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:52 +01:00
Steve Horsman	b5f503b0b5	Merge pull request #10471 from fidencio/topic/possibly-fix-release-workflow workflows: Possibly fix the release workflow	2024-10-28 11:38:33 +00:00
Konstantin Khlebnikov	ee50582848	runtime: log vm start error before cleanup Return of proper error to the initiator is not guaranteed. Method StopVM could kill shim process together with VM pieces. Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>	2024-10-28 11:21:21 +01:00
Fabiano Fidêncio	a8fad6893a	workflows: Possibly fix the release workflow The only reason we had this one passing for amd64 is because the check was done using the wrong variable (`matrix.stage`, while in the other workflows the variable used is `inputs.stage`). The commit that broke the release process is `67a8665f51`, which blindly copy & pasted the logic from the matrix assets. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 11:15:53 +01:00
Steve Horsman	ad5749fd6b	Merge pull request #10467 from stevenhorsman/release-3.10.1 release: Bump version to 3.10.1	2024-10-25 20:19:23 +01:00
stevenhorsman	b22d4429fb	release: Bump version to 3.10.1 Fix release to pick up #10463 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-10-25 17:16:09 +01:00
Steve Horsman	19ac0b24f1	Merge pull request #10463 from skaegi/rustjail_filemode_perm_fix agent: Correct rustjail device filemode permission typo	2024-10-25 14:27:50 +01:00
Fabiano Fidêncio	cc815957c0	Merge pull request #10461 from kata-containers/topic/workflows-follow-up-on-manually-triggered-job workflows: devel: Follow-up on the manually triggered jobs	2024-10-25 08:31:14 +02:00
Simon Kaegi	322846b36f	agent: Correct rustjail device filemode permission typo Corrects device filemode permissions typo/regression in rustjail to `666` instead of `066`. `666` is the standard and expected value for these devices in containers. Fixes: #10454 Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>	2024-10-24 16:46:40 -04:00
GabyCT	a9af46ccd2	Merge pull request #10452 from GabyCT/topic/katadoctemp tests: Add trap statement in kata doc script	2024-10-24 13:21:11 -06:00
Gabriela Cervantes	a3ef8c0a16	tests: Increase time to run stressng k8s tests This PR increases the time to run the stressng k8s tests for the CoCo stability CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-24 16:34:17 +00:00
Fabiano Fidêncio	475ad3e06b	workflows: devel: Allow running more than one at once More than one developer can and should be able to run this workflow at the same time, without cancelling the job started by another developer. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-24 15:38:35 +02:00
Fabiano Fidêncio	8f634ceb6b	workflows: devel: Adjust the pr-number Let's use "dev" instead of "manually-triggered" as it avoids the name being too long, which results in failures to create AKS clusters. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-24 15:38:31 +02:00
GabyCT	41d1178e4a	Merge pull request #10438 from GabyCT/topic/fixspellreadme docs: Fix misspelling in CI documentation	2024-10-23 13:34:52 -06:00
Steve Horsman	c5c389f473	Merge pull request #10449 from kata-containers/topic/add-workflows-specifically-for-testing Add a specific workflow for testing the CI, without messing up with the "nightly" weather	2024-10-23 19:03:49 +01:00
Gabriela Cervantes	093a6fd542	tests: Add trap statement in kata doc script This PR adds the trap statement into the kata doc script to clean up properly the temporary files. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-23 15:56:58 +00:00
Gabriela Cervantes	701891312e	docs: Fix misspelling in CI documentation This PR fixes a misspelling in CI documentation readme. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-23 15:42:08 +00:00
Fabiano Fidêncio	829415dfda	workflows: Remove the possibility to manually trigger the nightly CI As a new workflow was added for the cases where developers want to test their changes in the workflow itself, let's make sure we stop allowing manual triggers on this workflow, which can lead to a polluted / misleading weather of the CI. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-23 13:19:45 +02:00
Fabiano Fidêncio	cc093cdfdb	workflows: Add a manually trigger "devel" workflow for the CI This workflow is intended to replace the `workflow_dispatch` trigger currently present as part of the `ci-nightly.yaml`. The reasoning behind having this done in this way is because of our good and old GHA behaviour for `pull_request_target`, which requires a PR to be merged in order to check the changes in the workflow itself, which leads to: * when a change in a workflow is done, developers (should) do: * push their branch to the kata-containers repo * manually trigger the "nightly" CI in order to ensure the changes don't break anything * this can result in the "nightly" CI weather being polluted * we don't have the guarantee / assurance about the last n nightly runs anymore Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-23 13:14:50 +02:00
Greg Kurz	378f454fb9	Merge pull request #10208 from wtootw/main runtime: Failed to clean up resources when QEMU is terminated	2024-10-23 12:11:57 +02:00
Fabiano Fidêncio	ca416d8837	Merge pull request #10446 from kata-containers/topic/re-work-shim-v2-build-as-part-of-the-ci-and-release workflows: Ensure shim-v2 is built as the last asset	2024-10-23 09:27:29 +02:00
Fabiano Fidêncio	c082b99652	Merge pull request #10439 from microsoft/mahuber/azl-cfg-var tools: Change PACKAGES var for cbl-mariner	2024-10-23 08:39:49 +02:00
Manuel Huber	a730cef9cf	tools: Change PACKAGES var for cbl-mariner Change the PACKAGES variable for the cbl-mariner rootfs-builder to use the kata-packages-uvm meta package from packages.microsoft.com to define the set of packages to be contained in the UVM. This aligns the UVM build for the Azure Linux distribution with the UVM build done for the Kata Containers offering on Azure Kubernetes Services (AKS). Signed-off-by: Manuel Huber <mahuber@microsoft.com>	2024-10-22 23:11:42 +00:00
Fabiano Fidêncio	67a8665f51	workflows: Ensure shim-v2 is built as the last asset By doing this we can ensure that whenever the rootfs changes, we'll be able to get the new root_hash.txt and use it. This is the very first step to bring the measured rootfs tests back. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-22 14:56:37 +02:00
Greg Kurz	3de6d09a86	Merge pull request #10443 from gkurz/release-3.10.0 release: Bump VERSION to 3.10.0	2024-10-22 14:46:30 +02:00
Greg Kurz	3037303e09	release: Bump VERSION to 3.10.0 Let's start the 3.10.0 release. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-10-22 11:28:15 +02:00
wangyaqi54	cf4b81344d	runtime: Failed to clean up resources when QEMU is terminated by signal 15 When QEMU is terminated by signal 15, it deletes the PidFile. Upon detecting that QEMU has exited, the shim executes the stopVM function. If the PidFile is not found, the PID is set to 0. Subsequently, the shim executes `kill -9 0`, which terminates the current process group. This prevents any further logic from being executed, resulting in resources not being cleaned up. Signed-off-by: wangyaqi54 <wangyaqi54@jd.com>	2024-10-22 17:04:46 +08:00
Fabiano Fidêncio	4c34cfb0ab	Merge pull request #10420 from pmores/add-support-for-virtio-scsi runtime-rs: support virtio-scsi device in qemu-rs	2024-10-22 11:00:33 +02:00
Pavel Mores	8cdd968092	runtime-rs: support virtio-scsi device in qemu-rs Semantics are lifted straight out of the go runtime for compatibility. We introduce DeviceVirtioScsi to represent a virtio-scsi device and instantiate it if block device driver in the configuration file is set to virtio-scsi. We also introduce ObjectIoThread which is instantiated if the configuration file additionally enables iothreads. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-22 08:55:54 +02:00
Greg Kurz	91b874f18c	Merge pull request #10421 from Apokleos/hostname-bugfix kata-agent: fixing bug of unable setting hostname correctly.	2024-10-22 00:26:51 +02:00
alex.lyn	b25538f670	ci: Introduce CI to validate pod hostname Fixes #10422 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-10-21 16:32:56 +01:00
alex.lyn	3dabe0f5f0	kata-agent: fixing bug of unable setting hostname correctly. When update_container_namespaces updates the namespaces, setting all UTS (and IPC) namespace paths to None resulted in hostnames set prior to the update becoming ineffective. This was primarily due to an error made while aligning with the oci spec: in an attempt to match empty strings with None values in oci-spec-rs, all paths were incorrectly set to None. Fixes #10325 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-10-21 16:32:56 +01:00
Steve Horsman	98886a7571	Merge pull request #10437 from mkulke/mkulke/dont-parse-oci-image-for-cached-artifacts ci: don't parse oci image for cached artifacts	2024-10-21 16:31:23 +01:00
Magnus Kulke	e27d70d47e	ci: don't parse oci image for cached artifacts Moved the parsing of the oci image marker into its own step, since we only need to perform that for attestation purposes and some cached images might not have that file in the tarball. Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>	2024-10-21 14:50:00 +02:00
Magnus Kulke	9a33a3413b	Merge pull request #10433 from mkulke/mkulke/add-provenance-attestation-for-agent-builds ci: add provenance attestation for agent artifact	2024-10-18 15:00:18 +02:00
Anastassios Nanos	68d539f5c5	Merge pull request #10435 from nubificus/fix_fc_machineconfig runtime-rs: Use vCPU and memory values from config	2024-10-18 13:41:20 +01:00
Magnus Kulke	b93f5390ce	ci: add provenance attestation for agent artifact This adds provenance attestation logic for agent binaries that are published to an oci registry via ORAS. As a downstream consumer of the kata-agent binary the Peerpod project needs to verify that the artifact has been built on kata's CI. To create an attestation we need to know the exact digest of the oci artifact, at the point when the artifact was pushed. Therefore we record the full oci image as returned by oras push. The pushing and tagging logic has been slightly reworked to make this task less repetitive. The oras cli accepts multiple tags separated by comma on pushes, so a push can be performed atomically instead of iterating through tags and pushing each individually. This removes the risk of partially successful push operations (think: rate limits on the oci registry). So far the provenance creation has been only enabled for agent builds on amd64 and s390x. Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>	2024-10-18 10:24:00 +02:00
Anastassios Nanos	23f5786cca	runtime-rs: Use vCPU and memory values from config Use values from the config for the setup of the microVM. Fixes: #10434 Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2024-10-17 23:17:02 +01:00
GabyCT	4ae9317675	Merge pull request #10430 from GabyCT/topic/ciaz docs: Update CI documentation	2024-10-17 15:09:24 -06:00
GabyCT	b00203ba9b	Merge pull request #10428 from GabyCT/topic/archk8sc gha: Use a arch_to_golang variable to have uniformity	2024-10-17 11:00:59 -06:00
Chengyu Zhu	cca77f0911	Merge pull request #10412 from stevenhorsman/agent-config-rstest agent: config: Use rstest for unit tests	2024-10-17 23:01:21 +08:00
Gabriela Cervantes	e3efad8ed2	docs: Update CI documentation This PR updates the CI documentation referring to the several tests and the kinds of instances that run them. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-16 19:23:19 +00:00
stevenhorsman	4adb454ed0	agent: config: Use rstest for unit tests Use rstest for unit test rather than TestData arrays where possible to make the code more compact, easier to read and open the possibility to enhance test cases with a description more easily. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-10-16 16:55:44 +01:00
Gabriela Cervantes	f0e0c74fd4	gha: Use a arch_to_golang variable to have uniformity This PR replaces the arch uname -m to use the arch_to_golang variable in the script to have a better uniformity across the script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-15 20:03:09 +00:00
Dan Mihai	69509eff33	Merge pull request #10417 from microsoft/danmihai1/k8s-inotify.bats tests: k8s-inotify.bats improvements	2024-10-15 11:22:53 -07:00
Dan Mihai	ece0f9690e	tests: k8s-inotify: longer pod termination timeout inotify-configmap-pod.yaml is using: "inotifywait --timeout 120", so wait for up to 180 seconds for the pod termination to be reported. Hopefully, some of the sporadic errors from #10413 will be avoided this way: not ok 1 configmap update works, and preserves symlinks waitForProcess "${wait_time}" "$sleep_time" "${command}" failed Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-15 16:01:25 +00:00
Dan Mihai	ccfb7faa1b	tests: k8s-inotify.bats: don't leak configmap Delete the configmap if the test failed, not just on the successful path. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-15 16:01:25 +00:00
Aurélien Bombo	f13d13c8fa	Merge pull request #10416 from microsoft/danmihai1/mariner_static_sandbox_resource_mgmt ci: static_sandbox_resource_mgmt for cbl-mariner	2024-10-15 10:40:17 -05:00
Aurélien Bombo	c371b4e1ce	Merge pull request #10426 from 3u13r/fix/genpolicy/handle-config-map-binary-data genpolicy: read binaryData value as String	2024-10-14 21:31:23 -05:00
Leonard Cohnen	c06bf2e3bb	genpolicy: read binaryData value as String While Kubernetes defines `binaryData` as `[]byte`, when defined in a YAML file the raw bytes are base64 encoded. Therefore, we need to read the YAML value as `String` and not as `Vec<u8>`. Fixes: #10410 Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2024-10-14 20:03:11 +02:00
Aurélien Bombo	f9b7a8a23c	Merge pull request #10402 from Sumynwa/sumsharma/agent-ctl-dependencies ci: Install build dependencies for building agent-ctl with image pull.	2024-10-14 10:28:32 -05:00
Sumedh Alok Sharma	bc195d758a	ci: Install build dependencies for building agent-ctl with image pull. Adds dependencies of 'clang' & 'protobuf' to be installed in runners when building agent-ctl sources having image pull support. Fixes #10400 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-10-14 10:36:04 +05:30
Aurélien Bombo	614e21ccfb	Merge pull request #10415 from GabyCT/topic/egreptim tools/osbuilder/tests: Remove egrep in test images script	2024-10-11 13:47:30 -05:00
Gabriela Cervantes	aae654be80	tools/osbuilder/tests: Remove egrep in test images script This PR removes the egrep command, as it has been deprecated, and replaces it with grep in the test images script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-11 17:23:35 +00:00
Dan Mihai	3622b5e8b4	ci: static_sandbox_resource_mgmt for cbl-mariner Use the configuration used by AKS (static_sandbox_resource_mgmt=true) for CI testing on Mariner hosts. Hopefully pod startup will become more predictable on these hosts - e.g., by avoiding the occasional hotplug timeouts described by #10413. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-10 22:17:39 +00:00
Fabiano Fidêncio	02f5fd94bd	Merge pull request #10409 from fidencio/topic/ci-add-ita_image-and-ita_image_tag kbs: ita: Ensure the proper image / image_tag is used for ITA	2024-10-10 11:46:26 +02:00
Fabiano Fidêncio	cf5d3ed0d4	kbs: ita: Ensure the proper image / image_tag is used for ITA When dealing with a specific release, it was easier to just do some adjustments on the image that has to be used for ITA without actually adding a new entry in the versions.yaml. However, it's been proven to be more complicated than that when it comes to dealing with staged images, and we better explicitly add (and update) those versions altogether to avoid CI issues. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-10 10:01:33 +02:00
Steve Horsman	0c4a7c8771	Merge pull request #10406 from ChengyuZhu6/fix-unit agent:cdh: fix unit tests about sealed secret	2024-10-10 08:57:28 +01:00
Fabiano Fidêncio	3f7ce1d620	Merge pull request #10401 from stevenhorsman/kbs-deploy-overlays-update Kbs deploy overlays update	2024-10-10 09:50:19 +02:00
Fabiano Fidêncio	036b04094e	Merge pull request #10397 from fidencio/topic/build-remove-initrd-mariner-target build: mariner: Remove the ability to build the marine initrd	2024-10-10 09:44:36 +02:00
ChengyuZhu6	65ecac5777	agent:cdh: fix unit tests about sealed secret The root cause is that the CDH client is a global variable, and unit tests `test_unseal_env` and `test_unseal_file` share this lock-free global variable, leading to resource contention and destruction. Merging the two unit tests into one test_sealed_secret will resolve this issue. Fixes: #10403 Signed-off-by: ChengyuZhu6 <zhucy0405@gmail.com>	2024-10-10 08:38:06 +08:00
ChengyuZhu6	a992feb7f3	Revert "Revert "agent:cdh: unittest for sealed secret as file"" This reverts commit `b5142c94b9`. Signed-off-by: ChengyuZhu6 <zhucy0405@gmail.com>	2024-10-10 08:37:06 +08:00
GabyCT	0cda92c6d8	Merge pull request #10407 from GabyCT/topic/fixbuildk packaging: Remove unused variable in build kernel script	2024-10-09 16:53:45 -06:00
Gabriela Cervantes	616eb8b19b	packaging: Remove unused variable in build kernel script This PR removes an unused variable in the build kernel script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-09 20:02:56 +00:00
Fabiano Fidêncio	652ba30d4a	build: mariner: Remove the ability to build the marine initrd As mariner has switched to using an image instead of an initrd, let's just drop the ability to build the initrd and avoid keeping something in the tree that won't be used. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-09 21:42:55 +02:00
Fabiano Fidêncio	59e3ab07e4	Merge pull request #10396 from fidencio/topic/ci-mariner-test-using-mariner-image-instead-of-initrd ci: mariner: Use the image instead of the initrd	2024-10-09 21:39:44 +02:00
stevenhorsman	b2fb19f8f8	versions: Bump KBS version Bump to the commit that had the overlays changes we want to adapt to. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-10-09 17:49:21 +01:00
Fabiano Fidêncio	01a957f7e1	ci: mariner: Stop building mariner initrd As the mariner image is already in place, and the tests were modified to use them (as part of this series), let's just stop building it as part of the CI. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-09 18:23:35 +02:00
Fabiano Fidêncio	091ad2a1b2	ci: mariner: Ensure kernel_params can be set The reason we're doing this is because mariner image uses, by default, cgroups default-hierarchy as `unified` (aka, cgroupsv2). In order to keep the same initrd behaviour for mariner, let's enforce that `SYSTEMD_CGROUP_ENABLE_LEGACY_FORCE=1 systemd.legacy_systemd_cgroup_controller=yes systemd.unified_cgroup_hierarchy=0` is passed to the kernel cmdline, at least for now. Other tests that are setting `kernel_params` are not running on mariner, then we're safe taking this path as it's done as part of this PR. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-09 18:23:35 +02:00
Fabiano Fidêncio	3bbf3c81c2	ci: mariner: Use the image instead of the initrd As an image has been added for mariner as part of the commit `63c1f81c2`, let's start using it in the CI, instead of using the initrd. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-09 18:23:32 +02:00
Fabiano Fidêncio	9c0c159b25	Merge pull request #10404 from fidencio/topic/rever-sealed-secrets-tests Revert "agent:cdh: unittest for sealed secret as file"	2024-10-09 18:09:09 +02:00
GabyCT	2035d638df	Merge pull request #10388 from GabyCT/topic/testimtemp tools/osbuilder/tests: Add trap statement in test images script	2024-10-09 09:49:45 -06:00
Fabiano Fidêncio	b5142c94b9	Revert "agent:cdh: unittest for sealed secret as file" This reverts commit `31e09058af`, as it's breaking the agent unit tests CI. This is a stop gap till Chengyu Zhu finds the time to properly address the issue, avoiding the CI to be blocked for now. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-09 16:06:09 +02:00
stevenhorsman	8763880e93	tests/k8s: kbs: Update overlays logic In https://github.com/confidential-containers/trustee/pull/521 the overlays logic was modified to add non-SE s390x support and simplify non-ibm-se platforms. We need to update the logic in `kbs_k8s_deploy` to match and can remove the dummying of `IBM_SE_CREDS_DIR` for non-SE now Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-10-09 09:39:41 +01:00
Gabriela Cervantes	e08749ce58	tools/osbuilder/tests: Add trap statement in test images script This PR adds the trap statement in the test images script to clean up tmp files. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-08 19:54:23 +00:00
Fabiano Fidêncio	80196c06ad	Merge pull request #10390 from microsoft/danmihai1/new-rootfs-image-mariner local-build: add ability to build rootfs-image-mariner	2024-10-08 21:40:43 +02:00
Fabiano Fidêncio	083b2f24d8	Merge pull request #10363 from ChengyuZhu6/secret-as-volume Support Confidential Sealed Secrets (as volume)	2024-10-08 19:23:40 +02:00
Dan Mihai	63c1f81c23	local-build: add rootfs-image-mariner Kata CI will start testing the new rootfs-image-mariner instead of the older rootfs-initrd-mariner image. The "official" AKS images are moving from a rootfs-initrd-mariner format to the rootfs-image-mariner format. Making the same change in Kata CI is useful to keep this testing in sync with the AKS settings. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-08 17:15:56 +00:00
GabyCT	7a38cce73c	Merge pull request #10383 from kata-containers/topic/imagevar image-builder: Remove unused variable	2024-10-08 10:27:03 -06:00
Aurélien Bombo	e56af7a370	Merge pull request #10389 from emanuellima1/fix-agent-policy build: Fix RPM build fail due to AGENT_POLICY	2024-10-08 09:59:21 -05:00
ChengyuZhu6	a94024aedc	tests: add test for sealed file secrets add a test for sealed file secrets. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-10-08 16:01:48 +08:00
ChengyuZhu6	fe307303c8	agent:rpc: Refactor CDH-related operations Refactor CDH-related operations into the cdh_handler function to make the `create_container` code clearer. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-10-08 16:01:48 +08:00
ChengyuZhu6	31e09058af	agent:cdh: unittest for sealed secret as file add unittest for sealed secret as file. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-10-08 16:01:48 +08:00
ChengyuZhu6	974d6b0736	agent:cdh: initialize cdhclient with the input cdh socket uri Refactor cdh code to initialize cdhclient with the input cdh socket uri. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-10-08 14:58:07 +08:00
ChengyuZhu6	1f33fd4cd4	agent:rpc: handle the sealed secret in createcontainer Users must set the mount path to `/sealed/<path>` for kata agent to detect the sealed secret mount and handle it in createcontainer stage. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-10-08 14:58:07 +08:00
ChengyuZhu6	da281b4444	agent:cdh: support to unseal secret as file Introduced `unseal_file` function to unseal secret as files: - Implemented logic to handle symlinks and regular files within the sealed secret directory. - For each entry, call CDH to unseal secrets and the unsealed contents are written to a new file, and a symlink is created to replace the sealed symlink. Fixes: #8123 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-10-08 14:58:07 +08:00
Fabiano Fidêncio	71d0c46e0a	Merge pull request #10384 from microsoft/danmihai1/virtio-fs-policy tests: k8s: AUTO_GENERATE_POLICY=yes for local testing	2024-10-07 21:25:52 +02:00
Emanuel Lima	e989e7ee4e	build: Fix RPM build fail due to AGENT_POLICY By checking for AGENT_POLICY we ensure we only try to read allow-all.rego if AGENT_POLICY is set to "yes" Signed-off-by: Emanuel Lima <emlima@redhat.com>	2024-10-07 15:43:23 -03:00
Dan Mihai	6d5fc898b8	tests: k8s: AUTO_GENERATE_POLICY=yes for local testing The behavior of Kata CI doesn't change. For local testing using kubernetes/gha-run.sh and AUTO_GENERATE_POLICY=yes: 1. Before these changes users were forced to use: - SEV, SNP, or TDX guests, or - KATA_HOST_OS=cbl-mariner 2. After these changes users can also use other platforms that are configured with "shared_fs = virtio-fs" - e.g., - KATA_HOST_OS=ubuntu + KATA_HYPERVISOR=qemu Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-04 18:26:00 +00:00
Dan Mihai	5aaef8e6eb	Merge pull request #10376 from microsoft/danmihai1/auto-generate-just-for-ci gha: enable AUTO_GENERATE_POLICY where needed	2024-10-04 10:52:31 -07:00
Gabriela Cervantes	4cd737d9fd	image-builder: Remove unused variable This PR removes an unused variable in the image builder script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-04 15:56:28 +00:00
Greg Kurz	77c5db6267	Merge pull request #9637 from ldoktor/selective-ci CI: Select jobs by touched code	2024-10-04 11:29:05 +02:00
GabyCT	2d089d9695	Merge pull request #10381 from GabyCT/topic/archrootfs osbuilder: Remove duplicated arch variable definition	2024-10-03 14:48:08 -06:00
Wainer Moschetta	b9025462fb	Merge pull request #10134 from ldoktor/ci-sort-range ci.ocp: Sort images according to git	2024-10-03 15:08:41 -03:00
Chelsea Mafrica	9138f55757	Merge pull request #10375 from GabyCT/topic/mktempkbs k8s:kbs: Add trap statement to clean up tmp files	2024-10-03 12:32:30 -04:00
Gabriela Cervantes	d7c2b7d13c	osbuilder: Remove duplicated arch variable definition This PR removes duplicated arch variable definition in the rootfs script as this variable and its value is already defined at the top of the script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-03 16:22:27 +00:00
Greg Kurz	96336d141b	Merge pull request #10165 from pmores/add-network-device-hotplugging runtime-rs: add network device hotplugging to qemu-rs	2024-10-03 17:44:50 +02:00
Pavel Mores	23927d8a94	runtime-rs: plug in netdev hotplugging functionality and actually call it add_device() now checks if QEMU is running already by checking if we have a QMP connection. If we do a new function hotplug_device() is called which hotplugs the device if it's a network one. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-03 11:23:10 +02:00
Pavel Mores	ac393f6316	runtime-rs: implement netdev hotplugging for qemu-rs With the helpers from previous commit, the actual hotplugging implementation, though lengthy, is mostly just assembling a QMP command to hotplug the network device backend and then doing the same for the corresponding frontend. Note that hotplug_network_device() takes cmdline_generator types Netdev and DeviceVirtioNet. This is intentional and aims to take advantage of the similarity between parameter sets needed to coldplug and hotplug devices to reuse and simplify our code. To enable using the types from qmp, accessors were added as needed. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-03 11:20:02 +02:00
Pavel Mores	4eb7e2966c	runtime-rs: add netdev hotplugging helpers to qemu-rs Before adding network device hotplugging functionality itself we add a couple of helpers in a separate commit since their functionality is non-trivial. To hotplug a device we need a free PCI slot. We add find_free_slot() which can be called to obtain one. It looks for PCI bridges connected to the root bridge and looks for an unoccupied slot on each of them. The first found is returned to the caller. The algorithm explicitly doesn't support any more complex bridge hierarchies since those are never produced when coldplugging PCI bridges. Sending netdev queue and vhost file descriptors to QEMU is slightly involved and implemented in pass_fd(). The actual socket has to be passed in an SCM_RIGHTS socket control message (also called ancillary data, see man 3 cmsg) so we have to use the msghdr structure and sendmsg() call (see man 2 sendmsg) to send the message. Since qapi-rs doesn't support sending messages with ancillary data we have to do the sending sort of "under it", manually, by retrieving qapi-rs's socket and using it directly. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-03 11:15:31 +02:00
Pavel Mores	3f46dfcf2f	runtime-rs: don't treat NetworkConfig::index as unique in qemu-rs NetworkConfig::index has been used to generate an id for a network device backend. However, it turns out that it's not unique (it's always zero as confirmed by a comment at its definition) so it's not suitable to generate an id that needs to be unique. Use the host device name instead. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-03 11:12:37 +02:00
Pavel Mores	cda04fa539	runtime-rs: factor setup of network device out of QemuCmdLine Network device hotplugging will use the same infrastructure (Netdev, DeviceVirtioNet) as coldplugging, i.e. QemuCmdLine. To make the code of network device setup visible outside of QemuCmdLine we factor it out to a non-member function `get_network_device()` and make QemuCmdLine just delegate to it. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-03 11:03:32 +02:00
Pavel Mores	efc8e93bfe	runtime-rs: factor bus_type() out of QemuCmdLine The function takes a whole QemuCmdLine but only actually uses HypervisorConfig. We increase callability of the function by limiting its interface to what it needs. This will come in handy shortly. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-03 11:03:32 +02:00
Pavel Mores	720265c2d8	runtime-rs: support adding PCI bridges to qemu VM At least one PCI bridge is necessary to hotplug PCI devices. We only support PCI (at this point at least) since that's what the go runtime does (note that looking at the code in virtcontainers it might seem that other bus types are supported, however when the bridge objects are passed to govmm, all but PCI bridges are actually ignored). The entire logic of bridge setup is lifted from runtime-go for compatibility's sake. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-03 11:03:32 +02:00
Lukáš Doktor	63b6e8a215	ci: Ensure we check the latest workflow run in gatekeeper with multiple iterations/reruns we need to use the latest run of each workflow. For that we can use the "run_id" and only update results of the same or newer run_ids. To do that we need to store the "run_id". To avoid adding individual attributes this commit stores the full job object that contains the status, conclusion as well as other attributes of the individual jobs, which might come in handy in the future in exchange for slightly bigger memory overhead (still we only store the latest run of required jobs only). Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-10-03 09:10:45 +02:00
Lukáš Doktor	2ae090b44b	ci: Add extra gatekeeper debug output to stderr which might be useful to assess the amount of queries. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-10-03 09:08:35 +02:00
Lukáš Doktor	2440a39c50	ci: Check required labels before checking tests in gatekeeper some tests require certain labels before they are executed. When our PR is not labeled appropriately the gatekeeper detects skipped required tests and reports a failure. With this change we add "required-labeles" to the tests mapping and check the expected labels first informing the user about the missing labels before even checking the test statuses. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-10-03 09:08:35 +02:00
Lukáš Doktor	dd2878a9c8	ci: Unify character for separating items the test names are using `;` and regexps were designed to use `,` but during development simply joined the expressions by `\|`. This should work but might be confusing so let's go with the semi-colon separator everywhere. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-10-03 09:08:35 +02:00
Wainer dos Santos Moschetta	fdcfac0641	workflows/gatekeeper: export COMMIT_HASH variable The Github SHA of triggering PR should be exported in the environment so that gatekeeper can fetch the right workflows/jobs. Note: by default github will export GITHUB_SHA in the job's environment but that value cannot be used if the gatekeeper was triggered from a pull_request_target event, because the SHA corresponds to the push branch. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-10-03 09:08:35 +02:00
Wainer dos Santos Moschetta	4abfc11b4f	workflows/gatekeeper: configure concurrency properly This will allow to cancel-in-progress the gatekeeper jobs. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com> Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-10-03 09:08:35 +02:00
Lukáš Doktor	5c1cea1601	ci: Select jobs by touched code to allow selective testing as well as selective list of required tests let's add a mapping of required jobs/tests in "skips.py" and a "gatekeeper" workflow that will ensure the expected required jobs were successful. Then we can only mark the "gatekeeper" as the required job and modify the logic to suit our needs. Fixes: #9237 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-10-03 09:08:33 +02:00
Dan Mihai	1a4928e710	gha: enable AUTO_GENERATE_POLICY where needed The behavior of Kata CI doesn't change. For local testing using kubernetes/gha-run.sh: 1. Before these changes: - AUTO_GENERATE_POLICY=yes was always used by the users of SEV, SNP, TDX, or KATA_HOST_OS=cbl-mariner. 2. After these changes: - Users of SEV, SNP, TDX, or KATA_HOST_OS=cbl-mariner must specify AUTO_GENERATE_POLICY=yes if they want to auto-generate policy. - These users have the option to test just using hard-coded policies (e.g., using the default policy built into the Guest rootfs) by using AUTO_GENERATE_POLICY=no. AUTO_GENERATE_POLICY=no is the default value of this env variable. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-02 23:20:33 +00:00
Gabriela Cervantes	973b8a1d8f	k8s:kbs: Add trap statement to clean up tmp files This PR adds the trap statement in the confidential kbs script to clean up temporary files and ensure we are not leaving them behind. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-02 19:59:08 +00:00
Steve Horsman	8412c09143	Merge pull request #10371 from fidencio/topic/k8s-tdx-re-enable-empty-dir-tests k8s: tests: Re-enable empty-dirs tests for TDX / coco-qemu-dev	2024-10-02 18:41:19 +01:00
Dan Mihai	9a8341f431	Merge pull request #10370 from microsoft/danmihai1/k8s-policy-rc tests: k8s-policy-rc: remove default UID from YAML	2024-10-02 09:32:17 -07:00
GabyCT	a1d380305c	Merge pull request #10369 from GabyCT/topic/egrepfastf metrics: Update fast footprint script to use grep	2024-10-02 10:10:12 -06:00
Fabiano Fidêncio	b3ed7830e4	k8s: tests: Re-enable empty-dirs tests for TDX / coco-qemu-dev The test is disabled for qemu-coco-dev / qemu-tdx, but it doesn't seem to actually be failing on those. Plus, it's passing on SEV / SNP, which means that we most likely missed re-enabling this one in the past. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-01 20:51:01 +02:00
Hyounggyu Choi	b179598fed	Merge pull request #10374 from BbolroC/skip-block-volume-qemu-runtime-rs tests: Skip k8s-block-volume.bats for qemu-runtime-rs	2024-10-01 19:45:10 +02:00
Lukáš Doktor	820e000f1c	ci.ocp: Sort images according to git The quay.io registry returns the tags sorted alphabetically and doesn't seem to provide a way to sort it by age. Let's use "git log" to get all changes between the commits and print all tags that were actually pushed. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-10-01 16:08:00 +02:00
Hyounggyu Choi	4ccf1f29f9	tests: Skip k8s-block-volume.bats for qemu-runtime-rs Currently, `qemu-runtime-rs` does not support `virtio-scsi`, which causes the `k8s-block-volume.bats` test to fail. We should skip this test until `virtio-scsi` is supported by the runtime. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-10-01 09:09:47 +02:00
Dan Mihai	3b24219310	tests: k8s-policy-rc: remove default UID from YAML The nginx container seems to error out when using UID=123. Depending on the timing between container initialization and "kubectl wait", the test might have gotten lucky and found the pod briefly in Ready state before nginx errored out. But on some of the nodes, the pod never got reported as Ready. Also, don't block in "kubectl wait --for=condition=Ready" when wrapping that command in a waitForProcess call, because waitForProcess is designed for short-lived commands. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-01 00:10:30 +00:00
Saul Paredes	94bc54f4d2	Merge pull request #10340 from microsoft/saulparedes/validate_create_sandbox_storages genpolicy: validate create sandbox storages	2024-09-30 14:24:56 -07:00
Aurélien Bombo	b49800633d	Merge pull request #7165 from sprt/k8s-block-volume-test tests: Add `k8s-block-volume` test to GHA CI	2024-09-30 13:26:18 -07:00
Dan Mihai	7fe44d3a3d	genpolicy: validate create sandbox storages Reject any unexpected values from the CreateSandboxRequest storages field. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-30 11:31:12 -07:00
Gabriela Cervantes	52ef092489	metrics: Update fast footprint script to use grep This PR updates the fast footprint script to remove the use of egrep as this command has been deprecated and change it to use grep command. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-30 17:43:08 +00:00
Aurélien Bombo	c037ac0e82	tests: Add k8s-block-volume test This imports the k8s-block-volume test from the tests repo and modifies it slightly to set up the host volume on the AKS host. This is a follow-up to #7132. Fixes: #7164 Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com> Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-09-30 10:58:30 -05:00
Alex Lyn	dfd0ca9bfe	Merge pull request #10312 from sidneychang/configurable-build-dragonball runtime-rs: Add Configurable Compilation for Dragonball in Runtime-rs	2024-09-29 22:33:54 +08:00
GabyCT	6a9e3ccddf	Merge pull request #10305 from GabyCT/topic/ita ci:tdx: Use an ITA key for TDX	2024-09-27 16:44:53 -06:00
Fabiano Fidêncio	66bcfe7369	k8s: kbs: Properly delete ita kustomization The ita kustomization for Trustee, as well as previously used one (DCAP), doesn't have a $(uname -m) directory after the deployment directory name. Let's follow the same logic used for the deploy-kbs script and clean those up accordingly. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-27 21:47:29 +02:00
Gabriela Cervantes	bafa527be0	ci: tdx: Test attestation with ITTS Intel Tiber Trust Services (formerly known as Intel Trust Authority) is Intel's own attestation service, and we want to take advantage of the TDX CI in order to ensure ITTS works as expected. In order to do so, let's replace the former method used (DCAP) to use ITTS instead. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-27 21:47:25 +02:00
GabyCT	36750b56f1	Merge pull request #10342 from GabyCT/topic/updevguide docs: Remove qemu information no longer valid	2024-09-27 11:15:11 -06:00
Fabiano Fidêncio	86b8c53d27	Merge pull request #10357 from fidencio/topic/add-ita-secret gha: Add ita_key as a github secret	2024-09-27 17:40:41 +02:00
Gabriela Cervantes	d91979d7fa	gha: Add ita_key as a github secret This PR adds ita_key as a github secret at the kata coco tests yaml workflow. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-27 17:15:22 +02:00
Xuewei Niu	ad0f2b2a55	Merge pull request #10219 from sidneychang/decouple-runtime-rs-from-dragonball runtime-rs: Port TAP implementation from dragonball	2024-09-27 11:17:55 +08:00
Xuewei Niu	11b1a72442	Merge pull request #10349 from lifupan/main_nsandboxapi sandbox: refactor the sandbox init process	2024-09-27 11:10:45 +08:00
Xuewei Niu	3911bd3108	Merge pull request #10351 from lifupan/main_agent agent: fix the issue of setup sandbox pidns	2024-09-27 10:49:47 +08:00
Fupan Li	f7bc627a86	sandbox: refactor the sandbox init process In order to support sandbox api, introduce the sandbox_config struct and split the sandbox start stage from init process. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-09-26 23:50:24 +08:00
Hyounggyu Choi	b1275bed1b	Merge pull request #10346 from BbolroC/minor-improvement-k8s-tests tests: Minor improvement k8s tests	2024-09-26 17:01:32 +02:00
Hyounggyu Choi	01d460ac63	tests: Add teardown_common() to tests_common.sh There are many similar or duplicated code patterns in `teardown()`. This commit consolidates them into a new function, `teardown_common()`, which is now called within `teardown()`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-26 13:56:36 +02:00
Hyounggyu Choi	e8d1feb25f	tests: Validate node name for exec_host() The current `exec_host()` accepts a given node name and creates a node debugger pod, even if the name is invalid. This could result in the creation of an unnecessary pending pod (since we are using nodeAffinity; if the given name does not match any actual node names, the pod won’t be scheduled), which wastes resources. This commit introduces validation for the node name to prevent this situation. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-26 13:20:50 +02:00
Xuewei Niu	3a7f9595b6	Merge pull request #10318 from lsc2001/ci-add-docker ci: Enable basic docker tests for runtime-rs	2024-09-26 17:41:09 +08:00
Xuewei Niu	cb5a2b30e9	Merge pull request #10293 from lsc2001/solve-docker-compatibility runtime-rs: Notify containerd when process exits	2024-09-26 14:51:20 +08:00
Sicheng Liu	e4733748aa	ci: Enable basic docker tests for runtime-rs This commit enables basic amd64 tests of docker for runtime-rs by adding vmm types "dragonball" and "cloud-hypervisor". Signed-off-by: Sicheng Liu <lsc2001@outlook.com>	2024-09-26 06:27:05 +00:00
Sicheng Liu	08eb5fc7ff	runtime-rs: Notify containerd when process exits Docker cannot exit normally after the container process exits when used with runtime-rs since it doesn't receive the exit event. This commit enables runtime-rs to send TaskExit to containerd after process exits. Also, it moves "system_time_into" and "option_system_time_into" from crates/runtimes/common/src/types/trans_into_shim.rs to a new utility mod. Signed-off-by: Sicheng Liu <lsc2001@outlook.com>	2024-09-26 02:52:50 +00:00
Fupan Li	71afeccdf1	agent: fix the issue of setup sandbox pidns When the sandbox api was enabled, the pause container wouldn't be created, thus the shared sandbox pidns should fall back to the first container's init process, instead of returning an error here. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-09-26 10:21:25 +08:00
Xuewei Niu	857222af02	Merge pull request #10330 from lifupan/main_sandboxapi Some prepared work for sandbox api support	2024-09-26 09:47:47 +08:00
Hyounggyu Choi	caf3b19505	Merge pull request #10348 from BbolroC/delete-node-debugger-by-trap tests: Delete custom node debugger pod on EXIT	2024-09-25 23:39:43 +02:00
Hyounggyu Choi	57e8cbff6f	tests: Delete custom node debugger pod on EXIT It was observed that the custom node debugger pod is not cleaned up when a test times out. This commit ensures the pod is cleaned up by triggering the cleanup on EXIT, preventing any debugger pods from being left behind. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-25 20:36:05 +02:00
Fabiano Fidêncio	edf4ca4738	Merge pull request #10345 from ldoktor/kata-webhook ci: Reorder webhook deployment	2024-09-25 18:16:46 +02:00
Fabiano Fidêncio	09ed9c5c50	Merge pull request #10328 from BbolroC/improve-negative-tests tests: Improve k8s negative tests	2024-09-25 18:16:28 +02:00
Xuewei Niu	e1825c2ef3	Merge pull request #9977 from l8huang/dan-2-vfio runtime: add DAN support for VFIO network device in Go kata-runtime	2024-09-25 10:11:38 +08:00
Lei Huang	39b0e9aa8f	runtime: add DAN support for VFIO network device in Go kata-runtime When using network adapters that support SR-IOV, a VFIO device can be plugged into a guest VM and claimed as a network interface. This can significantly enhance network performance. Fixes: #9758 Signed-off-by: Lei Huang <leih@nvidia.com>	2024-09-24 09:53:28 -07:00
Hyounggyu Choi	c70588fafe	tests: Use custom-node-debugger pod With #10232 merged, we now have a persistent node debugger pod throughout the test. As a result, there’s no need to spawn another debugger pod using `kubectl debug`, which could lead to false negatives due to premature pod termination, as reported in #10081. This commit removes the `print_node_journal()` call that uses `kubectl debug` and instead uses `exec_host()` to capture the host journal. The `exec_host()` function is relocated to `tests/integration/kubernetes/lib.sh` to prevent cyclical dependencies between `tests_common.sh` and `lib.sh`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-24 17:25:24 +02:00
Lukáš Doktor	8355eee9f5	ci: Reorder webhook deployment in `b9d88f74ed` the `runtime_class` CM was added which overrides the one we previously set. Let's reorder our logic to first deploy webhook and then override the default CM in order to use the one we really want. Since we need to change dirs we also have to use realpath to ensure the files are located well. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-09-24 17:01:28 +02:00
Hyounggyu Choi	2c2941122c	tests: Fail fast in assert_pod_fail() `assert_pod_fail()` currently calls `k8s_create_pod()` to ensure that a pod does not become ready within the default 120s. However, this delays the test's completion even if an error message is detected earlier in the journal. This commit removes the use of `k8s_create_pod()` and modifies `assert_pod_fail()` to fail as soon as the pod enters a failed state. All failing pods end up in one of the following states: - CrashLoopBackOff - ImagePullBackOff The function now polls the pod's state every 5 seconds to check for these conditions. If the pod enters a failed state, the function immediately returns 0. If the pod does not reach a failed state within 120 seconds, it returns 1. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-24 16:09:20 +02:00
Gabriela Cervantes	6a8b137965	docs: Remove qemu information no longer valid This PR removes some qemu information which is no longer valid as this is referring to the tests repository and to kata 1.x. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-23 16:58:24 +00:00
Aurélien Bombo	e738054ddb	Merge pull request #10311 from pawelpros/pproskur/fixyq ci: don't require sudo for yq if already installed	2024-09-23 08:57:11 -07:00
Alex Lyn	6b94cc47a8	Merge pull request #10146 from Apokleos/intro-cdi Introduce cdi in runtime-rs	2024-09-23 21:45:42 +08:00
Alex Lyn	b8ba346e98	runtime-rs: Add test for container devices with CDI. Fixes #10145 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-09-23 17:20:22 +08:00
Steve Horsman	0e0cb24387	Merge pull request #10329 from Bickor/webhook-check tools.kata-webhook: Specify runtime class using configMap	2024-09-23 09:59:12 +01:00
Steve Horsman	6f0b3eb2f9	Merge pull request #10337 from stevenhorsman/update-release-process-post-3.9.0 doc: Update the release process	2024-09-23 09:55:57 +01:00
Hyounggyu Choi	8a893cd4ee	Merge pull request #10232 from BbolroC/fix-loop-device-for-exec_host tests: Fix loop device handling for exec_host()	2024-09-23 08:15:03 +02:00
Fupan Li	f1f5bef9ef	Merge pull request #10339 from lifupan/main_fix runtime-rs: fix the issue of using block_on	2024-09-23 09:28:40 +08:00
Fupan Li	52397ca2c1	sandbox: rename the task_service to service rename the task_service to service, in order to cooperate with the sandbox services added in the following commits. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-09-22 14:44:19 +08:00
Fupan Li	20b4be0225	runtime-rs: rename the Request/Response to TaskRequest/TaskResponse In order to distinguish them from the sandbox request/response, this commit changes the task request/response to TaskRequest/TaskResponse. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-09-22 14:44:11 +08:00
Fupan Li	ba94eed891	sandbox: fix the issue of hypervisor's wait_vm Since wait_vm would be called before stop_vm and would take the reader lock, it blocked stop_vm from getting the writer lock, which would trigger a deadlock. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-09-22 14:44:03 +08:00
Fupan Li	fb27de3561	runtime-rs: fix the issue of using block_on Since block_on would block the current thread, preventing other async tasks from running on this worker thread, change it to use an async task instead. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-09-22 14:40:44 +08:00
Aurélien Bombo	79a3b4e2e5	Merge pull request #10335 from kata-containers/sprt/fix-kata-deploy-docs kata-deploy: clean up and fix docs for k0s	2024-09-20 13:33:14 -07:00
stevenhorsman	4f745f77cb	doc: Update the release process - Reflect the need to update the versions in the Helm Chart - Add the lock branch instruction - Add clarity about the permissions needed to complete tasks Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-20 19:04:33 +01:00
Aurélien Bombo	78c63c7951	kata-deploy: clean up and fix docs for k0s * Clarifies instructions for k0s. * Adds kata-deploy step for each cluster type. * Removes the old kata-deploy-stable step for vanilla k8s. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-09-20 11:59:40 -05:00
sidney chang	456e13db98	runtime-rs: Add Configurable Compilation for Dragonball in Runtime-rs rename DEFAULT_HYPERVISOR to HYPERVISOR in Makefile Fixes #10310 Signed-off-by: sidney chang <2190206983@qq.com>	2024-09-20 05:41:34 -07:00
sidneychang	b85a886694	runtime-rs: Add Configurable Compilation for Dragonball in Runtime-rs This PR introduces support for selectively compiling Dragonball in runtime-rs. By default, Dragonball will continue to be compiled into the containerd-shim-kata-v2 executable, but users now have the option to disable Dragonball compilation. Fixes #10310 Signed-off-by: sidney chang <2190206983@qq.com>	2024-09-20 05:38:59 -07:00
Hyounggyu Choi	2d6ac3d85d	tests: Re-enable guest-pull-image tests for qemu-coco-dev Now that the issue with handling loop devices has been resolved, this commit re-enables the guest-pull-image tests for `qemu-coco-dev`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Hyounggyu Choi	c6b86e88e4	tests: Increase timeouts for qemu-coco-dev in trusted image storage tests Timeouts occur (e.g. `create_container_timeout` and `wait_time`) when using qemu-coco-dev. This commit increases these timeouts for the trusted image storage test cases Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Hyounggyu Choi	9cff9271bc	tests: Run all commands in _loop_device() using exec_host() If the host running the tests is different from the host where the cluster is running, the _loop_device() functions do not work as expected because the device is created on the test host, while the cluster expects the device to be local. This commit ensures that all commands for the relevant functions are executed via exec_host() so that a device should be handled on a cluster node. Additionally, it modifies exec_host() to return the exit code of the last executed command because the existing logic with `kubectl debug` sometimes includes unexpected characters that are difficult to handle. `kubectl exec` appears to properly return the exit code for a given command to it. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Hyounggyu Choi	374b8d2534	tests: Create and delete node debugger pod only once Creating and deleting a node debugger pod for every `exec_host()` call is inefficient. This commit changes the test suite to create and delete the pod only once, globally. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Hyounggyu Choi	aedf14b244	tests: Mimic node debugger with full privileges This commit addresses an issue with handling loop devices via a node debugger due to restricted privileges. It runs a pod with full privileges, allowing it to mount the host root to `/host`, similar to the node debugger. This change enables us to run tests for trusted image storage using the `qemu-coco-dev` runtime class. Fixes: #10133 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Alex Lyn	63b25e8cb0	runtime-rs: Introduce cdi devices in container creation Fixes #10145 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-09-20 09:28:51 +08:00
Alex Lyn	03735d78ec	runtime-rs: add cdi devices definition and related methods Add cdi devices including ContainerDevice definition and annotation_container_device method to annotate vfio device in OCI Spec annotations which is inserted into Guest with its mapping of vendor-class and guest pci path. Fixes #10145 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-09-20 09:28:51 +08:00
Alex Lyn	020e3da9b9	runtime-rs: extend DeviceVendor with device class We need a vfio device's device, vendor, and class properties, but we can currently only get device and vendor. Extending it with class is sufficient. Fixes #10145 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-09-20 09:28:51 +08:00
Fabiano Fidêncio	77c844da12	Merge pull request #10239 from fidencio/topic/remove-acrn acrn: Drop support	2024-09-19 23:10:29 +02:00
GabyCT	6eef58dc3e	Merge pull request #10336 from GabyCT/topic/extendtimeout gha: Increase timeout to run k8s tests on TDX	2024-09-19 13:12:55 -06:00
Martin	b9d88f74ed	tools.kata-webhook: Specify runtime class using configMap The kata webhook requires a configmap to define what runtime class it should set for the newly created pods. Additionally, the configmap allows others to modify the default runtime class name we wish to set (in case the handler is kata but the name of the runtimeclass is different). Finally, this PR changes the webhook-check to compare the runtime of the newly created pod against the specific runtime class in the configmap; if said configmap doesn't exist, then it will default to "kata". Signed-off-by: Martin <mheberling@microsoft.com>	2024-09-19 11:51:38 -07:00
Fabiano Fidêncio	51dade3382	docs: Fix spell checker tokio is not a valid word, it seems, so let's use `tokio`. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-19 20:25:21 +02:00
Gabriela Cervantes	49b3a0faa3	gha: Increase timeout to run k8s tests on TDX This PR increases the timeout to run k8s tests for Kata CoCo TDX to avoid the random failures of timeout. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-19 17:15:47 +00:00
Fabiano Fidêncio	31438dba79	docs: Fix qemu link Otherwise static checks will fail, as we woke up the dogs with changes on the same file. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-19 16:05:43 +02:00
Fabiano Fidêncio	fefcf7cfa4	acrn: Drop support As we don't have any CI, nor maintainer to keep ACRN code around, we better have it removed than give users the expectation that it should or would work at some point. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-19 16:05:43 +02:00
Fabiano Fidêncio	cdaaf708a1	Merge pull request #10334 from emanuellima1/bump-version release: Bump version to 3.9.0	2024-09-19 15:27:50 +02:00
Emanuel Lima	a6ee15c5c7	release: Bump VERSION to 3.9.0 Starting the v3.9.0 release Signed-off-by: Emanuel Lima <emlima@redhat.com>	2024-09-19 10:14:55 -03:00
Fabiano Fidêncio	e9593b53a4	Merge pull request #10234 from pmores/add-support-for-disabled-guest-selinux runtime-rs: add support for disabled guest selinux	2024-09-19 15:03:24 +02:00
Fabiano Fidêncio	4d11fecc2d	Merge pull request #10274 from ajaypvictor/remote_image-os_types runtime: Enable Image annotation for remote hypervisor	2024-09-19 13:39:20 +02:00
Fabiano Fidêncio	3d5f48e02e	Merge pull request #10283 from alexman-stripe/alexman-stripe/fix-kata-shim-not-reporting-inactive-file-cgroup-v2 shim: Fix memory usage reporting for cgroup v2	2024-09-19 12:50:36 +02:00
Pavel Mores	5e5eb9759f	runtime-rs: handle disabled guest selinux in virtiofsd This is just a port of functionality existing in the golang runtime. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-09-19 12:47:10 +02:00
Pavel Mores	8c92f3bfec	runtime-rs: enable/disable selinux in guest based on disable_guest_selinux This change technically affects the path for enabled guest selinux as well, however since this is not implemented in runtime-rs anyway nothing should break. When guest selinux support is added this change will come handy. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-09-19 12:47:10 +02:00
Pavel Mores	204ee21bc8	runtime-rs: handle disabled guest selinux in OCI spec If guest selinux is off the runtime has to ensure that container OCI spec contains no selinux labels for the container rootfs and process. Failure to do so causes kata agent to try and apply the labels which fails since selinux is not enabled in guest, which in turn causes container launch to fail. This is largely inspired by golang runtime() with a slight deviation in ordering of checks. This change simply checks the disable_guest_selinux config setting and if it's true it clears both rootfs and process label if necessary. Golang runtime, on the other hand, seems to first check if process label is non-empty and only then it checks the config setting, meaning that if process label is empty the rootfs label is not reset even if it's non-empty. Frankly, this looks like a potential bug though probably unlikely to manifest since it can be assumed that the labels are either both empty, or both non-empty. () `4fd4b02f2e/src/runtime/virtcontainers/kata_agent.go (L1005)` Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-09-19 12:47:10 +02:00
Pavel Mores	eb1227f47d	runtime-rs: parse the disable_guest_selinux config key In order to handle the setting we have to first parse it and make its value available to the rest of the program. The yes() function is added to comply with serde which seems to insist on default values being returned from functions. Long term, this is surely not the best place for this function to live, however given that this is currently the first and only place where it's used it seems appropriate to put it near its use. If it ends up being reused elsewhere a better place will surely emerge. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-09-19 12:47:10 +02:00
Steve Horsman	8789551fe6	Merge pull request #10333 from fidencio/topic/ci-bump-ubuntu-20.04-runners-to-22.04 ci: Bump ubuntu 20.04 runners to 22.04	2024-09-19 11:44:33 +01:00
Fabiano Fidêncio	35c7f8d1ba	ci: Bump ubuntu 20.04 runners to 22.04 Azure internal mirrors for Ubuntu 20.04 have gone awry, leading to a situation where dependencies cannot be installed (such as libdevmapper-dev), blocking then our CI. Let's bump the runners to 22.04 regardless, even knowing it'll cause an issue with the runk tests, as the agent check tests are considered more crucial to the project at this point. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-19 12:29:20 +02:00
Fabiano Fidêncio	eccdffebf7	Merge pull request #10243 from katexochen/nydus-overlayfs-path virtcontainers: allow specifying nydus-overlayfs binary by path	2024-09-19 11:35:45 +02:00
Ajay Victor	a19f2eacec	runtime: Enable ImageName annotation for remote hypervisor Enables ImageName to support multiple VM images in remote hypervisor scenario Fixes https://github.com/kata-containers/kata-containers/issues/10240 Signed-off-by: Ajay Victor <ajvictor@in.ibm.com>	2024-09-19 14:48:46 +05:30
Alex Man	27f8f69195	shim: Fix memory usage reporting for cgroup v2 kata-shim was not reporting `inactive_file` in memory stat. This memory is deducted by containerd when calculating the size of container working set, as it can be paged out by the operating system under memory pressure. Without reporting `inactive_file`, containerd will over report container memory usage. [Here](https://github.com/containerd/containerd/blob/v1.7.22/pkg/cri/server/container_stats_list_linux.go#L117) is where containerd deducts `inactive_file` from memory usage. Note that kata-shim correctly reports `total_inactive_file` for cgroup v1, but this was not implemented for cgroup v2. This commit: - Adds code in kata-shim to report "inactive_file" memory for cgroup v2 - Implements reporting of all available cgroup v2 memory stats to containerd - Uses defensive coding to avoid assuming existence of any memory.stat fields The list of available cgroup v2 memory stats defined by containerd can be found [here](https://pkg.go.dev/github.com/containerd/cgroups/v2/stats#MemoryStat). Fixes #10280 Signed-off-by: Alex Man <alexman@stripe.com>	2024-09-18 14:04:24 -07:00
Fabiano Fidêncio	1597f8ba00	Merge pull request #10279 from alexman-stripe/alexman-stripe/fix-cgroup-v2-wrong-cpu-usage-unit agent: Fix CPU usage reporting for cgroup v2 in kata-agent	2024-09-18 21:36:52 +02:00
Fabiano Fidêncio	593cbb8710	Merge pull request #10306 from microsoft/danmihai1/more-security-contexts genpolicy: get UID from PodSecurityContext	2024-09-18 21:33:39 +02:00
Aurélien Bombo	5402f2c637	Merge pull request #10308 from Sumynwa/sumsharma/add_setpolicy_agent_ctl agent-ctl: Add SetPolicy support	2024-09-18 10:09:07 -07:00
Pawel Proskurnicki	b63d49b34a	ci: don't require sudo for yq if already installed The yq installation shouldn't force the use of sudo if yq is already installed at the correct version. Signed-off-by: Pawel Proskurnicki <pawel.proskurnicki@intel.com>	2024-09-18 11:01:07 +02:00
Sumedh Alok Sharma	18c887f055	agent-ctl: Add SetPolicy support This patch adds support to call kata agents SetPolicy API. Also adds tests for SetPolicy API using agent-ctl. Fixes #9711 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-18 10:53:49 +05:30
GabyCT	28d430ec42	Merge pull request #10324 from GabyCT/topic/fixinlib ci: Fix indentation of install libseccomp script	2024-09-17 14:21:24 -06:00
Fabiano Fidêncio	da2377346d	Merge pull request #10323 from stevenhorsman/update-kubectl-release-url kata-deploy: Switch Kubernetes URL	2024-09-17 20:47:17 +02:00
Gabriela Cervantes	096f32cc52	ci: Fix indentation of install libseccomp script This PR fixes the indentation of the install libseccomp script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-17 16:38:53 +00:00
Aurélien Bombo	9d29ce460d	Merge pull request #10303 from Sumynwa/sumsharma/agent_policy_set_env agent: add support to provide default agent policy via env	2024-09-17 09:04:11 -07:00
stevenhorsman	c0d35a66aa	ci: kata-deploy: Update kubectl install URL The `deploy_k0s` and `deploy_k3s` kubectl installs aren't failing yet, but let's get ahead of this and bump them as well Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-17 15:35:42 +01:00
stevenhorsman	1abeffdac6	kata-deploy: Switch Kubernetes URL The payload build is failing with: ``` ERROR: failed to solve: process "/bin/sh -c apk --no-cache add bash curl && ARCH=$(uname -m) && if [ \"${ARCH}\" = \"x86_64\" ]; then ARCH=amd64; fi && if [ \"${ARCH}\" = \"aarch64\" ]; then ARCH=arm64; fi && DEBIAN_ARCH=${ARCH} && if [ \"${DEBIAN_ARCH}\" = \"ppc64le\" ]; then DEBIAN_ARCH=ppc64el; fi && curl -fL --progress-bar -o /usr/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/ \ $(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/${ARCH}/kubectl && chmod +x /usr/bin/kubectl && curl -fL --progress-bar -o /usr/bin/jq https://github.com/jqlang/jq/releases/download/jq-1.7.1/jq-linux-${DEBIAN_ARCH} && chmod +x /usr/bin/jq && mkdir -p ${DESTINATION} && tar xvf ${WORKDIR}/${KATA_ARTIFACTS} -C ${DESTINATION} && rm -f ${WORKDIR}/${KATA_ARTIFACTS} && apk del curl && apk --no-cache add py3-pip && pip install --no-cache-dir yq==3.2.3" did not complete successfully: exit code: 22 ``` Looking into this, the problem is that https://storage.googleapis.com/kubernetes-release/release/v1.31.1/bin/linux/amd64/kubectl doesn't exist. The [kubectl install doc](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/#install-kubectl-on-linux) recommends the `dl.k8s.io` site, so let's switch to this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-17 15:35:42 +01:00
Steve Horsman	5448f7fbbf	Merge pull request #10321 from BbolroC/fix-build-boot-image-se local-build: Fix unbound variable issue for lib_se.sh	2024-09-17 15:35:04 +01:00
Hyounggyu Choi	72471d1a18	local-build: Fix unbound variable for lib_se.sh As #10315 introduced an `unbound variable` error, this is a hot-fix for it. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-17 10:01:14 +02:00
Hyounggyu Choi	72df3004e8	gha: Rebase build-secure-image-se atop of latest target branch This commit adds a step called `Rebase atop of the latest target branch` to the job named `build-asset-boot-image-se` which can test the PR properly. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-17 09:54:51 +02:00
Hyounggyu Choi	03cd02a006	Merge pull request #10315 from BbolroC/update-ibm-se-doc doc: Update how-to-run-kata-containers-with-SE-VMs.md	2024-09-16 15:12:18 +02:00
Sumedh Alok Sharma	cefba08903	agent: add support to provide default agent policy via env agent built with policy feature initializes the policy engine using a policy document from a default path, which is installed & linked during UVM rootfs build. This commit adds support to provide a default agent policy as environment variable. This targets development/testing scenarios where kata-agent is wanted to be started as a local process. Fixes #10301 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-16 18:05:21 +05:30
Hyounggyu Choi	8d609e47fb	doc: Update how-to-run-kata-containers-with-SE-VMs.md The following changes have been made: - Remove unnecessary `sudo` - Add an error message where an incorrect host key document is used - Add a missing artifact `kernel-confidential-modules` - Make a variable `kernel_version` and replace it with relevant hits Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-16 12:53:30 +02:00
Fabiano Fidêncio	fc5a631791	Merge pull request #10009 from Xynnn007/feat-cosign Merge to main: supporting pull cosign signed images	2024-09-16 12:08:26 +02:00
stevenhorsman	aa9f21bd19	test: Add support for s390x in cosign testing We've added s390x test container images, so add support for using them based on the arch the test is running on Fixes: #10302 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-16 09:20:57 +01:00
stevenhorsman	3087ce17a6	tests: combined pod yaml creation for CoCo tests This commit brings some public parts of image pulling test series like encrypted image pulling, pulling images from authenticated registry and image verification. This would help to reduce the cost of maintenance. Co-authored-by: stevenhorsman <steven@uk.ibm.com> Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-16 09:20:57 +01:00
Xynnn007	c80c8d84c3	test: add cosign signature verification tests Close #8120 Case 1 Create a pod from an unsigned image, on an insecureAcceptAnything registry works. Image: quay.io/prometheus/busybox:latest Policy rule: ``` "default": [ { "type": "insecureAcceptAnything" } ] ``` Case 2 Create a pod from an unsigned image, on a 'restricted registry' is rejected. Image: ghcr.io/confidential-containers/test-container-image-rs:unsigned Policy rule: ``` "quay.io/confidential-containers/test-container-image-rs": [ { "type": "sigstoreSigned", "keyPath": "kbs:///default/cosign-public-key/test" } ] ``` Case 3 Create a pod from a signed image, on a 'restricted registry' is successful. Image: ghcr.io/confidential-containers/test-container-image-rs:cosign-signed Policy rule: ``` "ghcr.io/confidential-containers/test-container-image-rs": [ { "type": "sigstoreSigned", "keyPath": "kbs:///default/cosign-public-key/test" } ] ``` Case 4 Create a pod from a signed image, on a 'restricted registry', but with the wrong key is rejected Image: ghcr.io/confidential-containers/test-container-image-rs:cosign-signed-key2 Policy: ``` "ghcr.io/confidential-containers/test-container-image-rs": [ { "type": "sigstoreSigned", "keyPath": "kbs:///default/cosign-public-key/test" } ] ``` Case 5 Create a pod from an unsigned image, on a 'restricted registry' works if enable_signature_verification is false Image: ghcr.io/kata-containers/confidential-containers:unsigned image security enable: false Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-16 09:20:57 +01:00
Xynnn007	9606e7ac8b	agent: Set image-rs image security policy Add two parameters for enabling cosign signature image verification. - `enable_signature_verification`: to activate signature verification - `image_policy`: URI of the image policy config Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-09-16 09:20:57 +01:00
Xynnn007	653bc3973f	agent: fix make test for kata-agent of dependency anyhow new version of the anyhow crate has changed the backtrace capture thus unit tests of kata-agent that compares a raised error with an expected one would fail. To fix this, we need only panics to have backtraces, thus set `RUST_BACKTRACE=1` and `RUST_LIB_BACKTRACE=0` for tests due to document https://docs.rs/anyhow/latest/anyhow/ Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-09-16 09:20:57 +01:00
Fabiano Fidêncio	dfcb41b5cc	Merge pull request #10313 from stevenhorsman/coco-components-0.10-bump CoCo: Bump Coco components to 0.10 releases	2024-09-14 21:43:28 +02:00
stevenhorsman	705e469696	rootfs: Change initrd alpine mirror The rootfs-initrd build is failing with: ``` fetch https://mirror.math.princeton.edu/pub/alpinelinux//v3.18/main/aarch64/APKINDEX.tar.gz 6684368:error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1914: ERROR: https://mirror.math.princeton.edu/pub/alpinelinux//v3.18/main: Permission denied ``` so try bumping to a newer version of alpine to see if that helps the issue Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-14 18:47:45 +02:00
Dan Mihai	5777869cf4	tests: k8s-policy-rc: add unexpected UID test Change pod runAsUser value of a Replication Controller after generating the RC's policy, and verify that the RC pods get rejected due to this change. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-13 22:05:31 +00:00
Dan Mihai	6773f14667	tests: k8s-policy-job: add unexpected UID test Change pod runAsUser value of a Job after generating the Job's policy, and verify that the Job gets rejected due to this change. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-13 22:05:31 +00:00
Dan Mihai	124f01beb3	tests: k8s-policy-deployment: add bad UID test Change pod runAsUser value of a Deployment after generating the Deployment's policy, and verify that the Deployment fails due to this change. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-13 22:05:31 +00:00
Dan Mihai	16f5ebf5f9	genpolicy: get UID from PodSecurityContext Get UID from PodSecurityContext for other k8s resource types too, not just for Pods. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-13 22:05:31 +00:00
Dan Mihai	5badc30a69	Merge pull request #10316 from microsoft/danmihai1/k8s-inotify tests: k8s-inotify: pod termination polling	2024-09-13 15:02:38 -07:00
GabyCT	6f363bba18	Merge pull request #10304 from GabyCT/topic/fixcricont tests: Fix indentation in the cri containerd tests	2024-09-13 14:49:12 -06:00
Dan Mihai	d3127af9c5	tests: k8s-inotify: pod termination polling Poll/wait for pod termination instead of sleeping 2 minutes. This change typically saves ~90 seconds in my test cluster. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-13 17:12:55 +00:00
sidney chang	5a7d0ed3ad	runtime-rs: introduce tap in hypervisor by extracting it from dragonball It's a prerequisite PR to make built-in vmm dragonball compilation options configurable. Extract TAP device-related code from dragonball's dbs_utils into a separate library within the runtime-rs hypervisor module. To enhance functionality and reduce dependencies, the extracted code has been reimplemented using the libc crate and the ifreq structure. Fixes #10182 Signed-off-by: sidney chang <2190206983@qq.com>	2024-09-13 07:32:14 -07:00
Fabiano Fidêncio	b09eba8c46	Merge pull request #10309 from BbolroC/helm-install-with-retry tests: Introduce retry mechanism for helm install	2024-09-13 15:08:46 +02:00
stevenhorsman	00e657cdb7	agent: image-rs: Update to v0.10.0 release Update image-rs to use the latest release of guest-components Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-13 13:29:54 +01:00
stevenhorsman	5e03890562	versions: Bump trustee and guest-components Bump to the v0.10.1 release of trustee and v0.10.0 release of guest-components Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-13 13:28:54 +01:00
Hyounggyu Choi	0aae847ae5	tests: Update secure boot image verification for IBM SE In the latest `s390-tools`, there has been an update on how to verify a secure boot image. A host key revocation list (CRL), which was optional, now becomes mandatory for verification. This commit updates the relevant scripts and documentation accordingly. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-13 14:14:02 +02:00
Hyounggyu Choi	4c933a5611	tests: Introduce retry mechanism for helm install Kata-deploy often fails due to a transiently unreachable k8s cluster for the qemu-coco-dev test on s390x. (e.g. https://github.com/kata-containers/kata-containers/actions/runs/10831142906/job/30058527098?pr=10009) This commit introduces a retry mechanism to mitigate these failures by retrying the command two more times with a 10-second interval as a workaround. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-13 14:03:44 +02:00
Dan Mihai	e937cb1ded	Merge pull request #10291 from microsoft/danmihai1/user-name-to-uid genpolicy: fix and re-enable create container UID verification	2024-09-12 15:47:59 -07:00
Dan Mihai	0c5ac042e7	tests: k8s-policy-pod: add workaround for #10297 If the CI platform being tested doesn't support yet the prometheus container image: - Use busybox instead of prometheus. - Skip the test cases that depend on the prometheus image. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-12 18:26:38 +00:00
Gabriela Cervantes	0346b32a90	tests: Fix indentation in the cri containerd tests This PR fixes the indentation in the cri containerd tests as we have in several places a misalignment in the script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-12 16:18:34 +00:00
Dan Mihai	94d95fc055	tests: k8s-policy-pod: test container UID changes Add test cases for changing container UID after generating the policy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-11 22:38:20 +00:00
Dan Mihai	db1ca4b665	tests: k8s-policy-pod: remove UID workaround Remove the workaround for #9928, now that genpolicy is able to convert user names from container images into the corresponding UIDs from these images. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-11 22:38:20 +00:00
Dan Mihai	d2d8d2e519	genpolicy: remove default UID/GID values Remove the recently added default UID/GID values, because the genpolicy design is to initialize those fields before this new code path gets executed. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-11 22:38:20 +00:00
Hernan Gatta	871476c3cb	genpolicy: pull UID:GID values from /etc/passwd Some container images are configured such that the user (and group) under which their entrypoint should run is not a number (or pair of numbers), but a user name. For example, in a Dockerfile, one might write: > USER 185 indicating that the entrypoint should run under UID=185. Some images, however, might have: > RUN groupadd --system --gid=185 spark > RUN useradd --system --uid=185 --gid=spark spark > ... > USER spark indicating that the UID:GID pair should be resolved at runtime via /etc/passwd. To handle such images correctly, read through all /etc/passwd files in all layers, find the latest version of it (i.e., the top-most layer with such a file), and, in so doing, ensure that whiteouts of this file are respected (i.e., if one layer adds the file and some subsequent layer removes it, don't use it). Signed-off-by: Hernan Gatta <hernan.gatta@opaque.co>	2024-09-11 22:38:20 +00:00
Hernan Gatta	f9249b4476	genpolicy: add tar dependency Used to read /etc/passwd from tar files. Signed-off-by: Hernan Gatta <hernan.gatta@opaque.co>	2024-09-11 22:38:20 +00:00
Dan Mihai	eb7f747df1	genpolicy: enable create container UID verification Disabling the UID Policy rule was a workaround for #9928. Re-enable that rule here and add a new test/CI temporary workaround for this issue. This new test workaround will be removed after fixing #9928. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-11 22:38:20 +00:00
Dan Mihai	71ede4ea3f	tests: k8s-policy-pod: use prometheus container Change quay.io/prometheus/busybox to quay.io/prometheus/prometheus in this test. The prometheus image will be helpful for testing the future fix for #9928 because it specifies user = "nobody". Also, change: sh -c "ls -l /" to: echo -n "readinessProbe with space characters" as the test readinessProbe command line. Both include a command line argument containing space characters, but "sh -c" behaves differently when using the prometheus container image (causes the readinessProbe to time out, etc.). Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-11 22:38:20 +00:00
GabyCT	614328f342	Merge pull request #10295 from GabyCT/topic/removeimgvar metrics: Remove unused remove img var in common script	2024-09-11 15:02:39 -07:00
GabyCT	095c5ed961	Merge pull request #10289 from GabyCT/topic/enablestresst tests: Enable stressng k8s stability test for Kata CoCo CI	2024-09-11 10:47:33 -07:00
Fabiano Fidêncio	97ecdabde9	Merge pull request #10294 from fidencio/topic/bring-ita-support Bump guest-components / trustee to a version that supports ITA	2024-09-11 19:45:48 +02:00
Gabriela Cervantes	fdaf12d16c	metrics: Remove unused remove img var in common script This PR removes the remove_img variable in the metrics common script as it is not being used. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-11 17:45:18 +00:00
Gabriela Cervantes	04d1122a46	tests: Decrease iterations in soak test This PR decreases the number of iterations in the kubernetes soak test as this is already taking more than 2 hours for the kata coco ci stability. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-11 17:39:06 +00:00
Gabriela Cervantes	c48c6f974e	tests: Enable stressng k8s stability test for Kata CoCo CI This PR enables the stressng k8s stability test for Kata CoCo CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-11 17:38:13 +00:00
Alex Man	7e400f7bb2	agent: Fix CPU usage reporting for cgroup v2 in kata-agent kata-agent incorrectly reports CPU time for cgroup v2, causing 1000x underreporting. For cgroup v2, kata-agent reads the cpu.stat file, which reports the time consumed by the processes in the cgroup in µs. However, there was a bug in kata-agent where it returned this value in µs without converting it to ns. This commit adds the necessary µs to ns conversion for cgroup v2, aligning it with v1 behavior and kata-shim's expectations. This fixes #10278 Signed-off-by: Alex Man <alexman@stripe.com>	2024-09-11 10:29:03 -07:00
Fabiano Fidêncio	1178fe20e9	tests: Adapt error parser for failed image decryption With an older version of image-rs, we were getting the following error: ``` Message: failed to create containerd task: failed to create shim task: failed to handle layer: failed to get decrypt key no suitable key found for decrypting layer key: ``` However, with the version of image-rs we are bumping to, the error comes as: ``` Message: failed to create containerd task: failed to create shim task: failed to handle layer: failed to get decrypt key Caused by: no suitable key found for decrypting layer key: keyprovider: failed to unwrap key by ttrpc ``` Due to this change, I'm splitting the check in two different ones. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-11 17:07:56 +02:00
Dan Mihai	66dda37877	Merge pull request #10271 from Sumynwa/sumsharma/agent_ctl_issue_9689_local agent-ctl: Refactor CopyFile Handler	2024-09-11 07:35:09 -07:00
Fabiano Fidêncio	f6cfc33314	Merge pull request #10292 from fidencio/topic/ci-tdx-adapt-how-we-get-the-host-ip ci: tdx: Adapt how we get the host IP	2024-09-11 14:42:22 +02:00
Fabiano Fidêncio	e2200f0690	versions: trustee: Update to a version that supports ITA ITA stands for Intel Trust Authority, which is in the process of being renamed to ITTS (Intel Tiber Trust Services). Proper ITA / ITTS support on Trustee was finished as part of: * `6f767fa15f` Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-11 13:39:35 +02:00
Fabiano Fidêncio	d3e3ee7755	versions: guest-components: Update to a version that supports ITA ITA stands for Intel Trust Authority, which is in the process of being renamed to ITTS (Intel Tiber Trust Services). As we've bumped guest-components on trustee, let's make sure we also bump image-rs to the commit that brings ITA support in: * https://github.com/confidential-containers/guest-components/commit/1db6c3a87665dde58d0efa56f4e4af5fc Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-11 13:36:56 +02:00
Fabiano Fidêncio	f94d80783d	agent: image-rs: Update to a version that supports ITA ITA stands for Intel Trust Authority, which is in the process of being renamed to ITTS (Intel Tiber Trust Services). As we've bumped guest-components on trustee, let's make sure we also bump image-rs to the commit that brings ITA support in: * `1db6c3a876` The reason we need to bump the dependency here is to avoid kbs_protocol mismatch between the version used by the agent and the trustee one. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-11 13:36:46 +02:00
Fabiano Fidêncio	3946aa7283	ci: tdx: Adapt how we get the host IP In the process of switching the TDX CI machine we've noticed that `hostname -i` on one of the machines returns a single IP address, while on another machine it returns a full list of IPs. As we're only interested in the first one, let's adapt the code to always return the first one. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-11 09:31:43 +02:00
Sumedh Alok Sharma	b4bbbf65c6	ci: Do not start CDH/attestation procs with kata-agent as local process. Since CDH/attestation related processes and its dependencies are not fully available, the setup fails to start kata-agent as local process. This fix removes these procs to prevent kata-agent from trying to start them. Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-11 11:53:59 +05:30
Sumedh Alok Sharma	8045a7a2ba	ci: Install policy document on host to run kata-agent as local process. The test setup starts kata-agent as a local process without the UVM. The agent policy initialization fails due to missing policy document at `/etc/kata-opa/default-policy.rego`. The fix - installs a relaxed `allow-all.rego` policy document - cleans up the install during exit Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-11 11:25:08 +05:30
Sumedh Alok Sharma	822f898433	ci: Install bats as dependencies Install bats as part of dependencies for running the tests. Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-11 10:57:15 +05:30
Sumedh Alok Sharma	2c774fb207	ci: Add tests for CopyFile api. This commit introduces test cases for testing CopyFile API using kata-agent-ctl with improved command semantics and handling. - copy a file to /run/kata-containers - copy symlink to /run/kata-containers - copy directory to /run/kata-containers - copy file to /tmp - copy large file to /run/kata-containers Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-11 10:54:01 +05:30
Sumedh Alok Sharma	2af1113426	agent-ctl: Refactor CopyFile handler In the existing implementation for the CopyFile subcommand, - cmd line argument list is too long, including various metadata information. - in case of a regular file, passing the actual data as bytes stream adds to the size and complexity of the input. - the copy request will fail when the file size exceeds that of the allowed ttrpc max data length limit of 4Mb. This change refactors the CopyFile handler and modifies the input to a known 'source' 'destination' syntax. Fixes #9708 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-11 10:54:01 +05:30
Alex Lyn	d0968032f7	Merge pull request #10276 from Apokleos/fix-runtime-cdi runtime: Fix runtime/cdi panic with assignment to entry in nil map	2024-09-11 09:00:11 +08:00
Alex Lyn	3f541aff4a	Merge pull request #10282 from teawater/dup runtime-rs: configuration-dragonball.toml.in: Remove duplication	2024-09-10 11:46:40 +08:00
Hui Zhu	dfea12bc53	runtime-rs: configuration-dragonball.toml.in: Remove duplication Remove duplicated description of enable_balloon_f_reporting from configuration-dragonball.toml.in. Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-09-10 07:34:29 +08:00
David Esparza	6f8897249b	Merge pull request #10277 from GabyCT/topic/fixsk tests: Increase timeout to wait for soak stability test deployment	2024-09-09 14:07:10 -06:00
Gabriela Cervantes	5a52fe1a75	tests: Increase timeout to wait for soak stability test deployment This PR increases the timeout to wait that the deployment for the soak stability test is ready in order to avoid random failures saying that the deployment is not ready yet. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-09 16:13:40 +00:00
Alex Lyn	1684c1962c	runtime: Fix runtime/cdi panic with assignment to entry in nil map It will panic when users do GPU vfio passthrough with cdi in runtime. The root cause is that CustomSpec.Annotations is nil when new element added. To address this issue, initialization is introduced when it's nil. Fixes #10266 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-09-09 20:15:10 +08:00
Alex Lyn	f31839af63	Merge pull request #10253 from teawater/enable_balloon_f_reporting Add support of dragonball virtio-balloon free page reporting	2024-09-09 17:37:52 +08:00
Fabiano Fidêncio	026a4d92a9	Merge pull request #10272 from fidencio/topic/add-tdx-mrconfigid-mrowner-mrownerconfig-support runtime: qemu: tdx: Add support for setting mrconfigid / mrowner / mrownerconfig	2024-09-08 14:11:30 +02:00
Fabiano Fidêncio	51ee4c381a	Merge pull request #10257 from fidencio/topic/kata-deploy-remove-unused-vars-for-cleanup kata-deploy: Remove kata-cleanup unneeded vars	2024-09-07 11:27:14 +02:00
Chengyu Zhu	3a37652d01	Merge pull request #10213 from ChengyuZhu6/device Refine device management for kata-agent	2024-09-07 12:02:32 +08:00
ChengyuZhu6	75816d17f1	agent: switch to new device subsystem Switch to new device subsystem to handle various devices in kata-agent. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:45:43 +08:00
ChengyuZhu6	df55f37dfe	agent: Move unit tests about vfio device to vfio_device_handler Move unit tests about vfio device to vfio_device_handler. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:45:43 +08:00
ChengyuZhu6	41c2d81fd3	agent: Move unit tests about scsi device to scsi_device_handler Move unit tests about scsi device to scsi_device_handler. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:45:43 +08:00
ChengyuZhu6	f45129cb44	agent: Move unit tests about network device to network_device_handler Move unit tests about network device to network_device_handler. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:45:43 +08:00
ChengyuZhu6	52203db760	agent: Move unit tests about block device to block_device_handler Move unit tests about block device to block_device_handler. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:45:43 +08:00
ChengyuZhu6	e1afb92a28	agent: Move common unit tests about device Move common unit tests about device to mod.rs Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:45:43 +08:00
ChengyuZhu6	25bd04c02a	agent: Use DeviceHandlerManager to handle various devices Use DeviceHandlerManager to handle various devices. Fixes: #10218 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:45:42 +08:00
ChengyuZhu6	5fc645c869	agent: Move network device code to network_device_handler Move network device code to network_device_handler to simplify the code. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:40:30 +08:00
ChengyuZhu6	07f104085a	agent: Move vfio device code to vfio_device_handler Move vfio device code to vfio_device_handler to simplify the code. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:40:30 +08:00
ChengyuZhu6	0cb87767ae	agent: Move device code with virtio scsi driver to scsi_device_handler Move scsi device code to scsi_device_handler to simplify the code. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:40:30 +08:00
ChengyuZhu6	0738d75a92	agent: Move device code with nvdimm driver to nvdimm_device_handler Move device code with nvdimm driver to nvdimm_device_handler, including nvdimm device and pmem device. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:40:30 +08:00
ChengyuZhu6	bbf934161b	agent: Move virtio-block device handlers to block_device_handler Move virtio-block device handlers to block_device_handler to simplify the code. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:40:30 +08:00
ChengyuZhu6	4e33665be8	kata-types: Move device driver constants to kata-types Move device driver constants and add DeviceHandlerManager type alias. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:40:30 +08:00
ChengyuZhu6	0b3ad2f830	kata-types: Replace StorageHandlerManager with type alias Removed the `StorageHandlerManager` struct and its associated implementations and introduced a type alias `StorageHandlerManager` for `HandlerManager` to simplify the code. The new type alias maintains the same functionality while reducing redundancy. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 07:53:31 +08:00
ChengyuZhu6	281f0d7f29	kata-types: Add HandlerManager to manage registered handlers Introduced `HandlerManager` struct to manage registered handlers, which will be used for storage and device management in kata-agent. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 07:51:48 +08:00
GabyCT	b05811587e	Merge pull request #10245 from ChengyuZhu6/handler-manager agent: Refactor storage handler registration	2024-09-06 09:45:39 -06:00
GabyCT	37ddb837c4	Merge pull request #10267 from GabyCT/topic/updatemlcomments metrics: Update openVINO and oneDNN tests references	2024-09-06 09:42:21 -06:00
Fabiano Fidêncio	65a4562050	runtime: qemu: tdx: Add `omitempty` to QuoteGenerationSocket I know right now we're always passing a value for that, but this doesn't really have to be set unless attestation is used. Thus, let's also omit it in case it's empty. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-06 15:05:55 +02:00
Fabiano Fidêncio	7818484120	runtime: qemu: tdx: Support mrconfigid / mrowner/ mrownerconfig This is a quick and simple pre-req for supporting initData, which will take advantage of the mrconfigid in the TDX case. While already adding mrconfigid, which is hardcoded empty right now, let's do the same for mrowner and mrownerconfig, and leave it prepared for future expansions. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-06 15:05:54 +02:00
Fabiano Fidêncio	8285957678	runtime: qemu: Rename prepareObjectWithTDXQgs to prepareTDXObject The reason we're relying on yet another function to do so is because the TDX object will be used in its qom / qapi json format. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-06 14:36:09 +02:00
Fabiano Fidêncio	29ce2205a1	Merge pull request #10268 from microsoft/saulparedes/pdb-support genpolicy: add support for PodDisruptionBudget yaml	2024-09-06 09:53:36 +02:00
Dan Mihai	1885478e2e	Merge pull request #10270 from Sumynwa/sumsharma/enable_agent_tests_in_ci ci: Enable kata agent API tests	2024-09-05 14:24:49 -07:00
Archana Choudhary	f2625b0014	genpolicy: add support for PodDisruptionBudget yaml Prevent panic for PDB specs Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-09-05 11:33:47 -07:00
Sumedh Alok Sharma	e1ac2f4416	ci: Enable kata agent api tests This commit enables running tests for kata agent apis. The 'api-tests' directory will contain bats test files for individual APIs. Fixes #10269 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-06 00:02:55 +05:30
GabyCT	4b257bcbb6	Merge pull request #10255 from Sumynwa/sumsharma/metrics_ci_kill_kata_components ci: send SIGKILL to kill kata components	2024-09-05 12:04:57 -06:00
Aurélien Bombo	cc9aeee81a	Merge pull request #10263 from Sumynwa/sumsharma/add_ci_workflow ci: Add workflow to run kata-agent api tests using kata-agent-ctl	2024-09-05 09:32:34 -07:00
Dan Mihai	7ab95b56f1	Merge pull request #10251 from microsoft/saulparedes/support_readonly_hostpath genpolicy: support readonly hostpath	2024-09-05 09:27:15 -07:00
GabyCT	deb6d12ff6	Merge pull request #10237 from GabyCT/topic/k8soakcoco tests: Enable k8s soak stability test for Kata CoCo CI	2024-09-05 09:56:48 -06:00
Gabriela Cervantes	fcc35dd3a7	metrics: Update openVINO and oneDNN tests references This PR updates the machine learning tests references or urls for the openVINO and oneDNN scripts as currently they are referring to a different performance benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-05 15:39:21 +00:00
GabyCT	bb5d8bbcb5	Merge pull request #10229 from GabyCT/topic/ufcv versions: Update firecracker version to 1.8.0	2024-09-05 09:19:36 -06:00
Fabiano Fidêncio	70491ff29f	Merge pull request #10244 from BbolroC/turn-on-kbs-qemu-coco-dev-s390x gha: Turn on KBS for qemu-coco-dev on s390x	2024-09-05 13:02:42 +02:00
Sumedh Alok Sharma	ad66f4dfc9	ci: Add workflow to run kata-agent api tests using kata-agent-ctl enable CI to add test cases for testing kata-agent APIs. This commit introduces: - a workflow to run tests - setup scripts to prepare the test environment Fixes #10262 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-05 14:38:29 +05:30
Saul Paredes	24c2d13fd3	genpolicy: support readonly emptyDir mount Set emptyDir access based on volume mount readOnly value Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-09-04 15:05:44 -07:00
Saul Paredes	36a4104753	genpolicy: support readonly hostpath Set hostpath access based on volume mount readOnly value Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-09-04 14:55:22 -07:00
Fabiano Fidêncio	7d048f5963	Merge pull request #10254 from fidencio/topic/remove-amd-specific-warning-from-non-amd-systems runtime: Don't error out about SNP cert path on non SNP platforms	2024-09-04 23:42:32 +02:00
Fabiano Fidêncio	d44d66ddf6	kata-deploy: Remove kata-cleanup unneeded vars As kata-cleanup will only call `reset_runtime()`, there's absolutely no need to export the other set of environment variables in its yaml file. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-04 19:09:02 +02:00
Steve Horsman	f66e8c41a1	Merge pull request #10250 from squarti/remote-machine-type-default runtime: fix bad default machine_type for remote hypervisor	2024-09-04 17:34:04 +01:00
Sumedh Alok Sharma	4025468e27	ci: send SIGKILL to kill kata components metrics tests sometimes fail with kata components still running. sending SIGKILL and waiting for the processes to reap. Fixes #8651 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-04 18:58:17 +05:30
Fabiano Fidêncio	b10256a7ca	runtime: Don't error out about SNP cert path on non SNP platforms This error is specific to SNP platforms, so let's make sure we only error this out when an SNP platform is used. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-04 11:54:52 +02:00
Hui Zhu	447a7feccf	runtime-rs: configuration-dragonball.toml.in: Add config for balloon Add enable_balloon_f_reporting config to configuration-dragonball.toml.in. Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-09-04 17:25:38 +08:00
Hui Zhu	9c1b5238b3	kernel/configs: Add balloon and f_reporting to dragonball-experimental Add CONFIG_PAGE_REPORTING, CONFIG_BALLOON_COMPACTION and CONFIG_VIRTIO_BALLOON to dragonball-experimental configs to open dragonball function and free page reporting function. Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-09-04 17:25:30 +08:00
Hui Zhu	ad9968ce2d	runtime-rs: Add enable_balloon_f_reporting for dragonball Under normal circumstances, the virtual machine only requests memory from the host and does not actively release it back to host when it is no longer needed, leading to a waste of memory resources. Free page reporting is a sub-feature of virtio-balloon. When this feature is enabled, the Linux guest kernel will send information about released pages to dragonball via virtio-balloon, and dragonball will then release these pages. This commit adds an option enable_balloon_f_reporting to runtime-rs. When this option is enabled, runtime-rs will insert a virtio-balloon device with the f_reporting option enabled during the Dragonball virtual machine startup. Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-09-04 16:38:13 +08:00
Fabiano Fidêncio	13517cf9c1	Merge pull request #10192 from fidencio/topic/helm-add-post-delete-job helm: Several fixes, including some reasonable re-work on kata-deploy.sh script	2024-09-04 09:34:57 +02:00
Paul Meyer	3be719c805	virtcontainers: allow specifying nydus-overlayfs binary by path ...or by using a binary with additional suffix. This allows having multiple versions of nydus-overlayfs installed on the host, telling nydus-snapshotter which one to use while still detecting Nydus is used. Signed-off-by: Paul Meyer <49727155+katexochen@users.noreply.github.com>	2024-09-04 08:29:40 +02:00
Chengyu Zhu	f0066568eb	Merge pull request #10233 from ChengyuZhu6/cdh-instance agent:cdh: Refactor CDHClient usage and initialization	2024-09-04 13:34:36 +08:00
Silenio Quarti	9e1388728e	runtime: fix bad default machine_type for remote hypervisor Fixes: #10249 Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-09-03 20:53:19 -04:00
GabyCT	c2774b09dd	Merge pull request #10247 from GabyCT/topic/removereportm metrics: Remove metrics report for Kata Containers	2024-09-03 15:10:04 -06:00
Fabiano Fidêncio	bb9bcd886a	kata-deploy: Add reset_cri_runtime() This will help to avoid code duplication on what's needed on the helm and non-helm cases. The reason it's not been added as part of the commit which adds the post-delete hook is simply for helping the reviewer (as the diff would be less readable with this change). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-03 23:08:22 +02:00
Fabiano Fidêncio	a773797594	ci: Pass --debug to helm Just to make our lives a little bit easier. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-03 23:08:22 +02:00
Fabiano Fidêncio	64ccb1645d	helm: Add a post-delete hook Instead of using a lifecycle.preStop hook, as done when we're using the helm chart, let's add a post-delete hook to take care of properly cleaning up the node when uninstalling kata-deploy. The reason why the lifecycle.preStop hook would never work in our case is simply because each helm chart operation follows the Kubernetes "declarative" approach, meaning that an operation won't wait for its previous operation to successfully finish before being called, leading to us trying to access content that's defined by our RBAC, in an operation that was started before our RBAC was deleted, but having the RBAC being deleted before the operation actually started. Unfortunately this hook brings in some code duplication, mainly related to the RBAC parts, but that's not new as the same happens with our daemonset. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-09-03 23:08:22 +02:00
Wainer dos Santos Moschetta	3b23d62635	tests/k8s: fix wait for pods on deploy-kata action On commit `51690bc157` we switched the installation from kubectl to helm and used its `--wait` expecting the execution would continue when all kata-deploy Pods were Ready. It turns out that there is a limitation on helm install that won't wait properly when the daemonset is made of a single replica and maxUnavailable=1. In order to fix that issue, let's revert the changes partially to keep using kubectl and waitForProcess to the execution while Pods aren't Running. Fixes #10168 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-09-03 23:08:22 +02:00
Fabiano Fidêncio	40f8aae6db	Reapply "ci: make cleanup_kata_deploy really simple" This reverts commit `21f9f01e1d`, as the patches for helm are coming as part of this series. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-03 23:08:22 +02:00
Fabiano Fidêncio	cfe6e4ae71	Reapply "ci: Use helm to deploy kata-deploy" (partially) This reverts commit `36f4038a89`, as the patches for helm are coming as part of this series. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-03 23:08:22 +02:00
Fabiano Fidêncio	424347bf0e	Reapply "kata-deploy: Add Helm Chart" (partially) This reverts commit `b18c3dfce3`, as the patches for helm are coming as part of this series. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-03 23:08:22 +02:00
ChengyuZhu6	77521cc8d2	agent:cdh: introduce a function to check initialization of cdh client introduce a function to check initialization of cdh client. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-04 04:52:50 +08:00
ChengyuZhu6	07e0e843e8	agent:cdh: switch to the new method for initializing cdh client Decouple the cdh client from AgentService and refactor cdh client usage and initialization. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-04 04:51:55 +08:00
ChengyuZhu6	bc8156c3ae	agent:cdh: Refactor cdh client methods for better integration Move `unseal_env` and `secure_mount` functions on the global `CDH_CLIENT` instance to access the CDH client. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-04 04:51:54 +08:00
ChengyuZhu6	0ad35dc91b	agent:cdh: Initialize CDH client as a global asynchronous instance Introduced a global `CDH_CLIENT` instance to hold the cdh client and implemented `init_cdh_client` function to initialize the cdh client if not already set. Fixes: #10231 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-04 04:49:54 +08:00
Gabriela Cervantes	5b0ab7f17c	metrics: Remove metrics report for Kata Containers This PR removes the metrics report which is no longer being used in Kata Containers. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-03 16:11:07 +00:00
Hyounggyu Choi	1cefa48047	gha: Add necessary steps for KBS enablement The following steps are required for enabling KBS: - Set environment variables `KBS` and `KBS_INGRESS` - Uninstall and install `kbs-client` - Deploy KBS This commit adds the above steps to the existing workflow for `qemu-coco-dev`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-03 16:26:12 +02:00
Hyounggyu Choi	b0a912b8b4	tests: Enable KBS deployment for qemu-coco-dev on s390x To deploy KBS on s390x, the environment variable `IBM_SE_CREDS_DIR` must be exported, and the corresponding directory must be created. This commit enables KBS deployment for `qemu-coco-dev`, in addition to the existing `qemu-se` support on the platform. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-03 15:51:18 +02:00
Fabiano Fidêncio	057612f18f	Merge pull request #10238 from fidencio/topic/remove-stdio-test ci: Remove stdio tests	2024-09-03 14:50:46 +02:00
ChengyuZhu6	0d519162b5	agent:storage: Refactor storage handler registration - Added `driver_types` method to `StorageHandler` trait to return driver types managed by each handler. - Implemented driver_types method for all storage handlers. - Updated `STORAGE_HANDLERS` initialization to use `driver_types` for handler registration. Fixes: #10242 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-03 18:38:52 +08:00
ChengyuZhu6	e47eb0d7d4	kata-types:mount: support registering multiple IDs to a single handler - Updated the `add_handler` function in `StorageHandlerManager` to accept a slice of IDs (`&[&str]`) instead of a single ID (`&str`). This change allows a single handler to be registered for multiple storage device types. - Refactored calls to `add_handler` in `Storage` of kata-agent to use the new function, passing arrays of storage drivers instead of single driver. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-03 18:38:36 +08:00
Fabiano Fidêncio	e8657c502d	Revert "CI: Add tests for stdio" This reverts commit `704da86e9b`, as the tests never became stable to run. This was discussed and agreed with the maintainer. Conflicts: .github/workflows/basic-ci-amd64.yaml tests/integration/stdio/gha-run.sh Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-03 11:52:30 +02:00
Greg Kurz	4698235e59	Merge pull request #10204 from fidencio/topic/kata-deploy-add-installation-prefix kata-deploy: helm: Add INSTALLATION_PREFIX	2024-09-03 09:26:51 +02:00
Fabiano Fidêncio	e1d3fb8c00	Merge pull request #10236 from fidencio/topic/bump-image-rs-to-properly-handle-gzip-whiteouts agent: Update image-rs to 02af65abc	2024-09-02 21:43:19 +02:00
Fabiano Fidêncio	0cb93ed1bb	kata-deploy: helm: Add INSTALLATION_PREFIX option This will allow users to properly set the INSTALLATION_PREFIX when deploying Kata Containers. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-02 20:25:22 +02:00
Gabriela Cervantes	c2aa288498	gha: Increase time to run Kata CoCo stability tests This PR increases the time to run the Kata CoCo stability tests as these tests are designed to run for more than 2 hours. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-02 16:40:47 +00:00
Gabriela Cervantes	825cb2d22e	tests: Enable k8s soak stability test for Kata CoCo CI This PR enables the k8s soak stability test to run on the weekly Kata CoCo stability CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-02 16:30:44 +00:00
Fabiano Fidêncio	1309c49c09	agent: Update image-rs to 02af65abc As this brings in proper support to handle gzip whiteouts. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-02 14:15:04 +02:00
Fabiano Fidêncio	7be77ebee5	kata-deploy: helm: Stop mounting /opt/kata It's simply easier if we just use /host/opt/kata instead in our scripts, which will simplify a lot the logic of adding an INSTALLATION_PREFIX later on. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-02 09:38:51 +02:00
Fabiano Fidêncio	6ce5e62c48	kata-deploy: Add a $dest_dir var As we build our binaries with the `/opt/kata` prefix, that's the value of $dest_dir. Later in this series it'll become handy, as we'll introduce a way to install the Kata Containers artefacts in a different location. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-02 09:36:33 +02:00
Fabiano Fidêncio	ef5a5ea26e	Merge pull request #10038 from sprt/move-free-runner-iii ci: Transition GARM tests to free runners, pt. III	2024-08-31 01:29:08 +02:00
Gabriela Cervantes	19d8f11345	versions: Update firecracker version to 1.8.0 This PR updates the firecracker version to 1.8.0 which includes the following changes: - Added ACPI support to Firecracker for x86_64 microVMs. Currently, we pass ACPI tables with information about the available vCPUs, interrupt controllers, VirtIO and legacy x86 devices to the guest. This allows booting kernels without MPTable support. Please see our kernel policy documentation for more information regarding relevant kernel configurations. - Added support for the Virtual Machine Generation Identifier (VMGenID) device on x86_64 platforms. VMGenID is a virtual device that allows VMMs to notify guests when they are resumed from a snapshot. Linux includes VMGenID support since version 5.18. It uses notifications from the device to reseed its internal CSPRNG. Please refer to snapshot support and random for clones documentation for more info on VMGenID. VMGenID state is part of the snapshot format of Firecracker. As a result, Firecracker snapshot version is now 2.0.0. - Changed T2CL template to pass through bit 27 and 28 of MSR_IA32_ARCH_CAPABILITIES (RFDS_NO and RFDS_CLEAR) since KVM considers they are able to be passed through and T2CL isn't designed for secure snapshot migration between different processors. - Avoid setting kvm_immediate_exit to 1 if we are already handling an exit, or if the vCPU is stopped. This avoids a spurious KVM exit upon restoring snapshots. - Changed T2S template to set bit 27 of MSR_IA32_ARCH_CAPABILITIES (RFDS_NO) to 1 since it assumes that the fleet only consists of processors that are not affected by RFDS. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-30 20:49:29 +00:00
Aurélien Bombo	886b3047ac	Merge pull request #10222 from microsoft/danmihai1/log-level-false-positives agent: avoid policy.txt log without debug enabled	2024-08-30 10:09:04 -07:00
Alex Lyn	4fd4b02f2e	Merge pull request #10228 from GabyCT/topic/removeionednn metrics: Remove unused variable in oneDNN benchmark	2024-08-30 09:31:14 +08:00
Gabriela Cervantes	aa8635727d	metrics: Remove unused variable in oneDNN benchmark This PR removes an unused variable in oneDNN metrics benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-29 15:52:47 +00:00
Alex Lyn	8241423ba5	Merge pull request #10224 from amshinde/update-image-rs-xattr agent: image-rs: check xattrs for image unpacking	2024-08-29 09:33:22 +08:00
GabyCT	dd9f41547c	Merge pull request #10160 from microsoft/saulparedes/support_priority_class genpolicy: add priorityClassName as a field in PodSpec interface	2024-08-28 14:36:20 -06:00
GabyCT	394480e7ff	Merge pull request #10221 from GabyCT/topic/addopendmmread docs: Add oneDNN benchmark information to metrics README	2024-08-28 14:22:22 -06:00
GabyCT	83b031ca7a	Merge pull request #10214 from GabyCT/topic/ciweekly gha: Add GHA workflow to run Kata CoCo stability tests	2024-08-28 11:46:29 -06:00
Archana Shinde	c747852bce	agent: image-rs: check xattrs for image unpacking This commit includes a fix for pulling an image on platforms that do not support xattr. Some platforms/file-systems do not support xattrs, this would make the image pull fail because of failing to set xattr. This commit will check whether the target path supports xattr. If yes, the unpacking will maintain xattrs; if not, it will not set xattrs. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-08-28 00:02:46 -07:00
Archana Choudhary	ae2cdedba8	genpolicy: add priorityClassName as a field in PodSpec interface This allows generation of policy for pods specifying priority classes. Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-08-27 19:54:02 -07:00
Dan Mihai	aa8bdbde5a	agent: avoid policy.txt log without debug enabled slog's is_enabled() is documented as: - "best effort", and - Sometimes resulting in false positives. Use AGENT_CONFIG.log_level.as_usize() instead, to avoid those false positives. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-08-28 02:33:56 +00:00
Aurélien Bombo	de98e467b4	ci: Use `ubuntu-22.04` instead of `ubuntu-latest` 22.04 is the default today: `23da668261/README.md` Being more specific will avoid unexpected errors when Github updates the default. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-08-27 16:44:39 +00:00
Aurélien Bombo	ceab66b1ce	ci: Run `build-checks-depending-on-kvm` for free Also keeps the Rust installation step even though it's preinstalled, so that we use the version specified in versions.yaml. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-08-27 16:43:59 +00:00
Aurélien Bombo	b4ce84b9d2	ci: Move `run-runk` to free runner No change other than switching the runner - no dependency issue expected. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-08-27 16:43:33 +00:00
Aurélien Bombo	645aaa6f7f	ci: Move `run-monitor` to free runner No change other than switching the runner - no dependency issue expected. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-08-27 16:43:33 +00:00
Gabriela Cervantes	3affde5b28	docs: Add oneDNN benchmark information to metrics README This PR adds the oneDNN benchmark information to the machine learning metrics README. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-27 16:32:50 +00:00
Dan Mihai	9f6f5dac4b	Merge pull request #10037 from sprt/reinstate-mariner-host ci: reinstate Mariner host and guest kernel	2024-08-27 08:24:51 -07:00
Alex Lyn	f24983b3cf	Merge pull request #10210 from l8huang/cold-vf runtime: check if cold_plug_vfio is enabled before create PhysicalEndpoint	2024-08-27 15:23:55 +08:00
Alex Lyn	3a749cfb44	Merge pull request #10212 from squarti/remote-machine-type runtime: Allow machine_type in kata config for remote hypervisors	2024-08-27 14:05:36 +08:00
Aurélien Bombo	a3dba3e82b	ci: reinstate Mariner host GH-9592 addressed a bug in a previous version of the AKS Mariner host kernel that blocked the CH v39 upgrade. This bug has now been fixed so we undo that PR. Note we also specify a different OCI version for Mariner as it differs from Ubuntu's. Fixes: #9594 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-08-26 21:07:25 +00:00
Gabriela Cervantes	3a14b04621	gha: Fix entry for ci coco stability yaml This PR fixes the entry or use of the ci weekly GHA workflow to run properly the weekly k8s tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-26 17:14:35 +00:00
Gabriela Cervantes	95f6246858	gha: Add GHA workflow to run Kata CoCo stability tests This PR adds a GHA workflow to run Kata CoCo weekly stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-26 17:05:21 +00:00
Silenio Quarti	11ba8f05ca	runtime: Allow machine_type in kata config for remote hypervisors Fixes: #10211 Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-08-26 10:17:40 -04:00
Lei Huang	70168a467d	runtime: check if cold_plug_vfio is enabled before create PhysicalEndpoint PhysicalEndpoint unbinds its VF interface and rebinds it as a VFIO device, then cold-plugs the VFIO device into the guest kernel. When `cold_plug_vfio` is set to "no-port", cold-plugging the VFIO device will fail. This change checks if `cold_plug_vfio` is enabled before creating PhysicalEndpoint to avoid unnecessary VFIO rebind operations. Fixes: #10162 Signed-off-by: Lei Huang <leih@nvidia.com>	2024-08-23 15:42:17 -07:00
GabyCT	6b0272d6bf	Merge pull request #10193 from GabyCT/topic/k8ssoak stability: Add kubernetes parallel test	2024-08-23 15:51:01 -06:00
GabyCT	83177efb9b	Merge pull request #10201 from GabyCT/topic/readmeopenvino metrics: Add OpenVINO general information into README	2024-08-23 14:11:26 -06:00
Bo Chen	a0bd78b358	Merge pull request #10205 from likebreath/0819/upgrade_clh_v41.0 Upgrade to Cloud Hypervisor v41.0	2024-08-23 10:01:41 -07:00
Hyounggyu Choi	169b4490d2	Merge pull request #10209 from fidencio/topic/kata-manager-avoid-rate-pull-limit kata-manager: Avoid docker rate-limit	2024-08-23 12:52:14 +02:00
Fabiano Fidêncio	7f0289de60	kata-manager: Avoid docker rate-limit To do so, use a test image from quay.io instead of docker.io. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-08-23 11:56:09 +02:00
Fabiano Fidêncio	45f69373a6	Merge pull request #10199 from BbolroC/make-cdh-api-timeout-configurable agent/config: Make CDH_API_TIMEOUT configurable	2024-08-23 11:04:10 +02:00
Hyounggyu Choi	4cd83d2b98	Merge pull request #10202 from BbolroC/fix-k8s-tests-s390x tests: Fix k8s test issues on s390x	2024-08-23 09:51:11 +02:00
Fabiano Fidêncio	11bb9231c2	Merge pull request #10207 from amshinde/remove-image-check-cc Revert "tests: add image check before running coco tests"	2024-08-23 09:33:39 +02:00
Alex Lyn	44bf7ccb46	Merge pull request #10141 from soulfy/fix-delete-failed agent: kill child process when console socket closed	2024-08-23 14:00:53 +08:00
Archana Shinde	b0be03a93f	Revert "tests: add image check before running coco tests" This reverts commit `41b7577f08`. We were seeing a lot of issues in the TDX CI of the nature: "Error: failed to create containerd container: create instance 470: object with key "470" already exists: unknown" With the TDX CI, we moved to having the nydus snapshotter pre-installed. Essentially the `deploy-snapshotter` step was performed once before any actual CI runs. We were seeing failures related to the error message above. On reverting this change, we are no longer seeing errors related to "key exists" with the TDX CI passing now. The change reverted here is related to downloading incomplete images, but this seems to be messing up TDX CI. It is possible to pass --snapshotter to `ctr image check` but that does not seem to have any effect on the data set returned. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-08-22 18:05:42 -07:00
Bo Chen	254f8bca74	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v41.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Fixes: #10203 Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-08-22 11:05:54 -07:00
Bo Chen	e69535326d	versions: Upgrade to Cloud Hypervisor v41.0 Details of this release can be found in our roadmap project as iteration v41.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #10203 Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-08-22 11:02:26 -07:00
Gabriela Cervantes	2fa8e85439	metrics: Add OpenVINO general information into README This PR adds the OpenVINO benchmark general information into the machine learning README metrics information. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-22 16:08:06 +00:00
Hyounggyu Choi	274de8c6af	tests: Introduce wait_time to k8s_create_pod() In certain environments (e.g., those with lower performance), `k8s_create_pod()` may require additional wait time, especially when dealing with large images. Since `k8s_wait_pod_be_ready()` — which is called by `k8s_create_pod()` — already accepts `wait_time` as a second argument, it makes sense to introduce `wait_time` to `k8s_create_pod()` and propagate it to the callee. This commit adds `wait_time` to `k8s_create_pod()` as the 2nd (optional) argument. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-22 17:46:53 +02:00
Hyounggyu Choi	5d7397cc69	tests: Load confidential_kbs.sh in k8s-guest-pull-image.bats Some of the tests call set_metadata_annotation() for updating the kernel parameters. For `kata-qemu-se`, repack_secure_image() is called which is defined in `lib_se.sh` and sourced by `confidential_kbs.sh`. This commit ensures that the function call chain for the relevant `KATA_HYPERVISOR` is properly handled. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-22 17:33:38 +02:00
Fabiano Fidêncio	890fa26767	Merge pull request #10196 from fidencio/topic/ci-commit-message-take-reapply-into-consideration ci: commit-message-check: Take re-revert into consideration	2024-08-22 17:31:27 +02:00
Fabiano Fidêncio	2f6edc4b9b	Merge pull request #10194 from fidencio/topic/kata-deploy-re-work-logic kata-deploy: Rework the logic a little bit	2024-08-22 16:46:36 +02:00
Hyounggyu Choi	baa8af3f8e	doc: Update how-to-set-sandbox-config-kata.md This commit adds a row for `cdh_api_timeout` to the agent options in how-to-set-sandbox-config-kata.md. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-22 14:50:51 +02:00
Hyounggyu Choi	7d0aba1a24	runtime: Enable to get cdh_api_timeout from configuration file This commit allows `cdh_api_timeout` to be configured from the configuration file. The configuration is commented out with specifying a default value (50s) because the default value is configured in the agent. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-22 14:47:37 +02:00
Hyounggyu Choi	8615516823	agent: Add agent.cdh_api_timeout to README This commit adds an explanation for `cdh_api_timeout` to the README file. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-22 14:47:37 +02:00
Fabiano Fidêncio	a9a1345a31	kata-deploy: Print the action the script was invoked with This increases debuggability. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-08-22 14:32:33 +02:00
Fabiano Fidêncio	ab493b6028	kata-deploy: Move general logic to the correct actions Otherwise we may end up running into unexpected issues when calling the cleanup option, as the same checks would be done, and files could end up being copied again, overwriting the original content which was backed up by the install option. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-22 14:32:29 +02:00
Fabiano Fidêncio	6596012956	kata-deploy: Simplify check for runtime Let's write the runtime check in a shorter and simpler to read form. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-08-22 14:32:02 +02:00
Hyounggyu Choi	2512ddeab2	agent/cdh: Use AGENT_CONFIG.cdh_api_timeout for CDH_API_TIMEOUT This commit updates CDH_API_TIMEOUT to use AGENT_CONFIG.cdh_api_timeout and changes it from a `const` to `lazy_static` to accommodate runtime-determined values. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-22 10:09:16 +02:00
Hyounggyu Choi	6139e253a0	agent/config: Add cdh_api_timeout to AgentConfig To make the `cdh_api_timeout` variable configurable, it has been added to the `AgentConfig` structure. This change includes storing the variable as a `time::Duration` type and generalizing the existing `hotplug_timeout` code to handle both timeouts. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-22 10:09:16 +02:00
GabyCT	3fd108b09a	Merge pull request #10198 from GabyCT/topic/remvaropenvino metrics: Remove unused variable in openvino script	2024-08-21 15:48:56 -06:00
Dan Mihai	8ccc8a8d0b	Merge pull request #9911 from microsoft/saulparedes/mounts genpolicy: deny UpdateEphemeralMountsRequest	2024-08-21 10:12:28 -07:00
Gabriela Cervantes	59e31baaee	metrics: Remove unused variable in openvino script This PR removes an unused variable in the openvino script for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-21 16:05:55 +00:00
Greg Kurz	09a13da8ec	Merge pull request #10197 from beraldoleal/release-3.8 release: Bump VERSION to 3.8.0	2024-08-21 17:50:10 +02:00
Beraldo Leal	55bdb380fb	release: Bump VERSION to 3.8.0 Let's start the 3.8.0 release. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-08-21 10:24:07 -04:00
Gabriela Cervantes	27d5539954	stability: Add pod deployment yaml for soak test This PR adds the pod deployment yaml for soak test which is part of the stability k8s tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-21 14:23:22 +00:00
Fabiano Fidêncio	3fd021a9b3	ci: commit-message-check: Take re-revert into consideration `Reapply "` should be taken into consideration as well. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-08-21 14:19:16 +02:00
Fabiano Fidêncio	f071c8cada	Merge pull request #10191 from fidencio/topic/ci-temporarily-revert-helm-usage ci: Let's temporarily revert the helm charts usage in our CI	2024-08-21 10:52:23 +02:00
Dan Mihai	6654491cc3	genpolicy: deny UpdateEphemeralMountsRequest * genpolicy: deny UpdateEphemeralMountsRequest Deny UpdateEphemeralMountsRequest by default, because paths to critical Guest components can be redirected using such request. Signed-off-by: Dan Mihai <Daniel.Mihai@microsoft.com>	2024-08-20 18:28:17 -07:00
Gabriela Cervantes	c04a805215	stability: Add kubernetes parallel test This PR adds a kubernetes parallel test that will launch multiple replicas from a kubernetes deployment and we will iterate this multiple times to verify that we are able to do this using CoCo Kata. This test will be part of the CoCo Kata stability CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-20 23:24:22 +00:00
Fabiano Fidêncio	b18c3dfce3	Revert "kata-deploy: Add Helm Chart" (partially) This partially reverts commit `94b3348d3c`, as there's more work needed in order to have this one done in a robust way, and we are taking the safer path of reverting for now, and adding it back as soon as the release is cut out. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-08-21 00:09:11 +02:00
Fabiano Fidêncio	36f4038a89	Revert "ci: Use helm to deploy kata-deploy" (partially) This partially reverts commit `51690bc157`, as there's more work needed in order to have this one done in a robust way, and we are taking the safer path of reverting for now, and adding it back as soon as the release is cut out. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-08-21 00:09:11 +02:00
Fabiano Fidêncio	21f9f01e1d	Revert "ci: make cleanup_kata_deploy really simple" This reverts commit `1221ab73f9`, as there's more work needed in order to have this one done in a robust way, and we are taking the safer path of reverting for now, and adding it back as soon as the release is cut out. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-08-21 00:09:11 +02:00
GabyCT	e0bff7ed14	Merge pull request #10177 from GabyCT/topic/cocoghas gha: Add k8s stability Kata CoCo GHA workflow	2024-08-20 15:12:29 -06:00
Gabriela Cervantes	ca3d778479	gha: Add Kata CoCo Stability workflow This PR adds the Kata CoCo Stability workflow that will setup the environment to run the k8s tests on a non-tee environment. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-20 16:34:33 +00:00
Gabriela Cervantes	3ebaa5d215	gha: Add Kata CoCo stability weekly yaml This PR adds the Kata CoCo stability weekly yaml that will trigger weekly the k8s stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-20 16:32:03 +00:00
Fabiano Fidêncio	aeb6f54979	Merge pull request #10180 from fidencio/topic/ci-ensure-the-key-was-created-on-kbs ci: Ensure the KBS resources are created	2024-08-20 09:07:56 +02:00
Fabiano Fidêncio	40d385d401	Merge pull request #10188 from wainersm/kbs_key tests/k8s: check and save kbs.key	2024-08-19 23:29:10 +02:00
Fabiano Fidêncio	c0d7222194	ci: Ensure the KBS resources are created Otherwise we may have tests failing due to the resource not being created yet. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-19 23:27:06 +02:00
Wainer dos Santos Moschetta	e014eee4e8	tests/k8s: check and save kbs.key The deploy-kbs.sh script generates the kbs.key that's used to install KBS. This same file is used later by kbs-client to authenticate. This ensures that the file was created, otherwise fail. Another problem solved here is that on bare-metal machines the key doesn't survive a reboot as it is created in a temporary directory (/tmp/trustee). So let's save the file to a non-temporary location. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-08-19 16:03:03 -03:00
Wainer Moschetta	6a982930e2	Merge pull request #10183 from fidencio/topic/kata-deploy-use-runtime_path kata-deploy: Stop symlinking into /usr/local/bin	2024-08-19 13:17:21 -03:00
Fabiano Fidêncio	42d48efcc2	Merge pull request #10181 from fidencio/topic/ci-fix-stdio-typo ci: stdio: Fix typo on getting the containerd version	2024-08-18 16:05:42 +02:00
Fabiano Fidêncio	e0ae398a2e	Merge pull request #10151 from squarti/rootdir2 runtime: Files are not synced between host and guest VMs	2024-08-18 12:32:52 +02:00
Fabiano Fidêncio	d03b72f19b	kata-deploy: Stop linking binaries to /usr/local/bin Neither CRI-O nor containerd requires that, and removing such symlinks makes everything less intrusive from our side. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-18 01:25:12 +02:00
Fabiano Fidêncio	c2393dc467	kata-deploy: Use shim's absolute path for crio's runtime_path This will allow us, in the future, not have to do symlinks here and there. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-18 01:25:12 +02:00
Fabiano Fidêncio	58623723b1	kata-deploy: Use runtime_path for containerd It's already being used with CRI-O, let's simplify what we do and also use this for containerd, which will allow us to do further cleanups in the coming patches. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-18 01:25:12 +02:00
Fabiano Fidêncio	e75c149dec	ci: stdio: Properly start running the test "gha-run.sh" requires a `run` argument in order to run the tests, which seems to be forgotten when the test was added. This PR needs to get merged before the test can successfully run. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-17 14:41:44 +02:00
Fabiano Fidêncio	dd2d9e5524	ci: stdio: Fix typo on getting the containerd version I assume the PR that introduced this was based on an older version of yq, and as the test couldn't run before it got merged we never noticed the error. However, this test has been failing for a reasonable amount of time, which makes me think that we either need a maintainer for it, or just remove it completely, but that's a discussion for another day. For now, let's make it, at least, run. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-17 14:06:24 +02:00
Fabiano Fidêncio	7113490cb1	Merge pull request #10179 from fidencio/topic/switch-nginx-image ci: k8s: Replace nginx alpine images	2024-08-17 13:07:31 +02:00
Fabiano Fidêncio	0831081399	ci: k8s: Replace nginx alpine images The previous ones are gone, so let's switch to our own multi-arch image for the tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-17 12:19:33 +02:00
Fabiano Fidêncio	a78d82f4f1	Merge pull request #10159 from squarti/main agent: Handle EINVAL error when umounting container rootfs	2024-08-16 22:07:50 +02:00
Dan Mihai	79c1d0a806	Merge pull request #10136 from microsoft/danmihai1/docker-image-volume2 genpolicy: add bind mounts for image volumes	2024-08-16 13:07:01 -07:00
Fabiano Fidêncio	28aa4314ba	Merge pull request #10175 from ChengyuZhu6/error_message runtime: Add specific error message for gRPC request timeouts	2024-08-16 22:06:49 +02:00
Fabiano Fidêncio	720edbe3fc	Merge pull request #10174 from ChengyuZhu6/install_script tools: install luks-encrypt-storage script by guest-components	2024-08-16 22:04:56 +02:00
Fabiano Fidêncio	7b5da45059	Merge pull request #10178 from fidencio/topic/revert-trustee-bump Revert "version: bump trustee version"	2024-08-16 21:48:30 +02:00
Gabriela Cervantes	6ea34f13e1	gha: Add k8s stability Kata CoCo GHA workflow This PR adds the k8s stability Kata CoCo GHA workflow to run weekly the k8s stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-16 16:14:15 +00:00
Fabiano Fidêncio	45f43e2a6a	Revert "version: bump trustee version" This reverts commit `d35320472c`. Although the commit in question does solve an issue related to the usage of busybox from docker.io, as it's reasonably easy to hit the rate limit, the commit also brings in functionalities that are causing issues in, at least, the TDX CI, such as: ```sh [2024-08-16T16:03:52Z INFO actix_web::middleware::logger] 10.244.0.1 "POST /kbs/v0/attest HTTP/1.1" 401 259 "-" "attestation-agent-kbs-client/0.1.0" 0.065266 [2024-08-16T16:03:53Z INFO kbs::http::attest] Auth API called. [2024-08-16T16:03:53Z INFO actix_web::middleware::logger] 10.244.0.1 "POST /kbs/v0/auth HTTP/1.1" 200 74 "-" "attestation-agent-kbs-client/0.1.0" 0.000169 [2024-08-16T16:03:54Z INFO kbs::http::attest] Attest API called. [2024-08-16T16:03:54Z INFO verifier::tdx] Quote DCAP check succeeded. [2024-08-16T16:03:54Z INFO verifier::tdx] MRCONFIGID check succeeded. [2024-08-16T16:03:54Z INFO verifier::tdx] CCEL integrity check succeeded. [2024-08-16T16:03:54Z ERROR kbs::http::error] Attestation failed: Verifier evaluate failed: TDX Verifier: failed to parse AA Eventlog from evidence Caused by: at least one line should be included in AAEL ``` Let's revert this for now, and then once we get this one fixed on trustee side we'll update again. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-16 18:10:38 +02:00
Dan Mihai	c22ac4f72c	genpolicy: add bind mounts for image volumes Add bind mounts for volumes defined by docker container images, unless those mounts have been defined in the input K8s YAML file too. For example, quay.io/opstree/redis defines two mounts: /data /node-conf Before these changes, if these mounts were not defined in the YAML file too, the auto-generated policy did not allow this container image to start. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-08-16 15:11:05 +00:00
Fabiano Fidêncio	b203f715e5	Merge pull request #10170 from beraldoleal/deploy-reset-fix kata-deploy: fix kata-deploy reset	2024-08-16 16:51:14 +02:00
Fabiano Fidêncio	8d63723910	Merge pull request #10161 from microsoft/saulparedes/ignore_role_resource genpolicy: ignore Role resource	2024-08-16 16:50:16 +02:00
Fabiano Fidêncio	6c58ae5b95	Merge pull request #10171 from fidencio/topic/ci-treat-nydus-snapshotter-as-a-dep ci: nydus: Treat the snapshotter as a dependency	2024-08-16 16:39:48 +02:00
ChengyuZhu6	1eda6b7237	tests: update error message with guest pulling image timeout update error message with guest pulling image timeout. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-16 20:26:33 +08:00
ChengyuZhu6	ca05aca548	runtime: Add specific error message for gRPC request timeouts Improved error handling to provide clearer feedback on request failures. For example: Improve createcontainer request timeout error message from "Error: failed to create containerd task: failed to create shim task:context deadline exceed" to "Error: failed to create containerd task: failed to create shim task: CreateContainerRequest timed out: context deadline exceed". Fixes: #10173 -- part II Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-16 20:24:48 +08:00
Beraldo Leal	b3a4cd1a06	Merge pull request #10172 from deagon/fix-typo osbuilder: fix typo in ubuntu rootfs depends	2024-08-16 08:01:59 -04:00
Beraldo Leal	b843b236e4	kata-deploy: improve kata-deploy script For the rare cases where containerd_conf_file does not exist, cp could fail and leave the pod in Error state. Let's make it a little bit more robust. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-08-16 07:52:38 -04:00
ChengyuZhu6	aa31a9d3c4	tools: install luks-encrypt-storage script by guest-components Install luks-encrypt-storage script by guest-components. So that we can maintain a single source and prevent synchronization issues. Fixes: #10173 -- part I Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-16 16:28:20 +08:00
Chengyu Zhu	ba3c484d12	Merge pull request #9999 from ChengyuZhu6/trusted-storage Trusted image storage	2024-08-16 15:39:50 +08:00
Fabiano Fidêncio	0f3eb2451e	Merge pull request #10169 from fidencio/topic/revert-reset_runtime-to-cleanup Revert "ci: add reset_runtime to cleanup"	2024-08-16 07:29:58 +02:00
Aurélien Bombo	e1775e4719	Merge pull request #10164 from BbolroC/make-exec_host-stable tests: Ensure exec_host() consistently captures command output	2024-08-15 21:43:32 -07:00
Guoqiang Ding	1d21ff9864	osbuilder: fix typo in ubuntu rootfs depends Remove the duplicate package "xz-utils". Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>	2024-08-16 11:33:55 +08:00
Silenio Quarti	5d815ffde1	runtime: Files are not synced between host and guest VMs This PR resolves the default kubelet root dir symbolic link and uses it as the absolute path for the fs watcher regexs Fixes: https://github.com/kata-containers/kata-containers/issues/9986 Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-08-15 23:19:08 -04:00
Silenio Quarti	0dd16e6b25	agent: Handle EINVAL error when umounting container rootfs Container/Sandbox clean up should not fail if root FS is not mounted. This PR handles EINVAL errors when umount2 is called. Fixes: #10166 Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-08-15 19:41:46 -04:00
Fabiano Fidêncio	3733266a60	ci: nydus: Treat the snapshotter as a dependency Instead of deploying and removing the snapshotter on every single run, let's make sure the snapshotter is always deployed in the TDX case. We're doing this as an experiment, in order to see if we'll be able to reduce the failures we've been facing with the nydus snapshotter. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-15 22:44:30 +02:00
Hyounggyu Choi	ba3e5f6b4a	Revert "tests: Disable k8s file volume test" This reverts commit `e580e29246`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-15 21:10:39 +02:00
Hyounggyu Choi	758e650a28	tests: Ensure exec_host() consistently captures command output The `exec_host()` function often fails to capture the output of a given command because the node debugger pod is prematurely terminated. To address this issue, the function has been refactored to ensure consistent output capture by adjusting the `kubectl debug` process as follows: - Keep the node debugger pod running - Wait until the pod is fully ready - Execute the command using `kubectl exec` - Capture the output and terminate the pod This commit refactors `exec_host()` to implement the above steps, improving its reliability. Fixes: #10081 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-15 21:10:39 +02:00
Beraldo Leal	74662a0721	Merge pull request #10137 from hex2dec/fix-image-warning tools: Fix container image build warning	2024-08-15 14:45:41 -04:00
Dan Mihai	905c76bd47	Merge pull request #10153 from microsoft/saulparedes/support_cron_job genpolicy: Add support for cron jobs	2024-08-15 11:11:00 -07:00
Aurélien Bombo	0223eedda5	Merge pull request #10050 from burgerdev/request-hardening genpolicy: hardening some agent requests	2024-08-15 08:31:21 -07:00
Fabiano Fidêncio	1f6a8baaf1	Revert "ci: add reset_runtime to cleanup" This reverts commit `8d9bec2e01`, as it causes issues in the operator and kata-deploy itself, leading to the node to be NotReady. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-15 16:09:34 +02:00
ChengyuZhu6	5f4209e008	agent:README: add secure_image_storage_integrity to agent's README add secure_image_storage_integrity to agent's README. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 20:32:44 +08:00
ChengyuZhu6	6ecb2b8870	tests: skip test trusted storage in qemu-coco-dev The loop device cannot be set up with `exec_host`, and that step is necessary for qemu-coco-dev. See issue #10133. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 20:32:44 +08:00
ChengyuZhu6	51b9d20d55	tests: update error message in pulling image encrypted tests Update error message in pulling image encrypted to "failed to get decrypt key no suitable key found for decrypting layer key". Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 20:32:44 +08:00
ChengyuZhu6	b4d10e7655	version: update the version of coco-guest-components update the version of coco-guest-components. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 20:32:43 +08:00
Fupan Li	365df81d5e	Merge pull request #10148 from lifupan/main_sandboxapi runtime-rs: Add the wait_vm support for hypervisors	2024-08-15 17:08:38 +08:00
ChengyuZhu6	a9b436f788	agent:cdh: Introduces secure_mount API in cdh Introduces `secure_mount` API in the cdh. It includes: - Adding the `SecureMountServiceClient`. - Implementing the `secure_mount` function to handle secure mounting requests. - Updating the confidential_data_hub.proto file to define SecureMountRequest and SecureMountResponse messages and adding the SecureMountService service. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 13:55:23 +08:00
ChengyuZhu6	1528d543b2	agent:cdh: Rename sealed_secret API namespace to confidential_data_hub renames the sealed_secret.proto file to confidential_data_hub.proto and updates the corresponding API namespace from sealed_secret to confidential_data_hub. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 13:55:23 +08:00
ChengyuZhu6	37bd2406e0	docs: add content about how to pull large image Add content about how to pull large image in the guest with trust storage. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 13:55:22 +08:00
ChengyuZhu6	c5a973e68c	tests:k8s: add tests for guest pull with configured timeout add tests for guest pull with configured timeout: 1) failed case: Test we cannot pull a large image whose pull time exceeds a short createcontainer timeout (10s) inside the guest 2) successful case: Test we can pull a large image inside the guest with an increased createcontainer timeout (120s) Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 13:55:22 +08:00
ChengyuZhu6	6c506cde86	tests:k8s: add tests for pull images in the guest using trusted storage add tests for pull images in the guest using trusted storage: 1) failed case: Test we cannot pull an image that exceeds the memory limit inside the guest 2) successful case: Test we can pull an image inside the guest using trusted ephemeral storage. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 13:55:22 +08:00
GabyCT	ecfbc9515a	Merge pull request #10158 from GabyCT/topic/k8sstabil tests: Add kubernetes stability test	2024-08-14 14:44:49 -06:00
Saul Paredes	5ad47b8372	genpolicy: ignore Role resource Ignore Role resources because they don't need a Policy. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-08-14 12:57:06 -07:00
Gabriela Cervantes	d48ad94825	tests: Add kubernetes stability test This PR adds a k8s stability test that will be part of the CoCo Kata stability tests that will run weekly. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-14 15:30:49 +00:00
Fupan Li	cadcf5f92d	runtime-rs: Add the wait_vm support for hypervisors Add the wait_vm method for hypervisors. This is a prerequisite for sandbox api support. Fixes: #7043 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-08-14 12:01:34 +08:00
Fupan Li	506977b102	Merge pull request #10156 from GabyCT/topic/disablevolume tests: Disable k8s file volume test	2024-08-14 12:00:47 +08:00
GabyCT	b0b6a1baea	Merge pull request #10154 from GabyCT/topic/stressk8s tests: Add kubernetes stress-ng tests	2024-08-13 15:09:59 -06:00
Gabriela Cervantes	e580e29246	tests: Disable k8s file volume test This PR disables the k8s file volume test as we are having random failures in multiple GHA CIs mainly because the exec_host function sometimes does not work properly. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-13 20:50:18 +00:00
Saul Paredes	af598a232b	tests: add test for cron job support Add simple test for cron job support Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-08-13 10:47:42 -07:00
Saul Paredes	88451d26d0	genpolicy: add support for cron jobs Add support for cron jobs Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-08-13 10:47:42 -07:00
Gabriela Cervantes	bdca5ca145	tests: Add kubernetes stress-ng tests This PR adds kubernetes stress-ng tests as part of the stability testing for kata. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-13 16:23:52 +00:00
Fabiano Fidêncio	99730256a2	Merge pull request #10149 from fidencio/topic/kata-manager-relax-opt-check kata-manager: Only check files when tarball is not passed	2024-08-13 16:26:16 +02:00
Markus Rudy	bce5cb2ce5	genpolicy: harden CreateSandboxRequest checks Hooks are executed on the host, so we don't expect to run hooks and thus require that no hook paths are set. Additional Kernel modules expand the attack surface, so require that none are set. If a use case arises, modules should be allowlisted via settings. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-08-13 09:01:58 +02:00
Markus Rudy	aee23409da	genpolicy: harden CopyFileRequest checks CopyFile is invoked by the host's FileSystemShare.ShareFile function, which puts all files into directories with a common pattern. Copying files anywhere else is dangerous and must be prevented. Thus, we check that the target path prefix matches the expected directory pattern of ShareFile, and that this directory is not escaped by .. traversal. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-08-13 09:01:58 +02:00
soulfy	722b576eb3	agent: kill child process when console socket closed When the debug console is used, the shell running in the child process may not exit in some cases, e.g. when Ctrl-C is pressed on the host to terminate the kata-runtime process. That blocks the task handling the console connection while it waits for the child to exit. Signed-off-by: soulfy <liukai254@jd.com>	2024-08-13 10:18:03 +08:00
Steve Horsman	91084058ae	Merge pull request #10007 from wainersm/run_k8s_on_free_runners ci: Transition GARM tests to free runners, pt. II	2024-08-12 18:12:18 +01:00
Fabiano Fidêncio	5fe65e9fc2	kata-manager: Only check files when tarball is not passed Only do the checking in case the tarball was not explicitly passed by the user. We have no control of what's passed and we cannot expect that all the files are going to be under /opt. Fixes: #10147 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-12 13:54:24 +02:00
ChengyuZhu6	c3a0ab4b93	tests:k8s: Re-enable and refactor the tests with guest pull Currently, setting `io.containerd.cri.runtime-handler` annotation in the yaml is not necessary for pulling images in the guest. All TEE hypervisors are already running tests with guest-pulling enabled. Therefore, we can remove some duplicate tests and re-enable the guest-pull test for running different runtime pods at the same time. To support different containerd versions, I recommend that we keep setting "io.containerd.cri.runtime-handler". Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-12 16:36:54 +08:00
ChengyuZhu6	47be9c7c01	osbuilder:rootfs: install init_trusted_storage script Install init_trusted_storage script if MEASURED_ROOTFS is enabled. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: Anand Krishnamoorthi <anakrish@microsoft.com>	2024-08-12 16:36:54 +08:00
ChengyuZhu6	df993b0f88	agent:rpc: initialize trusted storage device Initialize the trusted storage when the device is defined as "/dev/trusted_store" with shell script as first step. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Wang, Arron <arron.wang@intel.com>	2024-08-12 16:36:54 +08:00
ChengyuZhu6	94347e2537	agent:config: Support secure_storage_integrity option for trusted storage Enabling secure storage integrity for trusted storage makes initialization take longer, so it is disabled by default. This config option lets users enable it if they need stricter security. Fixes: #8142 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Wang, Arron <arron.wang@intel.com>	2024-08-12 16:36:54 +08:00
GabyCT	775f6bdc5c	Merge pull request #10142 from GabyCT/topic/updatestress tests: Update ubuntu image for stress Dockerfile	2024-08-09 16:11:35 -06:00
Gabriela Cervantes	5e5fc145cd	tests: Update ubuntu image for stress Dockerfile This PR updates the ubuntu image for stress Dockerfile. The main purpose is to have a more updated image compared with the one that is in libpod which has not been updated in a while. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-09 15:29:10 +00:00
Steve Horsman	e4c023a9fa	Merge pull request #10140 from stevenhorsman/kata-version-in-artefact-version ci: cache: Include kata version in artefact versions	2024-08-09 11:37:09 +01:00
Fabiano Fidêncio	44b08b84b0	Merge pull request #10113 from Freax13/fix/no-scsi-off qemu: don't emit scsi parameter	2024-08-08 16:23:36 +02:00
stevenhorsman	b6a3a3f8fe	ci: cache: Include kata version in artefact versions - At the moment we aren't factoring in the kata version on our caches, so it means that when we bump this just before release, we don't rebuild components that pull in the VERSION content, so the release build ends up with incorrect versions in its binaries Fixes: #10092 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-08-08 14:58:58 +01:00
GabyCT	584d7a265e	Merge pull request #10127 from GabyCT/topic/execimage tests:k8s: Update image in kubectl debug for the exec host function	2024-08-07 17:00:52 -06:00
Archana Shinde	1012449141	Merge pull request #10129 from hex2dec/qemu-aio-native tools: Support for building qemu with linux aio	2024-08-07 14:32:52 -07:00
Archana Shinde	a6a736eeaf	Merge pull request #10089 from amshinde/enable-nerdctl-clh ci: Enable nerdctl tests for clh	2024-08-07 12:13:00 -07:00
Wainer dos Santos Moschetta	374405aed1	workflows/run-k8s-tests-on-amd64: remove 'instance' from matrix The jobs are all executed on ubuntu-22.04 so it's invariant and can be removed from the matrix (this will shrink the jobs names). Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-08-07 16:00:39 -03:00
Wainer dos Santos Moschetta	d11ce129ac	workflows: merge run-k8s-tests-on-garm and run-k8s-tests-with-crio-on-garm Created the run-k8s-tests-on-amd64.yaml which is a merge of run-k8s-tests-on-garm.yaml and run-k8s-tests-with-crio-on-garm.yaml ps: renamed the job from 'run-k8s-tests' to 'run-k8s-tests-on-amd64' so it is easier to find on Github UI and be distinguished from s390x, ppc64le, etc... Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-08-07 15:50:43 -03:00
Wainer dos Santos Moschetta	ed0732c75d	workflows: migrate run-k8s-tests-with-crio-on-garm to free runners Switch to Github managed runners just like the run-k8s-tests-on-garm workflow. See: #9940 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-08-07 15:20:42 -03:00
Wainer dos Santos Moschetta	3d053a70ab	workflows: migrate run-k8s-tests-on-garm to free runners Switched to Github managed runners. The instance_type parameter was removed and K8S_TEST_HOST_TYPE is set to "all" which combines the tests of "small" and "normal". This way it will reduce the number of jobs by half. See: #9940 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-08-07 15:20:42 -03:00
Wainer dos Santos Moschetta	dfb92e403e	tests/k8s: add "deploy-kata"/"cleanup" actions to gh-run.sh These new "kata-deploy" and "cleanup" actions are equivalent to "kata-deploy-garm" "cleanup-garm", respectively, and should be used on the workflows being migrated from GARM to Github's managed runners. Eventually "kata-deploy-garm" and "cleanup-garm" won't be used anymore then we will be able to remove them. See: #9940 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-08-07 15:20:23 -03:00
Zhiwei Huang	7270a7ba48	tools: Fix container image build warning All commands within the Dockerfile should use the same casing (either upper or lower).[1] [1]: https://docs.docker.com/reference/build-checks/consistent-instruction-casing/ Signed-off-by: Zhiwei Huang <ai.william@outlook.com>	2024-08-07 15:49:01 +08:00
Dan Mihai	2da77c6979	Merge pull request #10068 from burgerdev/genpolicy-test genpolicy: add crate-scoped integration test	2024-08-06 16:10:46 -07:00
GabyCT	fb166956ab	Merge pull request #10132 from fidencio/topic/support-image-pull-with-nerdctl runtime: image-pull: Make it work with nerdctl	2024-08-06 15:33:40 -06:00
Gabriela Cervantes	d0ca43162d	tests:k8s: Update image in kubectl debug for the exec host function This PR updates the image that we are using in the kubectl debug command as part of the exec host function, as the current alpine image does not allow to create a temporary file for example and creates random kubernetes failures. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-06 21:13:46 +00:00
Fabiano Fidêncio	63802ecdd9	Merge pull request #9880 from zvonkok/helm-chart kata-deploy: Add Helm Chart	2024-08-06 22:55:31 +02:00
Archana Shinde	ba884aac13	ci: Enable nerdctl tests for clh A recent fix should resolve some of the issues seen earlier with clh with the go runtime. Enabling this test to check if the issue is still seen. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-08-06 10:41:42 -07:00
Fabiano Fidêncio	f33f2d09f7	runtime: image-pull: Make it work with nerdctl Our code for handling images being pulled inside the guest relies on a containerType ("sandbox" or "container") being set as part of the container annotations, which is done by the CRI Engine being used, and depending on the used CRI Engine we check for a specific annotation related to the image-name, which is then passed to the agent. However, when running kata-containers without kubernetes, specifically when using `nerdctl`, none of those annotations are set at all. One thing that we can do to allow folks to use `nerdctl`, however, is to take advantage of the `--label` flag, and document on our side that users must pass `io.kubernetes.cri.image-name=$image_name` as part of the label. By doing this, and changing our "fallback" so we can always look for such annotation, we ensure that nerdctl will work when using the nydus snapshotter, with kata-containers, to perform image pulling inside the pod sandbox / guest. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-06 17:07:45 +02:00
Zvonko Kaiser	8d9bec2e01	ci: add reset_runtime to cleanup Adding reset_cleanup to cleanup action so that it is done automatically without the need to run yet another DS just to reset the runtime. This is now part of the lifecycle hook when issuing kata-deploy.sh cleanup Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-08-06 11:57:04 +02:00
Zvonko Kaiser	1221ab73f9	ci: make cleanup_kata_deploy really simple Remove the unneeded logic for cleanup; the values are encapsulated in the deployed helm release Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-08-06 11:57:04 +02:00
Zvonko Kaiser	51690bc157	ci: Use helm to deploy kata-deploy Rather than modifying the kata-deploy scripts let's use Helm and create a values.yaml that can be used to render the final templates Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-08-06 11:57:04 +02:00
Zvonko Kaiser	94b3348d3c	kata-deploy: Add Helm Chart For easier handling of kata-deploy we can leverage a Helm chart to get rid of all the base and overlays for the various components Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-08-06 11:57:04 +02:00
Zhiwei Huang	d455883b46	tools: Support for building qemu with linux aio The kata containers hypervisor qemu configuration supports setting block_device_aio="native", but the kata static build of qemu does not add the linux aio feature. The libaio-dev library is a necessary dependency for building qemu with linux aio. Fixes: #10130 Signed-off-by: Zhiwei Huang <ai.william@outlook.com>	2024-08-06 14:30:45 +08:00
Markus Rudy	69535e5458	genpolicy: add crate-scoped integration test Provides a test runner that generates a policy and validates it with canned requests. The initial set of test cases is mostly for illustration and will be expanded incrementally. In order to enable both cross-compilation on Ubuntu test runners as well as native compilation on the Alpine tools builder, it is easiest to switch to the vendored openssl-src variant. This builds OpenSSL from source, which depends on Perl at build time. Adding the test to the Makefile makes it execute in CI, on a variety of architectures. Building on ppc64le requires a newer version of the libz-ng-sys crate. Fixes: #10061 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-08-05 11:52:01 +02:00
Markus Rudy	4d1416529d	genpolicy: fix clippy v1.78.0 warnings cargo clippy has two new warnings that need addressing: - assigning_clones These were fixed by clippy itself. - suspicious_open_options I added truncate(false) because we're opening the file for reading. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-08-05 11:48:30 +02:00
Fabiano Fidêncio	43dca8deb4	Merge pull request #10121 from microsoft/saulparedes/add_version_flag genpolicy: add --version flag	2024-08-03 21:22:10 +02:00
Fabiano Fidêncio	3b2173c87a	Merge pull request #10124 from fidencio/topic/ci-enable-encrypted-image-tests-for-tees ci: Enable encrypted image tests for TEEs	2024-08-03 11:39:51 +02:00
Fabiano Fidêncio	89f1581e54	ci: Enable encrypted image tests for TEEs After experimenting a little bit with those tests, they seem to be passing on all the available TEE machines. With this in mind, let's just enable them for those machines. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-03 09:27:32 +02:00
Fabiano Fidêncio	3b896cf3ef	Merge pull request #10125 from fidencio/topic/un-break-ci ci: Remove jobs that are not running	2024-08-03 09:27:04 +02:00
Fabiano Fidêncio	62a086937e	ci: Remove jobs that are not running When re-enabling those we'll need a smart way to do so, as this limit of 20 workflows referenced is just ... weird. However, for now, it's more important to add the jobs related to the new platforms than keep the ones that are actively disabled. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-03 09:24:05 +02:00
GabyCT	76af5a444b	Merge pull request #10075 from microsoft/saulparedes/hooks genpolicy: reject create custom hook settings	2024-08-02 15:36:34 -06:00
GabyCT	aadde2c25b	Merge pull request #10120 from kata-containers/fix_metrics_json_results_file Fix metrics json results file	2024-08-02 11:29:02 -06:00
Fabiano Fidêncio	b93a0642e0	Merge pull request #10123 from fidencio/topic/re-enable-arm-ci ci: re-enable arm CI	2024-08-02 17:48:35 +02:00
Dan Mihai	2628b34435	Merge pull request #10098 from microsoft/danmihai1/allow-failing agent: fix the AllowRequestsFailingPolicy functionality	2024-08-02 08:42:47 -07:00
GabyCT	8da5f7a72f	Merge pull request #10102 from ChengyuZhu6/fix-debug tests: Fix error with `kubectl debug`	2024-08-02 09:25:13 -06:00
Fabiano Fidêncio	551e0a6287	Merge pull request #10116 from GabyCT/topic/kbsdependencies tests: kbs: Add missing dependencies to install kbs cli	2024-08-02 14:22:28 +02:00
Fabiano Fidêncio	ed57ef0297	ci: aarch64: Enable builders as part of the CI As we have new runners added, let's enable the builders so we can prevent build failures happening after something gets merged. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-02 14:13:53 +02:00
Fabiano Fidêncio	388b5b0e58	Revert "ci: Temporarily remove arm64 builds" This reverts commit `e9710332e7`, as there are now 2 arm64-builders (to be expanded to 4 really soon). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-02 13:53:50 +02:00
Fabiano Fidêncio	08be9c3601	Revert "ci: Temporarily remove arm64 builds -- part II" This reverts commit `c5dad991ce`, as there are now 2 arm64-builders (to be expanded to 4 really soon). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-02 13:52:53 +02:00
Tom Dohrmann	322c80e7c8	qemu: don't emit scsi parameter This parameter has been deprecated for a long time and QEMU 9.1.0 finally removes it. Fixes: kata-containers#10112 Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>	2024-08-02 07:30:39 +02:00
Tom Dohrmann	b7999ac765	runtime-rs: don't emit scsi parameter for block devices This parameter has been deprecated for a long time and QEMU 9.1.0 finally removes it. Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>	2024-08-02 07:30:23 +02:00
Fabiano Fidêncio	4183680bc3	Merge pull request #10107 from fidencio/topic/rotate-journal-logs-every-run tests: k8s: Rotate & cleanup journal for every run	2024-08-02 07:27:10 +02:00
Fabiano Fidêncio	302e02aed8	Merge pull request #10114 from fidencio/topic/kata-manager-configure-qemu-and-ovmf-for-tdx kata-manager: Ensure distro specific TDX config is set	2024-08-02 07:24:57 +02:00
Saul Paredes	194cc7ca81	genpolicy: add --version flag - Add --version flag to the genpolicy tool that prints the current version - Add version.rs.in template to store the version information - Update makefile to autogenerate version.rs from version.rs.in - Add license to Cargo.toml Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-08-01 17:18:17 -07:00
David Esparza	dcd0c0b269	metrics: Remove duplicated headers from results file. This PR removes duplicated entries (vcpus count, and available memory), from onednn and openvino results files. Fixes: #10119 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-08-01 18:11:06 -06:00
Dan Mihai	9e99329bef	genpolicy: reject create sandbox hooks Reject CreateSandboxRequest hooks, because these hooks may be used by an attacker. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-08-01 16:58:35 -07:00
ChengyuZhu6	2eac8fa452	tests: Fix error with `kubectl debug` The issue is similar to #10011. The root cause is that tty and stderr are set to true at same time in containerd: #10031. Fixes: #10081 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-02 07:32:30 +08:00
David Esparza	1e640ec3a6	metrics: fix parsing of json results file. This PR encloses the search string for 'default_vcpus =' and 'default_memory =' with double quotes in order to parse the precise values, which are included in the kata configuration file. Fixes: #10118 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-08-01 17:05:03 -06:00
Dan Mihai	c2a55552b2	agent: fix the AllowRequestsFailingPolicy functionality 1. Use the new value of AllowRequestsFailingPolicy after setting up a new Policy. Before this change, the only way to enable AllowRequestsFailingPolicy was to change the default Policy file, built into the Guest rootfs image. 2. Ignore errors returned by regorus while evaluating Policy rules, if AllowRequestsFailingPolicy was enabled. For example, trying to evaluate the UpdateInterfaceRequest rules using a policy that didn't define any UpdateInterfaceRequest rules results in a "not found" error from regorus. Allow AllowRequestsFailingPolicy := true to bypass that error. 3. Add simple CI test for AllowRequestsFailingPolicy. These changes are restoring functionality that was broken recently by commit `df23eb09a6`. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-08-01 22:37:18 +00:00
Fabiano Fidêncio	66b0305eed	Merge pull request #10117 from fidencio/topic/temporarily-remove-arm-nightly-jobs-part-2 ci: Temporarily remove arm64 builds -- part II	2024-08-01 23:06:46 +02:00
GabyCT	20a88b6470	Merge pull request #10099 from GabyCT/topic/fixmemo metrics: Update memory tests to use grep -F	2024-08-01 13:48:36 -06:00
Fabiano Fidêncio	aef7da7bc9	tests: k8s: Rotate & cleanup journal for every run This will help to avoid huge logs, and allow us to debug issues in a better way. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-01 21:36:57 +02:00
Fabiano Fidêncio	c5dad991ce	ci: Temporarily remove arm64 builds -- part II Let's remove what we commented out, as publish manifest complains: ``` Created manifest list quay.io/kata-containers/kata-deploy-ci:kata-containers-latest ./tools/packaging/release/release.sh: line 146: --amend: command not found ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-01 20:43:28 +02:00
Fabiano Fidêncio	5ec11afc21	Merge pull request #10111 from fidencio/topic/temporarily-remove-arm-nightly-jobs ci: Temporarily remove arm64 builds	2024-08-01 19:50:07 +02:00
Gabriela Cervantes	7454908690	metrics: Update memory tests to use grep -F This PR updates the memory tests like fast footprint to use grep -F instead of fgrep as this command has been deprecated. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-01 17:20:57 +00:00
Gabriela Cervantes	d72cb8ccfc	tests: kbs: Add missing dependencies to install kbs cli This PR adds missing package dependencies to install kbs cli in a fresh new baremetal environment. This will avoid a failure when trying to run install-kbs-client. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-01 17:09:50 +00:00
Fabiano Fidêncio	bfd014871a	kata-manager: Ensure distro specific TDX config is set We've done something quite similar for kata-deploy, but I've noticed we forgot about the kata-manager counterpart. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-01 17:27:01 +02:00
Fabiano Fidêncio	e9710332e7	ci: Temporarily remove arm64 builds It's been a reasonable time that we're not able to even build arm64 artefacts. For now I am removing the builds as it doesn't make sense to keep running failing builds, and those can be re-enabled once we have arm64 machines plugged in that can be used for building the stuff, and maintainers for those machines. The `arm-jetson-xavier-nx-01` is also being removed from the runners. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-01 13:30:47 +02:00
Fabiano Fidêncio	c784fb6508	Merge pull request #10110 from ChengyuZhu6/bump-trustee version: bump trustee version	2024-08-01 07:34:38 +02:00
ChengyuZhu6	d35320472c	version: bump trustee version Bump trustee to the latest version to fix error with pulling busybox from dockerhub. Fixes: #10109 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-01 08:59:58 +08:00
Fupan Li	230aefc0da	Merge pull request #10070 from BbolroC/qemu-runtime-rs-k8s-s390x GHA: Run k8s e2e tests for qemu-runtime-rs on s390x	2024-07-31 18:41:11 +08:00
Chengyu Zhu	8e9f140ee0	Merge pull request #10080 from ChengyuZhu6/fix-coco-ci tests: add image check before running coco tests	2024-07-31 17:08:00 +08:00
Peng Tao	11e10647f9	Merge pull request #10104 from BbolroC/fix-zvsi-cleanup-s390x gha: Restore cleanup-zvsi for s390x	2024-07-31 16:23:26 +08:00
Chengyu Zhu	fc0f635098	Merge pull request #10101 from AdithyaKrishnan/main ci: Fix rate limit error by migrating busybox_image	2024-07-31 14:48:12 +08:00
ChengyuZhu6	2cfb32ac4d	version: bump nydus snapshotter to v0.13.14 bump nydus snapshotter to v0.13.14 to stabilize CIs. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-31 14:47:33 +08:00
ChengyuZhu6	41b7577f08	tests: add image check before running coco tests Currently, there are some issues with pulling images in CI, such as : https://github.com/kata-containers/kata-containers/actions/runs/10109747602/job/27959198585 This issue is caused by switching between different snapshotters for the same image in some scenarios. To resolve it, we can check existing images to ensure all content is available locally before running tests. Fixes: #10029 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-31 14:47:33 +08:00
Hyounggyu Choi	e135d536c5	gha: Restore cleanup-zvsi for s390x In #10096, a cleanup step for kata-deploy is removed by mistake. This leads to a cleanup error in the following `Complete job` step. This commit restores the removed step to resolve the current CI failure on s390x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-31 06:42:16 +02:00
Adithya Krishnan Kannan	fdf7036d5e	ci: Fix rate limit error by migrating busybox_image Changing the busybox_image from docker to quay to fix rate limit errors. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>	2024-07-30 22:32:22 -05:00
Hyounggyu Choi	c8a160d14a	Merge pull request #10096 from BbolroC/remove-pre-post-action-s390x gha: Eradicate {pre,post}-action steps for s390x runners	2024-07-30 22:30:05 +02:00
Hyounggyu Choi	8d529b960a	gha: Eradicate {pre,post}-action steps for s390x runners As suggested in #9934, the following hooks have been introduced for s390x runners: - ACTIONS_RUNNER_HOOK_JOB_STARTED - ACTIONS_RUNNER_HOOK_JOB_COMPLETED These hooks will perfectly replace the existing {pre,post}-action scripts. This commit wipes out all GHA steps for s390x where the actions are triggered. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-30 17:10:19 +02:00
Wainer Moschetta	528745fc88	Merge pull request #10052 from nubificus/feat_fix_qemu_after_8070 runtime-rs: Fix QEMU backend for runtime-rs	2024-07-30 11:00:14 -03:00
Fupan Li	de22b3c4bf	Merge pull request #10024 from lifupan/main runtime-rs: enable dragonball hypervisor support initrd	2024-07-30 16:00:42 +08:00
Fupan Li	e3f0d2a751	runtime-rs: enable dragonball hypervisor support initrd enable the dragonball support initrd. Fixes: #10023 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-07-30 14:50:24 +08:00
Fupan Li	4fbf9d67a5	Merge pull request #10043 from lifupan/fix_sandbox runtime-rs : fix the issue of stop sandbox	2024-07-29 09:22:26 +08:00
Fabiano Fidêncio	949ffd146a	Merge pull request #10083 from microsoft/danmihai1/policy-tests tests: k8s: minor policy tests clean-up	2024-07-28 11:04:24 +02:00
Dan Mihai	3e348e9768	tests: k8s: rename hard-coded policy test script Rename k8s-exec-rejected.bats to k8s-policy-hard-coded.bats, getting ready to test additional hard-coded policies using the same script. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-26 20:14:05 +00:00
Dan Mihai	7b691455c2	tests: k8s: hard-coded policy for any platform Users of AUTO_GENERATE_POLICY=yes: - Already tested auto-generated policy on any platform. - Will be able to test hard-coded policy too on any platform, after this change. CI continues to test hard-coded policies just on the platforms listed here, but testing those policies locally (outside of CI) on other platforms can be useful too. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-26 19:30:03 +00:00
Dan Mihai	83056457d6	tests: k8s-policy-pod: avoid word splitting Avoid potential word splitting when using the array of command args. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-26 18:55:52 +00:00
Dan Mihai	5546ce4031	Merge pull request #10069 from microsoft/danmihai1/exec-args genpolicy: validate each exec command line arg	2024-07-26 11:39:44 -07:00
Fabiano Fidêncio	b0b04bd2f3	Merge pull request #10078 from fidencio/topic/increase-rootfs-confidential-slash-run-to-50-percent tee: osbuilder: Set /run to use 50% of the image with systemd	2024-07-26 18:37:41 +02:00
Anastassios Nanos	d11657a581	runtime-rs: Remove unused env vars from build Since we can't find a homogeneous value for the resource/cgroup management of multiple hypervisors, and we have decoupled the env vars in the Makefile, we don't need the generic ones. Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2024-07-26 14:03:50 +00:00
Anastassios Nanos	3f58ea9258	runtime-rs: Decouple Makefile env VARS To avoid overriding env vars when multiple hypervisors are available, we add per-hypervisor vars for static resource management and cgroups handling. We reflect that in the relevant config files as well. Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2024-07-26 14:02:35 +00:00
Fabiano Fidêncio	5f146e10a1	osbuilder: Add logs for setting up systemd based stuff This helps us to debug any kind of changes. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-07-26 14:22:45 +02:00
Alex Carter	4a8fb475be	tee: osbuilder: Set /run to use 50% of the image with systemd Let's ensure at least 50% of the memory is used for /run, as systemd by default forces it to be 10%, which is way too small even for very small workloads. This is only done for the rootfs-confidential image. Fixes: kata-containers#6775 Signed-off-by: Alex Carter <Alex.Carter@ibm.com> Signed-off-by: Wang, Arron <arron.wang@intel.com> Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.co Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-07-26 14:22:38 +02:00
Chengyu Zhu	2a9ed19512	Merge pull request #9988 from huoqifeng/annotation initdata: add initdata annotation in hypervisor config	2024-07-26 19:59:45 +08:00
Fupan Li	c51ba73199	container: fix the issue of send signal to process It's better to check the container's status before trying to send a signal to it, since there's no need to send a signal when the container is stopped. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-07-26 19:23:43 +08:00
Fupan Li	e156516bde	sandbox: fix the issue of stop sandbox Since stop sandbox can be called from multiple paths, it's better to set and check the sandbox's state. Fixes: #10042 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-07-26 19:23:34 +08:00
Qi Feng Huo	a113fc93c8	initdata: fix unit test code for initdata annotation Added ut code for initdata annotation Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com>	2024-07-26 18:24:05 +08:00
Qi Feng Huo	8d61029676	initdata: add unit test code for initdata annotation Added ut code for initdata annotation Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com>	2024-07-26 14:20:57 +08:00
Qi Feng Huo	b80057dfb5	initdata: Merge branch 'main' into annotation - Merge branch 'main' into feature branch annotation	2024-07-26 14:01:04 +08:00
Archana Shinde	d7637f93f9	Merge pull request #9899 from amshinde/multiple-networks-fix Fix issue while adding multiple networks with nerdctl	2024-07-25 11:56:27 -07:00
Dan Mihai	a37f10fc87	genpolicy: validate each exec command line arg Generate policy that validates each exec command line argument, instead of joining those args and validating the resulting string. Joining the args ignored the fact that some of the args might include space characters. The older format from genpolicy-settings.json was similar to: "ExecProcessRequest": { "commands": [ "sh -c cat /proc/self/status" ], "regex": [] }, That format will not be supported anymore. genpolicy will detect if its users are trying to use the older "commands" field and will exit with a relevant error message in that case. The new settings format is: "ExecProcessRequest": { "allowed_commands": [ [ "sh", "-c", "cat /proc/self/status" ] ], "regex": [] }, Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-25 16:57:17 +00:00
Dan Mihai	0f11384ede	tests: k8s-policy-pod: exec_command clean-up Use "${exec_command[@]}" for calling both: - add_exec_to_policy_settings - kubectl exec Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-25 16:55:03 +00:00
Dan Mihai	95b78ecaa9	tests: k8s-exec: reuse sh_command variable Reuse sh_command variable instead of repeating "sh". Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-25 16:50:34 +00:00
Alex Lyn	abb0a2659a	Merge pull request #9944 from Apokleos/align-ocispec-rs Align kata oci spec with oci-spec-rs	2024-07-25 19:36:52 +08:00
Alex Lyn	bb2b60dcfc	oci: Delete the kata oci spec It's time to delete the kata oci spec implemented just for kata, as we have already aligned the OCI Spec with oci-spec-rs. Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:47:01 +08:00
Alex Lyn	b56313472b	agent: Align agent OCI spec with oci-spec-rs Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:47:01 +08:00
Alex Lyn	882385858d	runtime-rs: Align oci spec in runtime-rs with oci-spec-rs This commit aligns the OCI Spec implementation in runtime-rs with the OCI Spec definitions and related operations provided by oci-spec-rs. Key changes as below: (1) Leveraged oci-spec-rs to align Kata Runtime OCI Spec with the official OCI Spec. (2) Introduced runtime-spec to separate OCI Spec definitions from Kata-specific State data structures. (3) Preserved the original code logic and implementation as much as possible. (4) Made minor code adjustments to adhere to Rust programming conventions; Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:47:01 +08:00
Alex Lyn	bf813f85f2	runk: Align oci spec with oci-spec-rs Utilized oci-spec-rs to align OCI Spec structures and data representations in runk with the OCI Spec. Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:47:01 +08:00
Alex Lyn	b3eab5ffea	genpolicy: Align agent-ctl OCI Spec with oci-spec-rs Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:47:01 +08:00
Alex Lyn	c500fd5761	agent-ctl: Align agent-ctl OCI Spec with oci-spec-rs This commit aligns the OCI Spec used within agent-ctl with the oci-spec-rs definition and operations. This enhancement ensures that agent-ctl adheres to the latest OCI standards and provides a more consistent and reliable experience for managing container images and configurations. Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:47:01 +08:00
Alex Lyn	faffee8909	libs: update Cargo config and lock file update Cargo.toml and Cargo.lock for adding runtime-spec Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:47:01 +08:00
Alex Lyn	8b5499204d	protocols: Reimplement OCI Spec to TTRPC Data Translation This commit transitions the data implementation for OCI Spec from kata-oci-spec to oci-spec-rs. While both libraries adhere to the OCI Spec standard, significant implementation details differ. To ensure data exchange through TTRPC services, this commit reimplements necessary data conversion logic. This conversion bridges the gap between oci-spec-rs data and TTRPC data formats, guaranteeing consistent and reliable data transfer across the system. Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:46:07 +08:00
Anastassios Nanos	cda00ed176	runtime-rs: Add FC specific KERNELPARAMS To avoid overriding KERNELPARAMS for other hypervisors, add FC-specific KERNELPARAMS. Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2024-07-25 08:53:57 +00:00
Hyounggyu Choi	d8cac9f60b	GHA: Run k8s e2e tests for qemu-runtime-rs on s390x This commit adds a new CI job for qemu-runtime-rs to the existing zvsi Kubernetes test matrix. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-25 08:11:49 +02:00
Alex Lyn	4e003a2125	Merge pull request #10058 from Apokleos/enhance-vsock-connect runtime-rs: enhance debug info for agent connect.	2024-07-25 11:29:04 +08:00
Alex Lyn	36385a114d	runtime-rs: enhance debug info for agent connect. We need more friendly logs for debugging agent connection cases when kata pods fail. Fixes #10057 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 08:51:57 +08:00
Dan Mihai	c3adeda3cc	Merge pull request #10051 from microsoft/danmihai1/exec-variable-reuse tests: k8s: reuse policy exec variable	2024-07-24 14:58:40 -07:00
Aurélien Bombo	f08b594733	Merge pull request #9576 from microsoft/saulparedes/support_env_from genpolicy: Add support for envFrom	2024-07-24 13:39:54 -07:00
GabyCT	79edf2ca7d	Merge pull request #10054 from GabyCT/topic/docnydus docs: Update url links in kata nydus document	2024-07-24 14:08:44 -06:00
Archana Shinde	64d6293bb0	tests:Add nerdctl test for testing with multiple networks Add integration test that creates two bridge networks with nerdctl and verifies that Kata container is brought up while passing the networks created. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-24 10:45:56 -07:00
Archana Shinde	49fbae4fb1	agent: Wait for interface in update_interface For nerdctl and docker runtimes, network is hot-plugged instead of cold-plugged. While this change was made in the runtime, we did not have the agent waiting for the device to be ready. On some systems, the device hotplug could take some time causing the update_interface rpc call to fail as the interface is not available. Add a watcher for the network interface based on the pci-path of the network interface. Note, waiting on the device based on name is really not reliable especially in case multiple networks are hotplugged. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-24 10:45:56 -07:00
Dan Mihai	fecb70b85e	tests: k8s: reuse policy exec variable Share a single test script variable for both: - Allowing a command to be executed using Policy settings. - Executing that command using "kubectl exec". Fixes: #10014 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-24 17:42:04 +00:00
Fabiano Fidêncio	162a6b44f6	Merge pull request #10063 from ChengyuZhu6/fix-ci-timeout gha: Increase timeout to run CoCo tests	2024-07-24 15:14:35 +02:00
Pavel Mores	dd1e09bd9d	runtime-rs: add experimental support for memory hotunplugging to qemu-rs Hotunplugging memory is not guaranteed or even likely to work. Nevertheless I'd really like to have this code in for tests and observation. It shouldn't hurt, from experience so far. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-07-24 13:22:41 +02:00
Pavel Mores	3095b65ac3	runtime-rs: support hotplugging memory in QemuInner The bulk of this implementation is simple though tedious sanity checks, alignment computations and logging. Note that before any hotplugging, we query qemu directly for the current size of hotplugged memory. This ensures that any request to resize memory will be properly compared to the actual already available amount and only the necessary amount will be added. Note also that we borrow checked_next_multiple_of() from CH implementation. While this might look unclean, it's just a rather temporary solution since an equivalent function will apparently be part of std soon, likely the upcoming 1.75. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-07-24 13:22:41 +02:00
Pavel Mores	4a1c828bf8	runtime-rs: support hotplugging memory in Qmp The algorithm is rather simple - we query qemu for existing memory devices to figure out the index of the one we're about to add. Then we add a backend object and a corresponding frontend device. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-07-24 13:22:41 +02:00
Pavel Mores	0e0b146b87	runtime-rs: support storage & retrieval of guest memblock size in qemu-rs This will be used for ensuring that hotplugged memory block sizes are properly aligned. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-07-24 13:22:41 +02:00
Alex Lyn	efb7390357	kata-sys-utils: align OCI Spec with oci-spec-rs Do align oci spec and fix warnings to make clippy happy. Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-24 14:38:48 +08:00
Alex Lyn	012029063c	runtime-spec: Introduce runtime-spec for Container State As part of aligning the Kata OCI Spec with oci-spec-rs, the concept of "State" falls outside the scope of the OCI Spec itself. While we'll retain the existing code for State management for now, to improve code organization and clarity, we propose moving the State-related code from the oci/ dir to a dedicated directory named runtime-spec/. This separation will be completed in subsequent commits with the removal of the oci/ directory. Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-24 14:38:30 +08:00
Zvonko Kaiser	a388d2b8d4	Merge pull request #9919 from zvonkok/ubuntu-dockerfile gpu: rootfs ubuntu build expansion	2024-07-24 08:05:54 +02:00
ChengyuZhu6	2b44e9427c	gha: Increase timeout to run CoCo tests This PR increases the timeout for running the CoCo tests to avoid random failures. These failures occur when the action `Run tests` times out after 30 minutes, causing the CI to fail. Fixes: #10062 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-24 12:31:38 +08:00
GabyCT	b408cc1694	Merge pull request #10060 from GabyCT/topic/fgreptest metrics: Update launch times to use grep -F	2024-07-23 17:23:14 -06:00
Gabriela Cervantes	0e5489797d	docs: Update url links in kata nydus document This PR updates the url links in the kata nydus document. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-23 17:49:12 +00:00
Gabriela Cervantes	3d17a7038a	metrics: Update launch times to use grep -F This PR updates the metrics launch times to use grep -F instead of fgrep as this command has been deprecated. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-23 17:13:52 +00:00
Zvonko Kaiser	941577ab3b	gpu: rootfs ubuntu build expansion For the GPU build we need go/rust and some other helpers to build the rootfs. Always use versions.yaml for the correct and working Rust and golang version Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-07-23 14:31:35 +00:00
Steve Horsman	d69950e5c6	Merge pull request #10053 from stevenhorsman/release-env-var ci: cache: Pass through RELEASE env	2024-07-22 21:53:20 +01:00
Dan Mihai	f26d595e5d	Merge pull request #9910 from microsoft/saulparedes/set_policy_rego_via_env tools: Allow setting policy rego file via	2024-07-22 11:00:30 -07:00
stevenhorsman	66f6ec2919	ci: cache: Pass through RELEASE env In kata-deploy-binaries.sh we want to understand if we are running as part of a release, so we need to pass through the RELEASE env from the workflow, which I missed in https://github.com/kata-containers/kata-containers/pull/9550 Fixes: #9921 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-22 16:39:35 +01:00
Zvonko Kaiser	5765b6e062	Merge pull request #9920 from zvonkok/initrd-builer gpu: rootfs/initrd build init	2024-07-22 15:06:49 +02:00
Zvonko Kaiser	73bcb09232	Merge pull request #9968 from zvonkok/kernel-gpu-dragonball-6.1.x dragonball: kernel gpu dragonball 6.1.x	2024-07-22 13:03:14 +02:00
Zvonko Kaiser	3029e6e849	gpu: rootfs/initrd build init Initramfs expects /init, create symlink only if ${ROOTFS}/init does not exist Init may be provided by other packages, e.g. systemd or GPU initrd/rootfs Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-07-22 10:19:05 +00:00
Saul Paredes	b7a184a0d8	rootfs: Allow AGENT_POLICY_FILE to be an absolute path Don't set AGENT_POLICY_FILE as $script_dir may change Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-21 14:57:41 -07:00
Alex Lyn	67466aa27f	kata-types: do alignment of oci-spec for kata-types Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-21 22:54:43 +08:00
Hyounggyu Choi	c774cd6bb0	Merge pull request #10031 from ChengyuZhu6/fix-log-contain-tdx tests: Fix missing log on TDX	2024-07-20 07:26:08 +02:00
ChengyuZhu6	6ea6e85f77	tests: Re-enable authenticated image tests on tdx Try to re-enable authenticated image tests on tdx. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-20 12:10:02 +08:00
ChengyuZhu6	3476fb481e	tests: Fix missing log on TDX Currently, we have found that `assert_logs_contain` does not work on TDX. We manually located the specific log, but it fails to get the log using `kubectl debug`. The error found in CI is: ``` warning: couldn't attach to pod/node-debugger-984fee00bd70.jf.intel.com-pdgsj, falling back to streaming logs: error stream protocol error: unknown error ``` Upon debugging the TDX CI machine, we found an error in containerd: ``` Attach container from runtime service failed" err="rpc error: code = InvalidArgument desc = tty and stderr cannot both be true" containerID="abc8c7a546c5fede4aae53a6ff2f4382ff35da331bfc5fd3843b0c8b231728bf" ``` We believe this is the root cause of the test failures in TDX CI. Therefore, we need to ensure that tty and stderr are not set to true at same time. Fixes: #10011 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Signed-off-by: Wang, Arron <arron.wang@intel.com>	2024-07-20 12:10:01 +08:00
Steve Horsman	7dd560f07f	Merge pull request #9620 from l8huang/kernel Add kernel config for NVIDIA DPU/ConnectX adapter	2024-07-19 23:16:51 +01:00
Dan Mihai	3127dbb3df	Merge pull request #10035 from microsoft/danmihai1/k8s-credentials-secrets tests: k8s-credentials-secrets: policy for second pod	2024-07-19 12:44:21 -07:00
Saul Paredes	2681fc7eb0	genpolicy: Add support for envFrom This change adds support for the `envFrom` field in the `Pod` resource Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-19 09:53:58 -07:00
GabyCT	be2d4719c2	Merge pull request #10040 from kata-containers/fix_blogbench_midvalues metrics: update avg reference values for blogbench.	2024-07-19 09:51:29 -06:00
Zvonko Kaiser	8eaa2f0dc8	dragonball: Add GPU support Build a GPU flavoured dragonball kernel Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-07-19 14:48:05 +00:00
Dan Mihai	44e443678d	Merge pull request #9835 from microsoft/saulparedes/test_policy_on_sev gha: enable autogenerated policy testing on SEV and SEV-SNP	2024-07-19 07:46:01 -07:00
Greg Kurz	dc97f3f540	Merge pull request #10045 from lifupan/cleanup_container runtime-rs: container: fix the issue of missing cleanup container	2024-07-19 16:36:04 +02:00
Alex Lyn	d0dc67bb96	Merge pull request #8597 from amshinde/vfio-hotplug-support Implement hotplug support for physical endpoints	2024-07-19 13:41:11 +08:00
Lei Huang	20f6979d8f	build: add kernel config for Nvidia DPU/ConnectX adapter With Nvidia DPU or ConnectX network adapter, VF can do VFIO passthrough to guest VM in `guest-kernel` mode. In the guest kernel, the adapter's driver is required to claim the VFIO device and create network interface. Signed-off-by: Lei Huang <leih@nvidia.com>	2024-07-18 22:29:16 -07:00
Fupan Li	8a2f7b7a8c	container: fix the issue of missing cleanup container When container creation fails, the container should be cleaned up so that no device/resource is left behind. Fixes: #10044 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-07-19 11:02:55 +08:00
ms-mahuber	ddff762782	tools: Allow setting policy rego file via environment variable * Set policy file via env var * Add restrictive policy file to kata-opa folder * Change restrictive policy file name * Change relative default path location * Add license headers Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-18 15:05:45 -07:00
David Esparza	60f52a4b93	metrics: update avg reference values for blogbench. This PR updates the Blogbench reference values for read and write operations used in the CI check metrics job. This is due to the update to version 1.2 of blogbench. Fixes: #10039 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-07-18 15:47:14 -06:00
Greg Kurz	fc4357f642	Merge pull request #10034 from BbolroC/hide-repack_secure_image-from-test tests: Call repack_secure_image() in set_metadata_annotation()	2024-07-18 23:03:41 +02:00
Aurélien Bombo	ab6f37aa52	Merge pull request #10022 from microsoft/danmihai1/probes-and-lifecycle genpolicy: container.exec_commands args validation	2024-07-18 12:21:31 -07:00
Steve Horsman	256ab50f1a	Merge pull request #9959 from sprt/fix-ci-cleanup ci: cleanup: Ignore nonexisting resources	2024-07-18 19:23:48 +01:00
David Esparza	1fdc5c1183	Merge pull request #10028 from amshinde/upgrade-blogbench-1.2 metric: Upgrade blogbench to 1.2	2024-07-18 11:30:17 -06:00
Hyounggyu Choi	a7e4d3b738	tests: Call repack_secure_image() in set_metadata_annotation() It is not good practice to call repack_secure_image() from a bats file because the test code might not consider cases where `qemu-se` is used as `KATA_HYPERVISOR`. This commit moves the function call to set_metadata_annotation() if a key includes `kernel_params` and `KATA_HYPERVISOR` is set to `qemu-se`, allowing developers to focus on the test scenario itself. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-18 18:09:45 +02:00
Dan Mihai	035a42baa4	tests: k8s-credentials-secrets: policy for second pod Add policy to pod-secret-env.yaml from k8s-credentials-secrets.bats. Policy was already auto-generated for the other pod used by the same test (pod-secret.yaml). pod-secret-env.yaml was inconsistent, because it was taking advantage of the "allow all" policy built into the Guest image. Sooner or later, CI Guests for CoCo will not get the "allow all" policy built in anymore and pod-secret-env.yaml would have stopped working then. Note that pod-secret-env.yaml continues to use an "allow all" policy after these changes. #10033 must be solved before a more restrictive policy will be generated for pod-secret-env.yaml. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-18 15:03:57 +00:00
Hyounggyu Choi	d2ac01c862	Merge pull request #10032 from BbolroC/fix-image-authenticated-for-s390x tests: Rebuild secure boot image for guest-pull-image-authenticated for IBM SE	2024-07-18 17:00:18 +02:00
Hyounggyu Choi	6e7ee4bdab	tests: Rebuild secure image for guest-pull-image-authenticated on SE Since #9904 was merged, newly introduced tests for `k8s-guest-pull-image-authenticated.bats` have been failing on IBM SE (s390x). The agent fails to start because a kernel parameter cannot pass to the guest VM via annotation. To fix this, the boot image must be rebuilt with updated parameters. This commit adds the rebuilding step in create_pod_yaml_with_private_image() for `qemu-se`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-18 14:56:12 +02:00
Archana Shinde	1636c201f4	network: Implement network hotunplug for physical endpoints Similar to HotAttach, the HotDetach method signature for network endpoints needs to be changed as well to allow for the method to make use of device manager to manage the hot unplug of physical network devices. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-17 16:42:41 -07:00
Archana Shinde	c6390f2a2a	vfio: Introduce function to get vfio dev path This function will be later used to get the vfio dev path. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-17 16:42:41 -07:00
Archana Shinde	1e304e6307	network: Implement hotplug for physical endpoints Enable physical network interfaces to be hotplugged. For this, we need to change the signature of the HotAttach method to make use of Sandbox instead of Hypervisor. Similar approach was followed for Attach method, but this change was overlooked for HotAttach. The signature change is required in order to make use of device manager and receiver for physical network endpoints. Fixes: #8405 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-17 16:42:40 -07:00
Archana Shinde	2fef4bc844	vfio: use driver_override field for device binding. The current implementation for device binding using driver bind/unbind and new_id fails in the scenario when the physical device is not bound to a driver before assigning it to vfio. There exists an updated mechanism to accomplish the same that does not have the same issue as above. The driver_override field for a device allows us to specify the driver for a device rather than relying on the bound driver to provide a positive match of the device. It also has other advantages referenced here: https://patchwork.kernel.org/project/linux-pci/patch/1396372540.476.160.camel@ul30vt.home/ So use the updated driver_override mechanism for binding/unbinding a physical device/virtual function to vfio-pci. Signed-off-by: liangxianlong <liang.xianlong@zte.com.cn> Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-17 16:42:40 -07:00
GabyCT	6aff5f300a	Merge pull request #10021 from GabyCT/topic/fixarchdoc docs: Update devmapper docs	2024-07-17 14:56:40 -06:00
Saul Paredes	57d2ded3e2	gha: enable autogenerated policy testing on SEV-SNP Enable autogenerated policy testing on SEV-SNP Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-17 13:32:06 -07:00
Archana Shinde	30e5e88ff1	metric: Upgrade blogbench to 1.2 Move to blogbench 1.2 version from 1.1. This version includes an important fix for the read_score test which was reported to be broken in the previous version. It essentially fixes this issue here: https://github.com/jedisct1/Blogbench/issues/4 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-17 11:32:09 -07:00
Steve Horsman	e5d5284761	Merge pull request #10026 from wainersm/release_370 release: Bump VERSION to 3.7.0	2024-07-17 18:43:51 +01:00
Wainer dos Santos Moschetta	6f7ab31860	release: Bump VERSION to 3.7.0 On preparation for the 3.7.0 release, bumped the version in VERSION file. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-07-17 14:19:44 -03:00
Saul Paredes	b3cc8b200f	gha: enable autogenerated policy testing on SEV Enable autogenerated policy testing on SEV Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-17 09:55:13 -07:00
Dan Mihai	f31c1b121e	Merge pull request #9812 from microsoft/saulparedes/test_policy_on_tdx gha: enable policy testing on TDX	2024-07-17 08:47:44 -07:00
Dan Mihai	449103c7bf	Merge pull request #10020 from microsoft/danmihai1/pod-security-context tests: fix ps command in k8s-security-context	2024-07-17 08:12:57 -07:00
Fabiano Fidêncio	b7051890af	Merge pull request #9722 from zvonkok/busybox-build deploy: Add busybox target	2024-07-17 13:47:15 +02:00
Steve Horsman	5ce2c1010a	Merge pull request #9904 from stevenhorsman/registry-authentication Support for registry authentication in guest pull	2024-07-17 10:48:38 +01:00
Fupan Li	65f2bfb8c4	Merge pull request #9967 from zvonkok/kernel-dragonball-6.1.x dragonball: kernel dragonball 6.1.x	2024-07-17 14:38:06 +08:00
Dan Mihai	0e86a96157	tests: fix ps command in k8s-security-context 1. Use a container image that supports "ps --user 1000 -f". 2. Execute that command using: sh -c "ps --user 1000 -f" instead of passing additional arguments to sh: sh -c ps --user 1000 -f Fixes: #10019 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-17 01:33:31 +00:00
Dan Mihai	9f4d1ffd43	genpolicy: container.exec_commands args validation Keep track of individual exec args instead of joining them in the policy text. Verifying each arg results in a more precise policy, because some of the args might include space characters. This improved validation applies to commands specified in K8s YAML files using: - livenessProbe - readinessProbe - startupProbe - lifecycle.postStart - lifecycle.preStop Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-17 01:19:23 +00:00
Dan Mihai	b23ea508d5	tests: k8s: container.exec_commands policy tests Add tests for genpolicy's handling of container.exec_commands. These are commands allowed by the policy and originating from these input K8s YAML fields: - livenessProbe - readinessProbe - startupProbe - lifecycle.postStart - lifecycle.preStop Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-17 01:19:00 +00:00
stevenhorsman	567b4d5788	test/k8s: Fix up node logging typo We had a typo in the attestation tests that we've copied around a lot and Wainer spotted it in the authenticated registry tests, so let's fix it up now Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-16 21:39:31 -03:00
stevenhorsman	0015c8ef51	tests: Add guest-pull auth registry tests Add three new test cases for guest pull from an authenticated registry for the following scenarios: _Scenario: Creating a container from an authenticated image, with correct credentials via KBC works_ Given An authenticated container registry quay.io/kata-containers/confidential-containers-auth And a version of kata deployed with a guest image that has an agent with `guest_pull` feature enabled and nydus-snapshotter installed and configured for [guest-pulling](https://github.com/containerd/nydus-snapshotter/blob/main/misc/snapshotter/config-coco-guest-pulling.toml) And a KBS set up to have the correct auth.json for registry quay.io/kata-containers/confidential-containers-auth embedded in the `"Credential"` section of `its resources file` When I create a pod from the container image quay.io/kata-containers/confidential-containers-auth:test Then The pull image works and the pod can start _Scenario: Creating a container from an authenticated image, with incorrect credentials via KBC fails_ Given An authenticated container registry quay.io/kata-containers/confidential-containers-auth And a version of kata deployed with a guest image that has an agent with `guest_pull` feature enabled and nydus-snapshotter installed and configured for [guest-pulling](https://github.com/containerd/nydus-snapshotter/blob/main/misc/snapshotter/config-coco-guest-pulling.toml) And An installed kata CC with the sample_kbs set up to have the auth.json for registry quay.io/kata-containers/confidential-containers-auth embedded in the `"Credential"` resource, but with a dummy user name and password When I create a pod from the container image quay.io/kata-containers/confidential-containers-auth:test Then The pull image fails with a message that reflects that the authorisation failed _Scenario: Creating a container from an authenticated image, with no credentials fails_ Given An authenticated container registry quay.io/kata-containers/confidential-containers-auth And a version of kata deployed with a guest image that has an agent with `guest_pull` feature enabled and nydus-snapshotter installed and configured for [guest-pulling](https://github.com/containerd/nydus-snapshotter/blob/main/misc/snapshotter/config-coco-guest-pulling.toml) And An installed kata CC with no credentials section When I create a pod from the container image quay.io/kata-containers/confidential-containers-auth:test Then The pull image fails with a message that reflects that the authorisation failed Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-16 21:39:31 -03:00
stevenhorsman	eb07f5ef5e	agent: doc: Fix ordering of options - Fix the config options to be back in alphabetical order to be easier to find Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-16 21:39:31 -03:00
stevenhorsman	7cc81ce867	agent: image: Set image-rs auth config If the agent-config has a value for `image_registry_auth`, Then pass this to the image-rs client and enable auth mode too Fixes: #8122 Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-16 21:39:31 -03:00
stevenhorsman	265322990a	agent: config: Add config option to provide auth for guest-pull Add optional config for agent.image_registry_auth, to specify the uri of credentials to be used when pulling images in the guest from an authenticated registry Fixes: #8122 Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com> Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-16 21:39:31 -03:00
Steve Horsman	064b45a2fa	Merge pull request #10016 from wainersm/ibm-se-auth-reg workflows: setup environment to run auth registry tests on s390x	2024-07-16 22:24:39 +01:00
Gabriela Cervantes	d2866081d2	docs: Update devmapper docs This PR updates the devmapper docs by updating the url link for the current containerd devmapper information. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-16 21:07:51 +00:00
GabyCT	2206e2dd5c	Merge pull request #10013 from GabyCT/topic/updatecontdoc docs: Update cri installation guide url in containerd documentation	2024-07-16 14:32:59 -06:00
Wainer dos Santos Moschetta	66c600f8d8	gha: delint the s390x workflow Made run-k8s-tests-on-zvsi.yaml free of warnings by removing: SC2086:info:1:1: Double quote to prevent globbing and word splitting ... SC2086:info:2:1: Double quote to prevent globbing and word splitting ... Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-07-16 15:20:46 -03:00
Wainer dos Santos Moschetta	a98985fab8	gha: export user/password for auth registry tests on s390x Counterpart of commit `d8961cbd4a` for run-k8s-tests-on-zvsi workflow Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-07-16 15:18:40 -03:00
Saul Paredes	af49252c69	gha: enable policy testing on TDX Enable policy testing on TDX Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-15 14:09:49 -07:00
Saul Paredes	0b3d193730	genpolicy: Support cpath for mount sources Add setting to allow specifying the cpath for a mount source. cpath is the root path for most files used by a container. For example, the container rootfs and various files copied from the Host to the Guest when shared_fs=none are hosted under cpath. mount_source_cpath is the root of the paths used as storage mount sources. Depending on Kata settings, mount_source_cpath might have the same value as cpath - but on TDX for example these two paths are different: TDX uses "/run/kata-containers" as cpath, but "/run/kata-containers/shared/containers" as mount_source_cpath. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-15 14:09:49 -07:00
Gabriela Cervantes	e4045ff29a	docs: Update runtime v2 containerd url information This PR updates the runtime v2 containerd url information at containerd documentation. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-15 20:36:17 +00:00
Dan Mihai	bcaf7fc3b4	Merge pull request #10008 from microsoft/danmihai1/runAsUser genpolicy: add support for runAsUser fields	2024-07-15 12:08:50 -07:00
Gabriela Cervantes	9f738f0d05	docs: Update cri installation guide url in containerd documentation This PR updates the cri installation guide url link in the containerd documentation guide as the previous url link does not exist. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-15 16:58:18 +00:00
Dan Mihai	648265d80e	Merge pull request #9998 from microsoft/danmihai1/GENPOLICY_PULL_METHOD tests: k8s: GENPOLICY_PULL_METHOD clean-up	2024-07-15 09:32:29 -07:00
Steve Horsman	02b9fd6e95	Merge pull request #9382 from Xynnn007/feat-encrypt-image Merge to main: supporting pull encrypted images	2024-07-15 15:58:42 +01:00
stevenhorsman	b060fb5b31	tests/k8s: Skip measured rootfs test The only kernel built for measured rootfs was the kernel-tdx-experimental, so this test only ran in the qemu-tdx job. In commit `6cbdba7` we switched all TEE configurations to use the same kernel-confidential, so rootfs measured is disabled for qemu-tdx too now. The VM still fails to boot (because of a different reason...) but the bug in the assert_logs_contain, fixed in this PR was masking the checks on the logs. We still have a few open issues related to measured rootfs and generating the root hash, so let's skip this test that doesn't work until they are looked at Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-15 12:00:50 +01:00
stevenhorsman	2cf94ae717	tests: Add guest-pull encrypted image tests Add three new tests cases for guest-pull of an encrypted image for the following scenarios: _Scenario: Pull encrypted image on guest with correct key works_ Given I have a version of kata deployed with a guest image that has an agent with `guest_pull` feature enabled and nydus-snapshotter installed and configured for guest-pulling And A public encrypted container image i with a decryption key k that is configured as a resource the KBS, so that image-rs on the guest can connect to it When I try and create a pod from i Then The pod is successfully created and runs _Scenario: Cannot pull encrypted image with no decryption key_ Given I have a version of kata deployed with a guest image that has an agent with `guest_pull` feature enabled and nydus-snapshotter installed and configured for guest-pulling And A public encrypted container image i with a decryption key k, that is not configured in a KBS that image-rs on the guest can connect to When I try and create a pod from i Then The pod is not created with an error message that reflects why _Scenario: Cannot pull encrypted image with wrong decryption key_ Given I have a version of kata deployed with a guest image that has an agent with `guest_pull` feature enabled and nydus-snapshotter installed and configured for guest-pulling And A public encrypted container image i with a decryption key k and a different key k' that is set as a resource in a KBS, that image-rs on the guest can connect to When I try and create a pod from i Then The pod is not created with an error message that reflects why Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-15 12:00:50 +01:00
Xynnn007	a56b15112a	agent: add ocicrypt config ocicrypt config is for kata-agent to connect to CDH to request the image decryption key. This value is specified by an env. We use this workaround, the same as in the CCv0 branch. In future, we will consider better ways instead of writing files and setting envs inside inner logic of kata-agent. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-07-15 12:00:50 +01:00
Xynnn007	1072658219	agent: Enable kata-cc-rustls-tls in image-rs - Enable the kata-cc-rustls-tls feature in image-rs, so that it can get resources from the KBS in order to retrieve the registry credentials. - Also bump to the latest image-rs to pick up protobuf fixes - Add libprotobuf-dev dependency to the agent packaging as it is needed by the new image-rs feature - Add extra env in the agent make test as the new version of the anyhow crate has changed the backtrace capture thus unit tests of kata-agent that compares a raised error with an expected one would fail. To fix this, we need only panics to have backtraces, thus set RUST_BACKTRACE=0 for tests due to document https://docs.rs/anyhow/latest/anyhow/ Fixes #9538 Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-15 12:00:50 +01:00
stevenhorsman	3b72e9ffab	tests/k8s: Fix assert_logs_contain The pipe needs adding to the grep, otherwise the grep gets consumed as an argument to `print_node_journal` and run in the debug pod. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-15 12:00:50 +01:00
Hyounggyu Choi	83b3a681f4	Merge pull request #10010 from BbolroC/osbuilder-bump-fedora-to-40 osbuilder: Bump Fedora to 40	2024-07-15 13:00:28 +02:00
Greg Kurz	203d9e7803	Merge pull request #10000 from littlejawa/kata_deploy_add_storage_config_for_crio kata-deploy: add storage configuration for cri-o	2024-07-15 12:29:21 +02:00
Hyounggyu Choi	08d2f6bfe4	osbuilder: Bump Fedora to 40 As Fedora 38 has reached EOL, we are encountering 404 errors for s390x, such as: ``` Status code: 404 for https://dl.fedoraproject.org/pub/fedora-secondary/updates/38/Everything/s390x/repodata/repomd.xml ``` Let's bump the OS to the latest version. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-15 09:58:54 +02:00
Fupan Li	a7179be31d	Merge pull request #9534 from Tim-Zhang/fix-stdin-stuck Fix ctr exec stuck problem	2024-07-15 13:19:19 +08:00
Dan Mihai	dded329d26	tests: k8s: SecurityContext.runAsUser policy test Add test for auto-generating policy for a pod spec that includes the SecurityContext.runAsUser field. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-13 01:37:58 +00:00
Dan Mihai	7040fb8c50	tests: k8s-security-context auto-generated policy Auto-generate the policy in k8s-security-context.bats - previously blocked by lacking support for PodSecurityContext.runAsUser. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-13 01:23:54 +00:00
Dan Mihai	f087044ecb	genpolicy: add support for runAsUser Add ability to auto-generate policy for SecurityContext.runAsUser and PodSecurityContext.runAsUser. Fixes: #8879 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-13 01:10:43 +00:00
Dan Mihai	5282701b5b	genpolicy: add link to allow_user() active issue Improve comment to workaround in rules.rego, to explain better the reason for that workaround. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-13 01:05:58 +00:00
GabyCT	3c0171df3d	Merge pull request #10005 from GabyCT/topic/katadragonball common: Add share fs information for dragonball	2024-07-12 16:10:29 -06:00
Wainer Moschetta	646d7ea4fb	Merge pull request #9951 from BbolroC/enable-attestation-for-ibm-se tests: Enable attestation e2e tests for IBM SE	2024-07-11 16:02:59 -03:00
Hyounggyu Choi	ca80301b4b	Merge pull request #10003 from BbolroC/skip-pod-shared-volume-for-ibm-se k8s: Skip shared-volume relevant tests for IBM SE	2024-07-11 19:29:13 +02:00
Gabriela Cervantes	4477b4c9dc	common: Add share fs information for dragonball This PR adds the share fs information for dragonball using kata-ctl to avoid the failures in runk tests saying that shared_fs is an unbound variable. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-11 17:09:35 +00:00
Dan Mihai	09c5ca8032	tests: k8s: clarify the need to use containerd.sock Modify the permissions of containerd.sock just when genpolicy needs access to this socket, when testing GENPOLICY_PULL_METHOD=containerd. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-11 16:49:58 +00:00
Dan Mihai	c1247cc254	tests: k8s: explain the default containerd settings Explain why the containerd settings on the local machine get set to containerd's defaults when testing GENPOLICY_PULL_METHOD=containerd. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-11 16:49:39 +00:00
Dan Mihai	3b62eb4695	tests: k8s: add comment for GENPOLICY_PULL_METHOD Explain why there are two different methods for pulling container images in genpolicy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-11 16:40:01 +00:00
Dan Mihai	eaedd21277	tests: k8s: use oci-distribution as default value oci-distribution is the value used by run-k8s-tests-on-aks.yaml, so use the same value as default for GENPOLICY_PULL_METHOD in gha-run.sh. The value of GENPOLICY_PULL_METHOD is currently compared just with "containerd", but avoid possible future problems due to using a different default value in gha-run.sh. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-11 16:40:01 +00:00
GabyCT	2056eda5f0	Merge pull request #9922 from GabyCT/topic/updateblogname metrics: Update container name in blogbench test	2024-07-11 10:05:35 -06:00
Hyounggyu Choi	32c3e55cde	k8s: Skip shared-volume relevant tests for IBM SE Currently, it is not viable to share a writable volume (e.g., emptyDir) between containers in a single pod for IBM SE. The following tests are relevant: - pod-shared-volume.bats - k8s-empty-dirs.bats (See: https://github.com/kata-containers/kata-containers/issues/10002) This commit skips the tests until the issue is resolved. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-11 14:09:19 +02:00
Julien Ropé	b83d4e1528	kata-deploy: add storage configuration for cri-o Make sure that the "skip_mount_home" flag is set in cri-o config. Fixes: #9878 Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-07-11 10:11:30 +02:00
Qi Feng Huo	4d66ee1935	initdata: add initdata annotation in hypervisor config - Add Initdata annotation for hypervisor config, so that it can be passed when CreateVM Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com>	2024-07-11 10:56:18 +08:00
GabyCT	dac07239f5	Merge pull request #9974 from squarti/sharedfs runtime: Initialize SharedFS for remote hypervisor	2024-07-10 17:03:00 -06:00
GabyCT	3827b5f9f2	Merge pull request #9982 from ChengyuZhu6/fix-ci tests: Delete test scripts forcibly	2024-07-10 17:00:41 -06:00
Wainer Moschetta	deb4627558	Merge pull request #9975 from niteeshkd/nd_snp_attestation gha: enable SNP attestation	2024-07-10 18:59:05 -03:00
GabyCT	c40b3b4ce7	Merge pull request #9992 from sprt/fix-nydus ci: fix run-nydus tests	2024-07-10 13:56:16 -06:00
David Esparza	be9385342e	Merge pull request #9990 from GabyCT/topic/tdxtimeout gha: Increase timeout to run CoCo TDX tests	2024-07-10 13:21:23 -06:00
Silenio Quarti	8260ce8d15	runtime: Initialize SharedFS for remote hypervisor Sets SharedFS config to NoSharedFS for remote hypervisor in order to start the file watcher which syncs files from the host to the guest VMs. Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-07-10 14:31:25 -03:00
Aurélien Bombo	25e0e2fb35	ci: fix run-nydus tests GH-9973 introduced: * New function get_kata_memory_and_vcpus() in tests/metrics/lib/common.bash. * A call to get_kata_memory_and_vcpus() from extract_kata_env(), which is defined in tests/common.bash. Because the nydus test only sources tests/common.bash, it can't find get_kata_memory_and_vcpus() and errors out. We fix this by moving the get_kata_memory_and_vcpus() call from tests/common.bash to tests/metrics/lib/json.bash so that it doesn't impact the nydus test. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-10 17:19:08 +00:00
Gabriela Cervantes	b6b8524ab7	gha: Increase timeout to run CoCo TDX tests This PR increases the timeout to run the CoCo TDX tests in order to avoid the random failures on TDX saying that The action 'Run tests' has timed out after 30 minutes and making the GHA job fail. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-10 16:06:07 +00:00
Niteesh Dubey	e8a3f8571e	docs: update for SNP attestation This updates how-to document for SNP attestation. Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-07-10 15:06:55 +00:00
Niteesh Dubey	ff04154fdb	gha: enable SNP attestation This removes the code to skip the SNP attestation. Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-07-10 15:06:55 +00:00
Hyounggyu Choi	d94b285189	tests: Enable k8s-confidential-attestation.bats for s390x For running a KBS with `se-verifier` in service, specific credentials need to be configured. (See https://github.com/confidential-containers/trustee/tree/main/attestation-service/verifier/src/se for details.) This commit introduces two procedures to support IBM SE attestation: - Prepare required files and directory structure - Set necessary environment variables for KBS deployment - Repackage a secure image once the KBS service address is determined These changes enable `k8s-confidential-attestation.bats` for s390x. Fixes: #9933 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-10 16:18:37 +02:00
Hyounggyu Choi	5d0f74cd70	local-build: Extract build_secure_image() as a separate library Currently, all functions in `build_se_image.sh` are dedicated to publishing a payload image. However, `build_secure_image()` is now also used for repackaging a secure image when a kernel parameter is reconfigured. This reconfiguration is necessary because the KBS service address is determined after the initial secure image build. This commit extracts `build_secure_image()` from `build_se_image.sh` and creates a separate library, which can be loaded by bats-core. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-10 16:18:37 +02:00
Hyounggyu Choi	bf2f0ea2ca	tests: Change a location for creating key.bin The current KBS deployment creates a file `key.bin` assuming that `kustomization.yaml` is located in `overlays/`. However, this does not hold true when the kustomize config is enabled for multiple architectures. In such cases, the configuration file should be located in `overlays/$(uname -m)`. This commit changes the location for file creation. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-10 16:18:37 +02:00
Hyounggyu Choi	4025ef7193	versions: Bump trustee to multi-arch deployment for KBS As part of the enablement for s390x, KBS should support multi-arch deployment. This commit updates the version of coco-trustee to a commit where the support is implemented. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-10 16:18:37 +02:00
Hyounggyu Choi	856a1f72c6	packaging: Set ATTESTER to se-attester for guest components on s390x This commit allows the guest-components builder to only build se-attester on s390x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-10 16:18:37 +02:00
Xuewei Niu	7f71eac6de	Merge pull request #9868 from l8huang/dan runtime: implement DAN in Go kata-runtime	2024-07-10 19:09:46 +08:00
Alex Lyn	dafff26f01	Merge pull request #9814 from Apokleos/bugfix-pcipath runtime-rs: bugfix for root bus slot allocation	2024-07-10 16:19:06 +08:00
Steve Horsman	aa487307e8	Merge pull request #9962 from GabyCT/topic/removecif scripts: Eliminate CI variable as it is no longer used	2024-07-10 09:02:33 +01:00
Steve Horsman	78bbc51ff0	Merge pull request #9806 from niteeshkd/nd_snp_certs runtime: pass certificates to get extended attestation report for SNP coco	2024-07-10 08:57:45 +01:00
Steve Horsman	29413021e5	Merge pull request #9981 from stevenhorsman/run-k8s-tests-on-zvsi-inherit-secrets gha: make run-k8s-tests-on-zvsi inherit secrets	2024-07-10 08:49:11 +01:00
Lei Huang	171d298dea	runtime: implement DAN in Go kata-runtime The DAN feature has already been implemented in kata-runtime-rs, and this commit brings the same capability to the Go kata-runtime. Fixes: #9758 Signed-off-by: Lei Huang <leih@nvidia.com>	2024-07-10 00:22:30 -07:00
ChengyuZhu6	489afffd8c	tests:gha: delete namespace before resetting namespace Delete the kata-containers-k8s-tests namespace before resetting the namespace to ensure that no deployments or services are restarting and creating pods in the default namespace. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Signed-off-by: Wang, Arron <arron.wang@intel.com>	2024-07-10 12:08:28 +08:00
ChengyuZhu6	e874c8fa2e	tests: Delete test scripts forcibly Forcibly delete test scripts in the `Delete kata-deploy` step before deleting all kata pods. Fixes: #9980 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-10 12:08:28 +08:00
Alex Lyn	806e959b01	runtime-rs: bugfix for device slot allocation failed in dragonball In dragonball Vfio device passthrough scenarios, the first passthrough device is allocated slot 0, which is already occupied by the root device. This causes an error like the following: ``` ... 6: failed to add VFIO passthrough device: NoResource\n 7: no resource available for VFIO device"): unknown ... ``` To address this problem, we no longer pre-allocate a guest device id; instead, dragonball auto-allocates the guest device id and returns it to the runtime. Accordingly, add_device now returns Result<DeviceType>, and the related code is updated. Fixes #9813 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-10 10:59:57 +08:00
Alex Lyn	27947cbb0b	dragonball: make add vfio device return guest device id Fixes #9813 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-10 10:59:51 +08:00
Alex Lyn	fa4af09658	Merge pull request #9985 from GabyCT/topic/fixcrites cri-containerd: Remove use_devmapper variable for cri-containerd tests	2024-07-10 10:13:27 +08:00
Alex Lyn	e4997760f1	Merge pull request #9987 from kata-containers/remove_double_process_check_from_memory_usage_test metrics: Remove duplicate check of processes from memory test.	2024-07-10 10:12:18 +08:00
David Esparza	09f523c815	Merge pull request #9973 from kata-containers/add_memory_and_vcpus_info_to_results Add memory and vcpus info to metrics results	2024-07-09 18:05:07 -06:00
David Esparza	e77d44614b	metrics: Remove duplicate check of processes from memory test. This PR removes the common_init function call from the memory usage script to eliminate a duplicate check that is also done in the init_env function. It also eliminates duplication of nested conditionals. Fixes: #9984 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-07-09 12:34:51 -06:00
Gabriela Cervantes	7061272b4e	kernel: bump kata config version This PR bumps the kata config version as the kernel scripts were modified. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-09 20:04:24 +02:00
Gabriela Cervantes	de848c1458	packaging: Remove CI variable from build kernel script This PR removes the CI variable from the build kernel script; the variable is no longer supported, as it was part of the Jenkins environment. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-09 20:04:24 +02:00
Gabriela Cervantes	28601b51d2	tools: Remove CI variable in kata deploy in docker script This PR removes the CI variable from the kata deploy in docker script; the variable was used in the Jenkins environment, which is no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-09 20:04:24 +02:00
Gabriela Cervantes	f2b8c6619d	makefile: Remove CI variable from local build makefile This PR removes the CI variable from the local build makefile, as it was part of the Jenkins environment, which is no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-09 20:04:24 +02:00
Gabriela Cervantes	4161fa3792	tools: Remove CI variable in test images script for osbuilder This PR removes the CI variable from the test images script for osbuilder, as it was part of the Jenkins environment, which is no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-09 20:04:24 +02:00
Greg Kurz	7506d1ec29	tools: Remove CI variable in test config osbuilder script This PR removes the CI variable from the test config osbuilder script; the variable was used in the Jenkins environment, which is no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> [greg: squash all fixes into a single patch] Signed-off-by: Greg Kurz <groug@kaod.org>	2024-07-09 20:03:08 +02:00
Niteesh Dubey	647dad2a00	gha: skip SNP attestation test Skip the SNP attestation test for now. Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-07-09 17:16:07 +00:00
Niteesh Dubey	e7b4e5e386	gha: add SNP attestation test This tests the attestation of SNP guest. Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-07-09 17:14:26 +00:00
Gabriela Cervantes	1a1e62b968	cri-containerd: Remove use_devmapper variable for cri-containerd tests This PR removes the use_devmapper variable, which was one of the Jenkins environment flags and is no longer supported or available for the cri-containerd tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-09 17:09:55 +00:00
GabyCT	eb0bc5007c	Merge pull request #9976 from sprt/fix-cri-containerd tests: cri-containerd: Ensure Docker isn't present	2024-07-09 11:02:20 -06:00
David Esparza	04df85a44f	metrics: Add num_vcpus and free_mem to metrics results template. This PR retrieves the free memory and the vcpus count from a kata container and includes them in the json results file of any metric. Additionally, this PR parses the requested number of vcpus and the requested amount of memory from the kata configuration file and includes this pair of values in the json results file of any metric. Finally, the file system defined in the kata configuration file is included in the results template. Fixes: #9972 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-07-09 10:29:29 -06:00
David Esparza	a554541495	metrics: Improvement to the description of certain functions. This PR rephrases the description and usage of certain functions, such as: - set_kata_configuration_performance - set_kata_config_file - get_current_kata_config_file - check_if_root - check_ctr_images Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-07-09 10:29:29 -06:00
stevenhorsman	c7cf26fa32	gha: make run-k8s-tests-on-zvsi inherit secrets run-k8s-tests-on-zvsi runs the coco tests and we've added new secrets to provide credentials for the authenticated image testing, so we need to let the zvsi job inherit these from the caller workflow like the rest of the coco tests Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-09 15:29:48 +01:00
Hyounggyu Choi	37b907dfbc	Merge pull request #9859 from BbolroC/set-ocispec-for-vfio-ap tests: Extend vfio-ap hotplug test to use a zcrypttest tool	2024-07-09 14:03:45 +02:00
Steve Horsman	ff498c55d1	Merge pull request #9719 from fitzthum/sealed-secret Support Confidential Sealed Secrets (as env vars)	2024-07-09 09:43:51 +01:00
Niteesh Dubey	529660fafb	runtime: pass certificates for SNP coco This will be used to get extended attestation report. Fixes: #9805 Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-07-09 03:46:00 +00:00
Tim Zhang	704da86e9b	CI: Add tests for stdio Add tests for stdio Signed-off-by: Tim Zhang <tim@hyper.sh>	2024-07-09 11:44:40 +08:00
Tim Zhang	8801554889	runtime-rs: Fix ctr exec stuck problem Fixes: #9532 Instead of calling agent.close_stdin in close_io, we call agent.write_stdin with 0 len data when the stdin pipe ends. Signed-off-by: Tim Zhang <tim@hyper.sh>	2024-07-09 11:44:36 +08:00
Tobin Feldman-Fitzthum	1c2d69ded7	tests: add test for sealed env secrets The sealed secret test depends on the KBS to provide the unsealed value of a vault secret. This secret is provisioned to an environment variable. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-07-08 17:41:20 -05:00
Linda Yu	b4d61f887b	agent: unittest for sealed secret as env in kata To test unsealing secrets stored in environment variables, we create a simple test server that takes the place of the CDH. We start this server and then use it to unseal a test secret. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com> Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-07-08 17:32:45 -05:00
Linda Yu	6003608fe6	agent: support sealed secret as env in kata When sealed-secret is enabled, the Kata Agent intercepts environment variables containing sealed secrets and uses the CDH to unseal the value. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com> Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-07-08 17:31:33 -05:00
Gabriela Cervantes	cf2d5ff4c1	scripts: Fix indentation in QAT run script This PR fixes the indentation of the QAT run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-08 20:23:50 +00:00
Gabriela Cervantes	d53eb61856	QAT: Remove CI variable from QAT run script This PR removes the CI variable from the QAT run script; the variable was used in the Jenkins environment and is no longer used. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-08 20:16:00 +00:00
Gabriela Cervantes	8a79b1449e	tests: Remove CI variable in tracing test This PR removes the CI variable and the instructions related to it, as it was part of the Jenkins environment, which is no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-08 20:12:41 +00:00
Gabriela Cervantes	9d44abb406	tests: Remove CI variable in test agent shutdown This PR removes the CI variable and the instructions related to it; the variable was used in the Jenkins environment and is no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-08 20:10:24 +00:00
Gabriela Cervantes	f2ed8dc568	docs: Remove CI variable from Intel QAT documentation This PR updates the Intel QAT documentation by removing the CI variable, which is no longer supported as it was part of the Jenkins CI environment. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-08 20:05:47 +00:00
Gabriela Cervantes	ff06ef0bbc	scripts: Eliminate CI variable as it is no longer used This PR removes the CI variable, which is no longer used or valid in the kata containers repository. The CI variable was used when we were using Jenkins and script setups that are no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-08 20:00:30 +00:00
GabyCT	cb0fb91bdd	Merge pull request #9966 from GabyCT/topic/fixstability tests: Use variable already defined in metrics common script for stability tests	2024-07-08 13:55:55 -06:00
Aurélien Bombo	e9d6179b28	tests: cri-containerd: Ensure Docker isn't present Following #9960 that transitioned this test to a free runner, we need to ensure Docker isn't installed on the system as that will conflict with the installation of Podman. Example error: https://github.com/kata-containers/kata-containers/actions/runs/9818218975/job/27177785716 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-08 18:50:57 +00:00
Steve Horsman	e8836fafaa	Merge pull request #9828 from stevenhorsman/image-rs-bump-bad84c7 Image rs bump to latest main	2024-07-08 17:07:59 +01:00
Fabiano Fidêncio	67ba0ad0ad	Merge pull request #9971 from GabyCT/topic/fixnerdctldep gha: Fix pip installation for nerdctl GHA	2024-07-06 21:37:55 +02:00
Gabriela Cervantes	724b2c612c	gha: Fix pip installation for nerdctl GHA This PR fixes the pip installation for nerdctl by removing a flag that is no longer supported, avoiding the failure "no such option: --break-system-packages". Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-05 17:31:52 +00:00
stevenhorsman	1d6c1d1621	test: Add journal logging for debug - Due to the error we hit with pulling the agnhost image used in the liveness-probe tests, we want to leave the console printing to help with debug when we next try to bump the image-rs version Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-05 10:25:28 +01:00
stevenhorsman	d511820974	agent: Bump image-rs - Bump the commit of image-rs we are pulling in to 413295415 Note: This is the last commit before a change to whiteout handling was introduced that led to the error `failed to unpack: convert whiteout` when pulling the agnhost:2.21 image Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-05 10:25:28 +01:00
Fabiano Fidêncio	543c90f145	Merge pull request #9695 from ChengyuZhu6/fix-init Fix issues on CI about guest-pull	2024-07-05 11:21:08 +02:00
ChengyuZhu6	65dc12d791	tests: Re-enable k8s-kill-all-process-in-container.bats This test was fixed by previous patches in this PR: kata-containers#9695 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-05 08:10:04 +08:00
ChengyuZhu6	2ea521db5e	tests:tdx: Re-enable k8s-liveness-probes.bats This test was fixed by previous patches in this PR: kata-containers#9695 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-05 08:10:04 +08:00
ChengyuZhu6	93453c37d6	tests: Re-enable k8s-sysctls.bats This test was fixed by previous patches in this PR: kata-containers#9695 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-05 08:10:04 +08:00
ChengyuZhu6	6c5e053dd5	tests: Re-enable k8s-shared-volume.bats This test was fixed by previous patches in this PR: kata-containers#9695 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-05 08:10:04 +08:00
ChengyuZhu6	85979021b3	tests: Re-enable k8s-file-volume.bats This test was fixed by previous patches in this PR: kata-containers#9695 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-05 08:10:04 +08:00
ChengyuZhu6	e71c7ab932	agent/image: Remove functions about merging container spec for guest pull Let me explain why: In our previous approach, we implemented guest pull by passing PullImageRequest to the guest. However, this method resulted in the loss of specifications essential for running the container, such as commands specified in YAML, during the CreateContainer stage. To address this, it is necessary to integrate the OCI specifications and process information from the image’s configuration with the container in guest pull. The snapshotter method does not have this issue. Nevertheless, a problem arises when two containers in the same pod attempt to pull the same image, like InitContainer. This is because the image service searches for the existing configuration, which resides in the guest. The configuration, associated with <image name, cid>, is stored in the directory /run/kata-containers/<cid>. Consequently, when the InitContainer finishes its task and terminates, the directory ceases to exist. As a result, during the creation of the application container, the OCI spec and process information cannot be merged due to the absence of the expected configuration file. Fixes: kata-containers#9665 Fixes: kata-containers#9666 Fixes: kata-containers#9667 Fixes: kata-containers#9668 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-05 08:10:04 +08:00
ChengyuZhu6	c9d1a758cd	agent/image: Reuse the mountpoint in image-rs Currently, the image is pulled by image-rs in the guest and mounted at `/run/kata-containers/image/cid/rootfs`. Finally, the agent rebinds `/run/kata-containers/image/cid/rootfs` to `/run/kata-containers/cid/rootfs` in CreateContainer. However, this process requires specific cleanup steps for these mount points. To simplify, we reuse the mount point `/run/kata-containers/cid/rootfs` and allow image-rs to directly mount the image there, eliminating the need for rebinding. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-05 08:10:04 +08:00
stevenhorsman	05cd1cc7a0	agent: Add CreateContainer support for pre-pulled bundle - Add a check in setup_bundle to see if the bundle already exists and if it does then skip the setup. This commit is cherry-picked from `44ed3ab80e`. The reason that k8s-kill-all-process-in-container.bats failed is that deletion of the directory `/root/kata-containers/cid/rootfs` failed during removing container because it was mounted twice (one in image-rs and one in set_bundle ) and only unmounted once in removing container. Fixes: #9664 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Dave Hay <david_hay@uk.ibm.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-05 08:10:00 +08:00
Zvonko Kaiser	7990d3a154	dragonball: Update kata config version Mandatory update Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-07-04 17:24:16 +00:00
Zvonko Kaiser	cfbca4fe0d	dragonball: Update versions Use the latest guest kernel that we use for all other VMMs Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-07-04 17:24:16 +00:00
Zvonko Kaiser	26446d1edb	dragonball: Update patches After v5.14 there is no cpu_hotplug_begin function; it is now cpus_write_lock. Likewise, cpu_hotplug_done is now cpus_write_unlock. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-07-04 17:23:24 +00:00
Zvonko Kaiser	ad574b7e10	dragonball: Add patches for 6.1.x Ported the 5.10 patches to 6.1.x Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-07-04 17:06:39 +00:00
Gabriela Cervantes	757f37d956	stability: General improvements for soak parallel test This PR has better variable definitions as well as the use of a variable which is already defined in the metrics common script for soak parallel test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-04 16:32:46 +00:00
Gabriela Cervantes	6d56abbdad	stability: General improvements to agent stability test This PR is for better variable definitions as well as the use of the CTR_EXE variable which is already defined in the metrics common script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-04 16:24:27 +00:00
Gabriela Cervantes	3e6c32c3c8	tests: Use variable already defined in stability tests This PR uses the CTR_EXE which is already defined in the metrics common script to have uniformity across the multiple stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-04 16:21:24 +00:00
Steve Horsman	ddb8a94677	Merge pull request #9960 from sprt/fix-garm ci: Transition GARM tests to free runners, pt. I	2024-07-04 09:04:58 +01:00
Biao Lu	6c1a2f01f8	protocols: add support for sealed_secret service To unseal a secret, the Kata agent will contact the CDH using ttRPC. Add the proto that describes the sealed secret service and messages that will be used. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com> Signed-off-by: Biao Lu <biao.lu@intel.com>	2024-07-04 01:03:41 -05:00
Fabiano Fidêncio	49696bbdf2	Merge pull request #9943 from AdithyaKrishnan/nydus-cleanup-timeout tests: Fixes TEE timeout issue	2024-07-03 22:57:17 +02:00
Anastassios Nanos	db75b5f3c4	Merge pull request #8070 from nubificus/feat_add-fc-runtime-rs runtime-rs: firecracker hypervisor backend	2024-07-03 22:29:30 +03:00
Adithya Krishnan Kannan	9250858c3e	tests: Stop trying to patch finalize We have not seen instances of the nydus snapshotter hanging on its deletion that we must patch its finalize. Let's just drop this line for now. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-07-03 12:19:26 -05:00
Dan Mihai	ada53744ea	Merge pull request #9907 from microsoft/saulparedes/allow_empty_env_vars genpolicy: allow some empty env vars	2024-07-03 08:07:23 -07:00
Aurélien Bombo	f18e35014f	ci: Move `run-nerdctl-tests` to free runner See #9940. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-03 14:58:11 +00:00
Aurélien Bombo	c0919d6f45	ci: Move `run-docker-tests` to free runner Removed the Docker installation step as that's preinstalled in free runners. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-03 14:57:59 +00:00
Aurélien Bombo	743a765525	ci: Move `run-runk` to free runner See #9940. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-03 14:57:48 +00:00
Aurélien Bombo	09cce86cc7	ci: Move `run-nydus` to free runner See #9940. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-03 14:57:42 +00:00
Aurélien Bombo	9e1b6064dc	ci: Move `run-containerd-stability` to free runner Removes the Docker installation step as that's preinstalled on the free runner: https://github.com/actions/runner-images/blob/main/images/ubuntu/Ubuntu2204-Readme.md#tools Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-03 14:57:37 +00:00
Aurélien Bombo	6a0e403acf	ci: Move `run-cri-containerd` to free runner See #9940. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-03 14:57:29 +00:00
George Pyrros	2d19f3fbd7	runtime-rs: firecracker hypervisor backend Add a basic runtime-rs `Hypervisor` trait implementation for AWS Firecracker - Add basic hypervisor operations (setup / start / stop / add_device) - Implement AWS Firecracker API on a separate file `fc_api.rs` - Add support for running jailed (include all sandbox-related content) - Add initial device support (limited as hotplug is not supported) - Add separate config for runtime-rs (FC) Notes: - devmapper is the only snapshotter supported - to account for no sharefs support, we copy files in the sandbox (as in the GO runtime) - nerdctl spawn is broken (TODO: #7703) Fixes: #5268 Signed-off-by: George Pyrros <gpyrros@nubificus.co.uk> Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk> Signed-off-by: Charalampos Mainas <cmainas@nubificus.co.uk> Signed-off-by: George Ntoutsos <gntouts@nubificus.co.uk>	2024-07-03 08:30:30 +00:00
GabyCT	e3e3873857	Merge pull request #9954 from GabyCT/topic/sysbenchci metrics: Remove variable in sysbench that is not being used	2024-07-02 16:58:46 -06:00
Aurélien Bombo	eda5d2c623	ci: cleanup: Run every 24 hours instead of 6 hours Resources don't fail to get deleted often enough to need to run every 6 hours. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-02 22:27:58 +00:00
Aurélien Bombo	f20924db24	ci: cleanup: Ignore nonexisting resources Some resource names seem to be lingering in Azure limbo but do not map to any actual resources, so we ignore those. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-02 22:23:54 +00:00
GabyCT	0590aab3e6	Merge pull request #9952 from GabyCT/topic/unitjenkins docs: Remove jenkins reference from unit testing presentation	2024-07-02 15:34:25 -06:00
Aurélien Bombo	33d08a8417	Merge pull request #9825 from microsoft/mahuber/main osbuilder: allow rootfs builds w/o git or version file deps	2024-07-02 09:38:13 -07:00
Steve Horsman	078a1147a6	Merge pull request #9909 from kata-containers/sprt/gha-cleanup-pt2 ci: Add scheduled job to cleanup resources, pt. II	2024-07-02 17:12:03 +01:00
Gabriela Cervantes	b7da1291ea	metrics: Remove variable in sysbench that is not being used This PR removes the CI_JOB variable from the metrics sysbench test; it was used previously but is no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-02 15:29:50 +00:00
Wainer Moschetta	ec695f67e1	Merge pull request #9577 from microsoft/saulparedes/topology genpolicy: add topologySpreadConstraints support	2024-07-02 11:24:26 -03:00
Fabiano Fidêncio	ef3f6515cf	Merge pull request #9941 from sprt/temp-disable-test ci: Temporarily disable kata-deploy and GARM tests	2024-07-02 14:13:46 +02:00
Amulya Meka	dd12089e0d	Merge pull request #9914 from Amulyam24/qemu-fix kata-deploy: fix qemu static build on ppc64le	2024-07-02 10:45:03 +05:30
Saul Paredes	f3f3caa80a	genpolicy: update sample Update pod-one-container.yaml sample Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-01 13:49:08 -07:00
Dan Mihai	75aee526a9	genpolicy: add topologySpreadConstraints support Allow genpolicy to process Pod YAML files including topologySpreadConstraints. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-01 13:32:49 -07:00
Gabriela Cervantes	c270df7a9c	docs: Remove jenkins reference from unit testing presentation This PR removes the Jenkins reference from the unit testing presentation, as Jenkins is no longer supported in the kata containers project. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-01 20:26:35 +00:00
GabyCT	e94490232e	Merge pull request #9949 from cmaf/tests-fix-openvino-help tests: Update help section in openvino test	2024-07-01 13:31:51 -06:00
Gabriela Cervantes	e3318a04f7	metrics: Update container name in blogbench test This PR updates the container name to use a random name instead of a hard-coded one. This is a general improvement to avoid random failures, especially when running on baremetal environments. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-01 19:28:16 +00:00
Fabiano Fidêncio	05848d0c34	Merge pull request #9930 from likebreath/0627/clh_v40.0 Upgrade to Cloud Hypervisor v40.0	2024-07-01 20:04:47 +02:00
Steve Horsman	4fd820abd2	Merge pull request #9947 from stevenhorsman/fix-cleanups-workflow-secret gha: ci: Remove incorrect secrets line	2024-07-01 16:30:37 +01:00
Chelsea Mafrica	0b83c8549a	tests: Update help section in openvino test Test reports that it is a onednn test when it is openvino; update description. Fixes: #9948 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2024-07-01 14:24:50 +00:00
Hyounggyu Choi	795c5dc0ff	tests: Extend vfio-ap hotplug test to use zcrypttest This commit extends the vfio-ap hotplug test to include the use of `zcrypttest`. A newly introduced test by the tool consists of several test rounds as follows: - ioctl_test - simple_test - simple_one_thread_test - simple_multi_threads_test - multi_thread_stress_test - hang_after_offline_online_test A writable root filesystem is required for testing because the reference count needs to be reset after each test round. The current containerd kata containers support does not include `--privileged_without_host_devices`, which is necessary to configure a writable filesystem along with `--privileged`. (Please check out https://github.com/kata-containers/kata-containers/issues/9791 for details) So `crictl` is chosen to extend the test. The commit also includes the removal of old commands previously used for the tests repository but no longer in use. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-01 11:41:59 +02:00
Hyounggyu Choi	5bda197e9d	tests: Add zcrypttest tool to test image Dockerfile This commit copies an internal testing tool `zcrypttest` to the test image. A base image is changed to `ubuntu:22.04` due to a library dependency issue. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-01 11:40:49 +02:00
Hyounggyu Choi	99690ab202	runtime: Instantiate/pass vfio-ap device to ociSpec This commit adds the missing step of passing an attached vfio-ap device to a container via ociSpec. It instantiates and passes a vfio-ap device (e.g. a Z crypto device). A device at `/dev/z90crypt` covers all use cases at the time of writing. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-01 11:40:49 +02:00
Amulyam24	259ec408b5	kata-deploy: fix qemu static build for v8.2.1 on ppc64le Do not install the packages librados-dev and librbd-dev as they are not needed for building static qemu. Add machine option cap-ail-mode-3=off while creating the VM to qemu cmdline. Fixes: #9893 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-07-01 14:56:43 +05:30
stevenhorsman	16130e473c	gha: ci: Remove incorrect secrets line The CI is failing with: ``` Invalid workflow file: .github/workflows/cleanup-resources.yaml#L10 The workflow is not valid. .github/workflows/cleanup-resources.yaml (Line: 10, Col: 5): Unexpected value 'secrets' ``` I think this is because `secrets: inherit` is only applicable when re-using a workflow, not for a standalone job like we have here. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-01 09:32:58 +01:00
Hyounggyu Choi	f0187ff969	Merge pull request #9932 from BbolroC/drop-ci-install-go CI: Eliminate dependency on tests repo	2024-07-01 08:24:28 +02:00
Hyounggyu Choi	f2bfc306a2	Merge pull request #9936 from BbolroC/use-quay-lpine-bash-curl CI: Use multi-arch image for alpine-bash-curl	2024-07-01 08:02:01 +02:00
Manuel Huber	4b2e725d03	rootfs: Install Rust only when necessary For docker-based builds only install Rust when necessary. Further, execute the detect Rust version check only when intending to install Rust. As of today, this is the case when we intend to build the agent during rootfs build. Signed-off-by: Manuel Huber <mahuber@microsoft.com>	2024-06-28 22:19:46 +00:00
Aurélien Bombo	c605fff4c1	ci: Temporarily disable kata-deploy and GARM tests Per the decision taken in the 6/27 AC meeting, this PR temporarily disables kata-deploy and GARM tests until we secure further Azure CI funding. In the meantime, I'll transition the GARM tests to free runners and reenable them to regain that coverage without affecting spending (see #9940). If it turns out the free runners are too slow, we'll switch back to GARM. After funding is secured, we'll reenable the kata-deploy tests (see #9939). Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-06-28 20:23:07 +00:00
Hyounggyu Choi	dd23beeb05	CI: Eliminating dependency on clone_tests_repo() As part of archiving the tests repo, we are eliminating the dependency on `clone_tests_repo()`. The scripts using the function are as follows: - `ci/install_rust.sh` - `ci/setup.sh` - `ci/lib.sh` This commit removes or replaces the files, and makes adjustments accordingly. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-28 14:52:02 +02:00
Hyounggyu Choi	f2c5f18952	CI: Use multi-arch image for alpine-bash-curl A multi-arch image for `alpine-bash-curl` has been pushed to and available at `quay.io/kata-containers`. This commit switches the test image to `quay.io/kata-containers/alpine-bash-curl`. Fixes: #9935 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-28 12:01:53 +02:00
Hyounggyu Choi	0e20f60534	CI: Drop unused scripts The following scripts are not used by the repository any more: - ci/install_go.sh - ci/run.sh - ci/install_vc.sh Additionally, they rely on the tests repo, which is soon to be archived. This commit drops the unused scripts. Fixes: #8507 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-28 07:55:21 +02:00
Archana Shinde	82a1892d34	agent: Add additional info while returning errors for update_interface This should provide additional context for errors while updating network interface. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-06-27 12:56:53 -07:00
Archana Shinde	2127288437	agent: Bring interface down before renaming it. In case we are dealing with multiple interfaces and there exists a network interface with a conflicting name, we temporarily rename it to avoid name conflicts. Before doing this, we need to bring the interface down. Failure to do so results in netlink returning Resource busy errors. The interface also needs to be down for the subsequent operation, when the name is swapped back. This solves the issue of passing multiple networks in case of nerdctl as: nerdctl run --rm --net foo --net bar docker.io/library/busybox:latest ip a Fixes: #9900 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-06-27 12:56:53 -07:00
Zvonko Kaiser	a32b21bd32	Merge pull request #9918 from zvonkok/build-error rootfs: Fix spurious error	2024-06-27 19:46:51 +02:00
Bo Chen	25e3cab028	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v40.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Fixes: #9929 Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-06-27 09:59:00 -07:00
Bo Chen	ad92d73e43	versions: Upgrade to Cloud Hypervisor v40.0 Details of this release can be found in our roadmap project as iteration v40.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #9929 Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-06-27 09:40:13 -07:00
Alex Lyn	d66c214ae7	Merge pull request #9849 from markyangcc/main runtime: fix missing of VhostUserDeviceReconnect parameter assignment	2024-06-27 21:48:37 +08:00
Wainer Moschetta	afc1c1a782	Merge pull request #9896 from fitzthum/bump-gc-090 versions: bump coco guest components and trustee	2024-06-27 09:46:06 -03:00
Zvonko Kaiser	29bb9de864	Merge pull request #9923 from BbolroC/increase-interval-max-tries-kubectl tests: Increase interval and max_tries for kubectl_retry	2024-06-27 09:49:24 +02:00
Hyounggyu Choi	4ec355fb78	tests: Increase interval and max_tries for kubectl_retry Observed instability in the API server after deploying kata-deploy caused test failures. (see: https://github.com/kata-containers/kata-containers/actions/runs/9681494440/job/26743286861) Specifically, `kubectl_retry logs` failed before the API server could respond properly. This commit increases the interval and max_tries for kubectl_retry(), allowing sufficient time to handle this situation. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-27 08:39:22 +02:00
Aurélien Bombo	2c89828749	ci: Add scheduled job to cleanup resources, pt. II Follow-up to #9898 and final PR of this set. This implements the actual deletion logic. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-06-26 17:36:47 +00:00
Zvonko Kaiser	893fd2b59c	Merge pull request #9916 from zvonkok/config-fix gpu: Missing separator	2024-06-26 14:46:47 +02:00
Greg Kurz	fe7ef878d2	Merge pull request #9913 from gkurz/update-kata-ctl-deps kata-ctl: Update Cargo.lock	2024-06-26 14:31:03 +02:00
Zvonko Kaiser	30ec78b19a	rootfs: Fix spurious error In some DMZ'ed or CI systems the repos are not up to date and multistrap fails to find the ubuntu-keyring package. Update the repos to fix this; Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-26 11:10:58 +00:00
Zvonko Kaiser	e0aa54301f	gpu: Missing separator Add the correct separator for replacement Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-26 10:40:35 +00:00
Greg Kurz	ac33a389c0	Merge pull request #9879 from pmores/remove-dependency-on-containerd-bundle-dir-tree runtime-rs: remove attempt to access sandbox bundle from container bu…	2024-06-26 10:57:50 +02:00
Greg Kurz	db7b2f7aaa	kata-ctl: Update Cargo.lock A previous change missed to refresh Cargo.lock. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-06-26 08:27:52 +02:00
Tobin Feldman-Fitzthum	dd8605917b	versions: bump coco guest components and trustee Pick up the changes from the newest version of guest-components and trustee. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-06-25 23:56:18 +00:00
GabyCT	81d23a1865	Merge pull request #9897 from GabyCT/topic/montime tests: Increase timeout to crictl calls on kata monitor tests	2024-06-25 17:27:15 -06:00
Gabriela Cervantes	a8432880f8	tests: Increase timeout to crictl calls on kata monitor tests This PR increases the timeout to crictl calls on kata monitor tests to avoid hitting issues every now and then and to avoid random failures. This PR is very similar to PR #7640. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-25 22:32:47 +00:00
Wainer Moschetta	c4fb6fbda2	Merge pull request #9887 from ldoktor/ci-kata-runtime ci.ocp: Ensure we smoke-test with the right runtime class	2024-06-25 15:27:27 -03:00
Fabiano Fidêncio	fb44edc22f	Merge pull request #9906 from stevenhorsman/TEE-sample-kbs-policy-guards tests: attestation: Restrict sample policy use	2024-06-25 20:27:13 +02:00
Steve Horsman	c9df743dab	Merge pull request #9898 from sprt/gha-cleanup-job ci: Add scheduled job to cleanup resources, pt. I	2024-06-25 19:11:30 +01:00
Saul Paredes	ce19419d72	genpolicy: allow some empty env vars Updated genpolicy settings to allow 2 empty environment variables that may be forgotten to specify (AZURE_CLIENT_ID and AZURE_TENANT_ID) Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-06-25 10:53:05 -07:00
Aurélien Bombo	0582a9c75b	Merge pull request #9864 from 3u13r/feat/genpolicy/layers-cache-file-path genpolicy: allow specifying layer cache file	2024-06-25 10:42:22 -07:00
Aurélien Bombo	d60b548d61	ci: Add scheduled job to cleanup resources This is the first part of adding a job to clean up potentially dangling Azure resources. This will be based on Jeremi's tool from https://github.com/jepio/kata-azure-automation. At first, we'll only clean up AKS clusters, as this is what has been causing us problems lately, but this could very well be extended to cleaning up entire resource groups, which is why I left the different names pretty generic (i.e. "resources" instead of "clusters"). Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-06-25 16:33:03 +00:00
stevenhorsman	7610b34426	tests: attestation: Restrict sample policy use - We only want to enable the sample verifier in the KBS for non-TEE tests, so prevent an edge case where the TEE platform isn't set up correctly and we might fall back to the sample and get false positives. To prevent this we add guards around the sample policy enablement and only run it for non confidential hardware Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-25 16:59:40 +01:00
Steve Horsman	d574d37c4b	Merge pull request #9903 from stevenhorsman/authenticated-regsitry-workflow-secrets workflow: coco: Add auth registry secret	2024-06-25 16:40:46 +01:00
stevenhorsman	d8961cbd4a	workflow: coco: Add auth registry secret - Add the `AUTHENTICATED_IMAGE_USER` and `AUTHENTICATED_IMAGE_PASSWORD` repository secrets as env vars to the coco tests, so we can use them to pull images from an authenticated registry for testing Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-25 11:11:02 +01:00
Alex Lyn	2c5b3a5c20	Merge pull request #9830 from gaohuatao-1/ght/count-rs runtime-rs: fix the bug of func count_files	2024-06-25 15:00:46 +08:00
GabyCT	27d75f93e2	Merge pull request #9872 from GabyCT/topic/varmemin metrics: Improve variable definition in memory inside containers script	2024-06-24 15:30:05 -06:00
Aurélien Bombo	b0cdf4eb0d	Merge pull request #9579 from microsoft/saulparedes/add_seccomp_support genpolicy: ignore SeccompProfile in PodSpec	2024-06-24 08:58:01 -07:00
Wainer Moschetta	bcdc4fde10	Merge pull request #9857 from wainersm/disable_failing_jobs-part2 CI: disable jobs that failed >= 50% on nightly CI recently - part 2	2024-06-24 10:11:05 -03:00
Leonard Cohnen	6a3ed38140	genpolicy: allow specifying layer cache file Add --layers-cache-file-path flag to allow the user to specify where the cache file for the container layers is saved. This allows e.g. to have one cache file independent of the user's working directory. Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2024-06-24 14:53:27 +02:00
Fabiano Fidêncio	3adf9e250f	Merge pull request #9875 from zvonkok/gha-no-sudo-arm64 ci: gha no sudo arm64	2024-06-21 15:28:54 +02:00
Wainer Moschetta	f7e0d6313b	Merge pull request #9865 from wainersm/qemu-coco-dev_updates runtime: updates to qemu-coco-dev configuration	2024-06-21 10:14:30 -03:00
Fabiano Fidêncio	2d552800f2	Merge pull request #9876 from zvonkok/gha-no-sudo-s390x ci: remove sudo from s390x build	2024-06-21 15:00:31 +02:00
Saul Paredes	44afb4aa5f	genpolicy: ignore SeccompProfile in PodSpec Ignore SeccompProfile in PodSpec Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-06-20 09:42:17 -07:00
Dan Mihai	7aeaf2502a	Merge pull request #9856 from microsoft/danmihai1/new-policy-rules genpolicy: reject untested CreateContainer field values	2024-06-20 09:34:53 -07:00
GabyCT	9320c2e484	Merge pull request #9845 from GabyCT/topic/fixartifacts gha: Do not fail when collecting artifacts	2024-06-20 10:15:53 -06:00
Hyounggyu Choi	959a277dc5	Merge pull request #9886 from BbolroC/kernel-config-uv-uapi-s390x kernel: Add CONFIG_S390_UV_UAPI for s390x	2024-06-20 16:05:15 +02:00
Steve Horsman	d5b4da7331	Merge pull request #9881 from stevenhorsman/remote-hypervisor-policy runtime: Support policy in remote hypervisor	2024-06-20 14:01:29 +01:00
Hyounggyu Choi	9cb12dfa88	kernel: Add CONFIG_S390_UV_UAPI for s390x While enabling the attestation for IBM SE, it was observed that a kernel config `CONFIG_S390_UV_UAPI` is missing. This config is required to present an ultravisor in the guest VM. This commit adds the missing config. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-20 13:15:33 +02:00
Lukáš Doktor	b08c019003	ci.ocp: Ensure we smoke-test with the right runtime class we do encourage people to set the KATA_RUNTIME, but it is only used by the webhook. Let's define it in the main `test.sh` and use it in the smoke test to ensure the user-defined runtime is smoke-tested rather than hard-coded kata-qemu one. Related to: #9804 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-06-20 11:15:02 +02:00
Fabiano Fidêncio	0f2a4d202e	Merge pull request #9884 from fidencio/topic/re-enable-tdx-ci ci: tdx: Re-enable TDX CI	2024-06-20 06:39:06 +02:00
GabyCT	02075f73e9	Merge pull request #9874 from GabyCT/topic/fixvarnerdctl tests: nerdctl: Fix variables names and remove network	2024-06-19 13:43:25 -06:00
Fabiano Fidêncio	2bab0f31d7	ci: tdx: Re-enable TDX CI Now, using vanilla kubernetes, let's re-enable the TDX CI and hope it becomes more stable than it used to be. The cleanup-snapshotter is now taking ~4 minutes, and that matches with the other platforms, mainly considering there's a sum of 210 seconds sleep in the process. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-19 20:08:28 +02:00
Greg Kurz	81972f6ffc	Merge pull request #9149 from ryansavino/upgrade-to-qemu-8.2.1 qemu: upgrade to 8.2.4	2024-06-19 19:10:02 +02:00
stevenhorsman	779754dcf6	runtime: Support policy in remote hypervisor Move the `sandbox.agent.setPolicy` call out of the remoteHypervisor if, block, so we can use the policy implementation on peer pods Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-19 16:43:53 +01:00
Fabiano Fidêncio	f9862e054c	Merge pull request #9882 from fidencio/topic/ci-tdx-use-vanilla-k8s ci: tdx: Use vanilla k8s instead of k3s	2024-06-19 17:33:00 +02:00
Pavel Mores	6a4919eeb9	runtime-rs: fix misleading log message get_vmm_master_tid() currently returns an error with the message "cannot get qemu pid (though it seems running)" when it finds a valid QemuInner::qemu_process instance but fails to extract the PID out of it. This condition however in fact means that a qemu child process was running (otherwise QemuInner::qemu_process would be None) but isn't anymore (id() returns None). Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-06-19 17:15:24 +02:00
Pavel Mores	af5492e773	runtime-rs: made Qemu::stop_vm() idempotent Since Hypervisor::stop_vm() is called from the WaitProcess request handling which appears to be per-container, it can be called multiple times during kata pod shutdown. Currently the function errors out on any subsequent call after the initial one since there's no VM to stop anymore. This commit makes the function tolerate that condition. While it seems conceivable that sandbox shouldn't be stopped by WaitProcess handling, and the right fix would then have to happen elsewhere, this commit at least makes qemu driver's behaviour consistent with other hypervisor drivers in runtime-rs. We also slightly improve the error message in case there's no QemuInner::qemu_process instance. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-06-19 17:15:24 +02:00
Pavel Mores	5fbbff9e5e	runtime-rs: remove attempt to access sandbox bundle from container bundle Since no objections were raised in the linked issue (#9847) this commit removes the attempt to derive sandbox bundle path from container bundle path. As described in more detail in the linked issue, this is container runtime specific and doesn't seem to serve any purpose. As for implementation, we hoist the only part of get_shim_info_from_sandbox() that's still useful (getting the socket address) directly into the caller and remove the function altogether. Fixes #9847 Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-06-19 17:09:15 +02:00
Fabiano Fidêncio	7127178acc	ci: tdx: Use vanilla k8s instead of k3s We've noticed a bunch of issues related to deploying and deleting the nydus-snapshotter. As we don't see the same issues on other machines using vanilla kubernetes, let's avoid using k3s for now and follow the flow. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-19 16:56:15 +02:00
Zvonko Kaiser	beab17f765	Merge pull request #9877 from zvonkok/gha-no-sudo-ppc64 ci: gha no sudo ppc64	2024-06-19 14:02:05 +02:00
Zvonko Kaiser	d783ddaf03	ci: Remove not needed chown for ppc64 Now that all artifacts are owned by $USER no extra step needed to adjust ownership Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-19 07:56:45 +00:00
Zvonko Kaiser	5bc37e39d5	ci: remove sudo from ppc64 build We can now do the same for ppc64 that we did for amd64 and remove the sudo cp. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-19 07:55:45 +00:00
Zvonko Kaiser	c341234c0b	ci: remove sudo from s390x build We can now do the same for s390x that we did for amd64 and remove the sudo cp. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-19 07:53:33 +00:00
Zvonko Kaiser	3beb460a97	ci: Remove not needed chown for arm64 Now that all artifacts are owned by $USER no extra step needed to adjust ownership Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-19 07:48:00 +00:00
Zvonko Kaiser	445b389b16	ci: remove sudo from arm64 build We can now do the same for arm64 that we did for amd64 and remove the sudo cp. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-19 07:46:51 +00:00
Gabriela Cervantes	6ec7971f7a	tests: nerdctl: Fix variables names and remove network This PR fixes the variable names for the network that was created and also removes the networks that were created for the tests, to ensure a clean environment when running all the tests and to avoid failures, especially on baremetal environments where the network already exists. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-18 23:00:49 +00:00
Dan Mihai	4df66568cf	genpolicy: reject untested CreateContainer field values Reject CreateContainerRequest field values that are not tested by Kata CI and that might impact the confidentiality of CoCo Guests. This change uses a "better safe than sorry" approach to untested fields. It is very possible that in the future we'll encounter reasonable use cases that will either: - Show that some of these fields are benign and don't have to be verified by Policy, or - Show that Policy should verify legitimate values of these fields These are the new CreateContainerRequest Policy rules: count(input.shared_mounts) == 0 is_null(input.string_user) i_oci := input.OCI is_null(i_oci.Hooks) is_null(i_oci.Linux.Seccomp) is_null(i_oci.Solaris) is_null(i_oci.Windows) i_linux := i_oci.Linux count(i_linux.GIDMappings) == 0 count(i_linux.MountLabel) == 0 count(i_linux.Resources.Devices) == 0 count(i_linux.RootfsPropagation) == 0 count(i_linux.UIDMappings) == 0 is_null(i_linux.IntelRdt) is_null(i_linux.Resources.BlockIO) is_null(i_linux.Resources.Network) is_null(i_linux.Resources.Pids) is_null(i_linux.Seccomp) i_linux.Sysctl == {} i_process := i_oci.Process count(i_process.SelinuxLabel) == 0 count(i_process.User.Username) == 0 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-06-18 18:09:31 +00:00
Wainer Moschetta	cf372f41bf	Merge pull request #9869 from fidencio/topic/disable-tdx-ci ci: tdx: Disable TDX CI	2024-06-18 14:47:38 -03:00
Gabriela Cervantes	671d9af456	metrics: Improve variable definition in memory inside containers script This PR improves the variable definition in memory inside the container script for metrics. This change declares and assigns the variables separately to avoid masking return values. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-18 16:56:12 +00:00
Gabriela Cervantes	eeb467bdc2	gha: Do not fail when collecting artifacts This PR will avoid the failures when collecting artifacts for the gha. This will ensure that we collect and archive system's data for the purpose of debugging. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-18 16:05:23 +00:00
Zvonko Kaiser	b1909e940e	deploy: Add busybox target For a minimal initrd/image build we may want to leverage busybox. This is part number two of the NVIDIA initrd/image build Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-18 15:31:00 +00:00
Wainer Moschetta	36093e86e0	Merge pull request #9863 from wainersm/kata-deploy_yq kata-deploy: always copy ci/install_yq.sh	2024-06-18 10:05:41 -03:00
Fabiano Fidêncio	587f4d45de	ci: tdx: Disable TDX CI TDX CI has been having some issues with the Nydus snapshotter cleanup, which has been getting stuck for hours every now and then. With this in mind, let's disable the TDX CI, so we avoid it blocking the progress of Kata Containers project, and we re-enable it as soon as we have it solved on Intel's side. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-18 10:30:40 +02:00
markyangcc	a28bf266f9	runtime: fix missing of VhostUserDeviceReconnect parameter assignment Commit 'ca02c9f5124e' implements the vhost-user-blk reconnection functionality. However, it missed assigning VhostUserDeviceReconnect when creating the QEMU HypervisorConfig, resulting in VhostUserDeviceReconnect always being set to the default value 0. The real change is this line; most of the other changes are caused by go format: return vc.HypervisorConfig{ // ... VhostUserDeviceReconnect: h.VhostUserDeviceReconnect, }, nil Fixes: #9848 Signed-off-by: markyangcc <mmdou3@163.com>	2024-06-18 12:15:10 +08:00
Alex Lyn	388cd7dde4	Merge pull request #9772 from pmores/add-base-qmp-framework runtime-rs: add base qmp framework	2024-06-18 09:53:28 +08:00
Alex Lyn	275c498dc9	Merge pull request #9834 from lifupan/main sandbox: fix the issue of failed to get the vmm master tid	2024-06-18 08:57:21 +08:00
Alex Lyn	d3fb6bfd35	Merge pull request #9860 from stevenhorsman/tokio-vulnerability-bump Tokio vulnerability bump	2024-06-18 08:35:34 +08:00
Wainer dos Santos Moschetta	bdbee78517	runtime: allow default_{vcpus,memory} annotations to qemu-coco-dev This is a counterpart of commit `abf52420a4` for the qemu-coco-dev configuration. By allowing default_vcpu and default_memory annotations users can fine-tune the VM based on the size of the container image to avoid issues related with pulling large images in the guest. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-17 18:59:52 -03:00
Wainer dos Santos Moschetta	baa8d9d99c	runtime: set shared_fs=none to qemu-coco-dev configuration Just like the TEE configurations (sev, snp, tdx) we want to have the qemu-coco-dev using shared_fs=none. Fixes: #9676 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-17 18:42:46 -03:00
Wainer Moschetta	b8d7a8c546	Merge pull request #9862 from BbolroC/improve-kubectl-retry tests: Use selector rather than pod name for kubectl logs/describe	2024-06-17 18:33:24 -03:00
Hyounggyu Choi	6b065f5609	tests: Use selector rather than pod name for kubectl logs/describe The following error was observed during the deployment of nydus snapshotter: ``` Error from server (NotFound): the server could not find the requested resource ( pods/log nydus-snapshotter-5v82v) 'kubectl logs nydus-snapshotter-5v82v -n nydus-system' failed after 3 tries Error: Process completed with exit code 1. ``` This error can occur when a pod is re-created by a daemonset during the retry interval. This commit addresses the issue by using `--selector` rather than the pod name for `kubectl logs/describe`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-17 22:27:50 +02:00
Wainer Moschetta	7df221a8f9	Merge pull request #9833 from wainersm/qemu-rs_tests tests/k8s: run for qemu-runtime-rs on AKS	2024-06-17 16:59:46 -03:00
Zvonko Kaiser	5f11c0f144	Merge pull request #9861 from zvonkok/release-3.6.0 release: Bump VERSIONS file to 3.6.0	2024-06-17 20:35:29 +02:00
Wainer Moschetta	b6a28bd932	Merge pull request #9786 from microsoft/saulparedes/add_back_insecure_registry_pull genpolicy: add back support for insecure	2024-06-17 15:21:25 -03:00
Wainer Moschetta	68415dabcd	Merge pull request #9815 from msanft/fix/genpolicy/flag-name genpolicy: fix settings path flag name	2024-06-17 15:13:25 -03:00
Wainer dos Santos Moschetta	08eaa60b59	CI: disable all run-kata-deploy-tests-on-garm jobs The following jobs have failed more than 50% on nightly CI. run-kata-deploy-tests-on-garm / run-kata-deploy-tests (clh, k0s) run-kata-deploy-tests-on-garm / run-kata-deploy-tests (clh, rke2) run-kata-deploy-tests-on-garm / run-kata-deploy-tests (qemu, k0s) Instead of removing only those jobs, let's skip the kata-deploy-tests on GARM completely so we can try to fix all the issues (or maybe drop the jobs altogether). Issue: #9854 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-17 14:39:38 -03:00
Steve Horsman	4a41cee534	Merge pull request #9838 from zvonkok/gha-no-sudo CI: remove sudo from GHA	2024-06-17 16:23:39 +01:00
Wainer dos Santos Moschetta	e517167825	kata-deploy: always copy ci/install_yq.sh To build the build-kata-deploy image, ci/install_yq.sh should be copied to tools/packaging/kata-deploy/local-build/dockerbuild as this script will install yq within the image. Currently, if tools/packaging/kata-deploy/local-build/dockerbuild/install_yq.sh exists then make won't copy it again. This can raise problems as, for example, the current update of yq version (commit `c99ba42d`) in ci/install_yq.sh won't force the rebuild of the build-kata-deploy image. Note: this isn't a problem on a fresh dev or CI environment. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-17 12:18:22 -03:00
Zvonko Kaiser	618121a654	release: Bump VERSIONS file to 3.6.0 Let's bump the VERSIONS file and start preparing for a new release of the project. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-17 12:06:46 +00:00
stevenhorsman	53659f1ede	libs: Update tokio dependencies - Bump tokio to 1.38.0 to fix the security vulnerability https://rustsec.org/advisories/RUSTSEC-2024-0019 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-17 13:03:01 +01:00
stevenhorsman	35f6be97df	runtime-rs: Update tokio dependency - Bump tokio to 1.38.0 to fix the security vulnerability https://rustsec.org/advisories/RUSTSEC-2024-0019 If possible it would be good to add the many runtime-rs crates into the runtime-rs workspace and provide a centralised version to avoid the updates in many places. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-17 13:03:01 +01:00
stevenhorsman	3bb1a67d80	agent-ctl: Update rustjail dependencies - Run `cargo update -p rustjail` to pick up rustjail's bump of tokio to 1.38.0 to fix the security vulnerability https://rustsec.org/advisories/RUSTSEC-2024-0019 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-17 13:03:01 +01:00
stevenhorsman	d2d35d2dcc	runk: Update tokio dependencies - Bump tokio to 1.38.0 to fix the security vulnerability https://rustsec.org/advisories/RUSTSEC-2024-0019 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-17 13:03:01 +01:00
stevenhorsman	adda401a8c	genpolicy: Update tokio dependencies - Bump tokio to 1.38.0 to fix the security vulnerability https://rustsec.org/advisories/RUSTSEC-2024-0019 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-17 13:03:01 +01:00
stevenhorsman	b7928f465e	agent: Update tokio dependencies - Bump tokio to 1.38.0 to fix the security vulnerability https://rustsec.org/advisories/RUSTSEC-2024-0019 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-17 13:02:47 +01:00
Zvonko Kaiser	5c2f3f34a8	CI: remove sudo from GHA Now that all artifacts are owned by $USER we can start to remove sudo from our GHA Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-17 11:06:56 +00:00
Steve Horsman	cce735a09e	Merge pull request #9840 from stevenhorsman/bump-agent-rust-1.75.0 versions: Bump rust toolchain	2024-06-17 11:28:07 +01:00
Fupan Li	b218c4bc10	Merge pull request #9836 from lifupan/main_fix sandbox: fix the issue of double initial_size_manager config	2024-06-17 09:15:51 +08:00
Fabiano Fidêncio	9b5dd854db	Merge pull request #9726 from GabyCT/topic/unodeport tests: kbs: Use nodeport deployment from upstream trustee	2024-06-16 22:31:27 +02:00
Wainer dos Santos Moschetta	d4f664b73b	CI: disable run-kata-monitor-tests / run-monitor (containerd, lts) job The job has failed more than 50% on nightly CI. Remove it from the list of execution until we have a fix. Issue: #9853 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-14 16:27:04 -03:00
Wainer dos Santos Moschetta	cbf0b7ca7b	CI: disable run-basic-amd64-tests / run-nerdctl-tests (clh) job The job has failed more than 50% on nightly CI. Remove it from the list of execution until we have a fix. Issue: #9852 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-14 16:17:26 -03:00
Wainer dos Santos Moschetta	562820449e	CI: disable run-basic-amd64-tests / run-vfio (qemu) job The job has failed more than 50% on nightly CI. Remove it from the list of execution until we have a fix. The clh variation was disabled on commit `5f5274e699` so this change will actually result in all the VFIO jobs being disabled. Instead of deleting the entire entry from this workflow yaml (or commenting the entry out), I preferred to use `if: false`, which will make the jobs appear on the UI as skipped. Issue: 9851 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-14 16:09:59 -03:00
GabyCT	4800e242a4	Merge pull request #9832 from GabyCT/topic/fixsets tests: setup: Improve setup script for kubernetes tests	2024-06-14 11:14:05 -06:00
Bo Chen	a68aeca356	Merge pull request #9575 from likebreath/0430/clh_v39.0 versions: Upgrade to Cloud Hypervisor v39.0	2024-06-14 09:10:19 -07:00
stevenhorsman	e23b929ba0	versions: Bump rust toolchain - Bump the rust version used to build the agent to 1.75.0 as agreed on in the AC meeting Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-14 16:45:16 +01:00
stevenhorsman	3fb176970f	dragonball: Fix device manager warning - Fix the lint error: ``` error: you seem to use `.enumerate()` and immediately discard the index --> src/device_manager/mod.rs:427:33 \| 427 \| for (_index, device) in self.virtio_devices.iter().enumerate() { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``` by removing the unnecessary enumerate Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-14 16:45:16 +01:00
stevenhorsman	1ea2671f2f	dragonball: Fix lint with rust 1.75.0 The ci failed with: ``` error: use of `or_insert_with` to construct default value --> src/address_space_manager.rs:650:14 \| 650 \| .or_insert_with(NumaNode::new); \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `or_default()` \| ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-14 16:45:16 +01:00
Steve Horsman	ab8a9882c1	Merge pull request #9818 from EmmEff/fix-spelling runtime: fix minor spelling issues	2024-06-14 13:12:56 +01:00
Steve Horsman	99bf95f773	Merge pull request #9827 from littlejawa/fix_panic_on_metrics_gathering runtime: avoid panic on metrics gathering	2024-06-14 11:12:43 +01:00
Steve Horsman	3eba4211f3	Merge pull request #9843 from microsoft/danmihai1/install_yq ci: fix the expected yq version string	2024-06-14 10:26:21 +01:00
Pavel Mores	380f8ad03f	runtime-rs: add base vCPU hotplugging support We take advantage of the Inner pattern to enable QemuInner::resize_vcpu() take `&mut self` which we need to call non-const functions on Qmp. This runs on Intel architecture but will need to be verified and ported (if necessary) to other architectures in the future. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-06-14 10:13:32 +02:00
Pavel Mores	8231c6c4a3	runtime-rs: instantiate Qmp as (optional) member of QemuInner The QMP_SOCKET_FILE constant in cmdline_generator.rs is made public to make it accessible from QemuInner. This is fine for now however if the constant needs to be accessed from additional places in the future we could consider moving it to somewhere more visible. The Debug impl for Qmp is empty since first, we don't actually want it, it's only forced by Hypervisor trait bounds, and second, it doesn't have anything to display anyway. If Qmp gets any members in the future that can be meaningfully displayed they should be handled by Qmp's Debug::fmt(). Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-06-14 10:13:32 +02:00
Pavel Mores	6fdb262dca	runtime-rs: add Qmp object to encapsulate QMP functionality The constructor handles QMP connection initialisation, too, so there can be non-functional Qmp instance. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-06-14 10:13:32 +02:00
Manuel Huber	62fd84dfd8	build: allow rootfs builds w/o git or VERSION file deps We set the VERSION variable consistently across Makefiles to 'unknown' if the file is empty or not present. We also use git commands consistently for calculating the COMMIT, COMMIT_NO variables, not erroring out when building outside of a git repository. In create_summary_file we also account for a missing/empty VERSION file. This makes e.g. the UVM build process in an environment where we build outside of git with a minimal/reduced set of files smoother. Signed-off-by: Manuel Huber <mahuber@microsoft.com>	2024-06-13 22:46:52 +00:00
Dan Mihai	824287d64a	Merge pull request #9844 from microsoft/danmihai1/k8s-policy-pvc tests: fix yq command line in k8s-policy-pvc	2024-06-13 15:07:15 -07:00
Wainer dos Santos Moschetta	73ab5942fb	tests/k8s: run for qemu-runtime-rs on AKS The following tests are disabled because they fail (alike with dragonball): - k8s-cpu-ns.bats - k8s-number-cpus.bats - k8s-sandbox-vcpus-allocation.bats Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-13 16:20:59 -03:00
Mike Frisch	c2f61b0fe3	runtime: spelling fixes Minor spelling fixes in runtime log messages. Signed-off-by: Mike Frisch <mikef17@gmail.com>	2024-06-13 12:11:34 -04:00
Dan Mihai	56f9e23710	tests: fix yq command line in k8s-policy-pvc Fix the collision between: - https://github.com/kata-containers/kata-containers/pull/9377 - https://github.com/kata-containers/kata-containers/pull/9706 One enabled a newer yq command line format and the other used the older format. Both passed CI because they were not tested together. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-06-13 16:06:15 +00:00
Dan Mihai	23e99e264c	ci: fix the expected yq version string I get: ~/gopath/bin/yq --version yq (https://github.com/mikefarah/yq/) version v4.40.7 Also add support for set -o xtrace to install_yq.sh. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-06-13 15:52:26 +00:00
Ryan Savino	0430794952	qemu: upgrade to 8.2.4 There is a known issue in qemu 7.2.0 that causes kernel-hashes to fail the verification of the launch binaries for the SEV legacy use case. Upgraded to qemu 8.2.4. new available features disabled. Fixes: #9148 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-06-13 10:19:42 -05:00
Greg Kurz	b85b1c1058	Merge pull request #9790 from gkurz/kill-some-dead-runtime-code Kill some dead runtime code	2024-06-13 15:45:51 +02:00
gaohuatao	4cb4e44234	runtime-rs: fix the bug of func count_files When the total number of files observed is greater than limit, return -1 directly. runtime has fixed this bug, it should be ported to runtime-rs. Fixes: #9829 Signed-off-by: gaohuatao <gaohuatao@bytedance.com>	2024-06-13 16:02:33 +08:00
Fupan Li	cd68ef372f	sandbox: fix the issue of double initial_size_manager config It shouldn't call the initial_size_manager's setup_config in the load_config since it had been called in the sandbox's try_init function. Fixes: #9778 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-06-13 15:44:51 +08:00
Fupan Li	61687992f4	sandbox: fix the issue of failed to get the vmm master tid For Kata Containers, the container's pid is meaningless to containerd/crio since the container's pid belongs to the VM, and containerd/crio cannot use it. Thus we just return any tid of the kata shim or hypervisor. But since the hypervisor has been stopped before deleting the container, and the hypervisor's tid cannot be obtained for some supported hypervisors, it is better to return the kata shim's pid instead of the hypervisor's tid. Fixes: #9777 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-06-13 10:27:04 +08:00
Fabiano Fidêncio	56423cbbfe	Merge pull request #9706 from burgerdev/burgerdev/genpolicy-devices genpolicy: add support for devices	2024-06-12 23:03:41 +02:00
Wainer Moschetta	d971e5ae68	Merge pull request #9537 from wainersm/kata-deploy-crio kata-deploy: configuring CRI-O for guest-pull image pulling	2024-06-12 17:27:00 -03:00
Gabriela Cervantes	c36c300fd6	tests: kbs: Use nodeport deployment from upstream trustee This PR uses the nodeport deployment from upstream trustee. To ensure our deployment is as close to upstream trustee replace the custom nodeport handling and replace it with nodeport kustomized flavour from the trustee project. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-12 20:01:59 +00:00
Gabriela Cervantes	0066aebd84	tests: setup: Improve setup script for kubernetes tests This PR makes general improvements like definition of variables and the use of them to improve the general setup script for kubernetes tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-12 19:39:54 +00:00
GabyCT	461b6e7c93	Merge pull request #9821 from GabyCT/topic/fixts metrics: Use function definition to have uniformity	2024-06-12 10:04:28 -06:00
Fabiano Fidêncio	3a0247ed43	Merge pull request #9819 from stevenhorsman/config-envvar-precedence agent: config: Ensure envs take precedence	2024-06-12 11:26:02 +02:00
Julien Ropé	9c86eb1d35	runtime: avoid panic on metrics gathering While running with a remote hypervisor, whenever kata-monitor tries to access metrics from the shim, the shim does a "panic" and no metric can be gathered. The function GetVirtioFsPid() is called on metrics gathering, and had a call to "panic()". Since there is no virtiofs process for remote hypervisor, the right implementation is to return nil. The caller expects that, and will skip metrics gathering for virtiofs. Fixes: #9826 Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-06-12 10:02:44 +02:00
Xuewei Niu	92cc5e0adb	Merge pull request #9781 from gaohuatao-1/ght/shm	2024-06-12 12:39:28 +08:00
Moritz Sanft	84903c898c	genpolicy: fix settings path flag name This corrects the warning to point to the \`-j\` flag, which is the correct flag for the JSON settings file. Previously, the warning was confusing, as it pointed to the \`-p\` flag, which specifies the path for the Rego ruleset. Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>	2024-06-11 21:17:18 +02:00
Greg Kurz	1acf8d0c35	govmm: Drop QEMU's `NoShutdown` knob Code is not used. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-06-11 19:55:54 +02:00
Greg Kurz	cb5b548ad7	govmm: Drop QEMU's `Daemonize` knob Code isn't used anymore. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-06-11 19:55:54 +02:00
Greg Kurz	33eaf69d5f	virtcontainers: Drop QEMU's `Daemonize` knob QEMU isn't started as daemon anymore and this won't change (see #5736 for details). Drop the related code. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-06-11 19:55:54 +02:00
Wainer Moschetta	f66a5b6287	Merge pull request #9807 from wainersm/qemu-rs_kata-deploy kata-deploy: add qemu-runtime-rs runtimeClass	2024-06-11 14:50:01 -03:00
Dan Mihai	d47f40210a	Merge pull request #9808 from microsoft/saulparedes/oci_from_settings genpolicy: load OCI version from settings	2024-06-11 10:42:04 -07:00
Gabriela Cervantes	a96ff49060	metrics: Use function definition to have uniformity This PR uses the function definition to have uniformity across all the launch times script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-11 17:36:08 +00:00
Saul Paredes	3e9d6c11a1	genpolicy: add back support for insecure registries Adding back changes from `77540503f9`. Fixes: #9008 Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-06-11 09:42:23 -07:00
Bo Chen	2398442c58	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v39.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Fixes: #8694, #9574 Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-06-11 09:42:17 -07:00
Bo Chen	7a82894502	versions: Upgrade to Cloud Hypervisor v39.0 This patch upgrades Cloud Hypervisor to v39.0 from v36.0, which contains fixes of several security advisories from dependencies. Details can be found from #9574. Fixes: #8694, #9574 Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-06-11 09:42:16 -07:00
Wainer dos Santos Moschetta	be9990144a	workflow: run kata-deploy tests to qemu-runtime-rs on AKS Start testing the ability of kata-deploy to install and configure the qemu-runtime-rs runtimeClass. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-11 12:58:47 -03:00
Wainer dos Santos Moschetta	4f398cc969	kata-deploy: add qemu-runtime-rs runtimeClass Allow kata-deploy to install and configure the qemu-runtime-rs runtimeClass which ties to qemu hypervisor implementation in rust for the runtime-rs. Fixes: #9804 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-11 12:58:47 -03:00
stevenhorsman	40e02b34cb	agent: config: Ensure envs take precedence - Update the config parsing logic so that when reading from the agent-config.toml file any envs are still processed - Add units tests to formalise that the envs take precedence over values from the command line and the config file Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-11 16:31:10 +01:00
Steve Horsman	59ff40f054	Merge pull request #9811 from mkulke/mkulke/use-kebabcase-for-enum-values-in-config-file-parsing agent: convert enum vals to kebab-case in cfg file	2024-06-11 14:49:30 +01:00
gaohuatao	638e9acf89	runtime: fix the bug of func countFiles When the total number of files observed is greater than limit, return (-1, err). When the returned err is not nil, the func countFiles should return -1. Fixes:#9780 Signed-off-by: gaohuatao <gaohuatao@bytedance.com>	2024-06-11 18:17:18 +08:00
Alex Lyn	1c8db85d54	Merge pull request #9784 from Apokleos/bufix-testcases kata-types: fix bug in kata-types several test cases	2024-06-11 10:01:45 +08:00
Saul Paredes	6a84562c16	genpolicy: load OCI version from settings Load OCI version from genpolicy-settings.json and validate it in rules.rego Fixes: #9593 Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-06-10 15:30:39 -07:00
GabyCT	0c5849b68b	Merge pull request #9809 from microsoft/danmihai1/yq-breaking-change tests: k8s: use newer yq command line format	2024-06-10 16:29:59 -06:00
Wainer Moschetta	ade69e44f9	Merge pull request #9785 from BbolroC/kubectl-retry CI: Introduce retry mechanism for kubectl in gha-run.sh	2024-06-10 18:33:34 -03:00
Magnus Kulke	abc704a720	agent: convert enum vals to kebab-case in cfg file fixes #9810 Add an annotation to the enum values in the agent config that will deserialize them using a kebab-case conversion, aligning the behaviour to parsing of params specified via kernel cmdline. drive-by fix: add config override for guest_component_procs variable Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>	2024-06-10 21:55:05 +02:00
Dan Mihai	32198620a9	tests: k8s: use newer yq command line format Fix the recent collision between: - https://github.com/kata-containers/kata-containers/pull/9377 - https://github.com/kata-containers/kata-containers/pull/9725 One enabled a newer yq command line format and the other used the older format. Both passed CI because they were not tested together. Fixes: #9789 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-06-10 18:48:25 +00:00
Dan Mihai	079a0a017c	Merge pull request #9557 from portersrc/ci-debug-output-nydus-pod CI: describe pod on k8s-create-pod wait failure	2024-06-10 08:17:54 -07:00
Ryan Savino	84280115f6	Merge pull request #9151 from niteeshkd/nd_snp_kernel_hashes runtime: enable kernel-hashes for SNP confidential container	2024-06-07 18:19:51 -05:00
GabyCT	03bcc167a4	Merge pull request #9779 from GabyCT/topic/fixcoscript tests: Fix indentation in common script	2024-06-07 15:37:10 -06:00
Wainer Moschetta	7a28535277	Merge pull request #9800 from fidencio/topic/ci-tdx-re-enable-some-of-the-tests ci: tdx: Re-enable a bunch of volume related tests	2024-06-07 16:17:19 -03:00
Hyounggyu Choi	8ff128dda8	CI: Introduce retry mechanism for kubectl in gha-run.sh Frequent errors have been observed during k8s e2e tests: - The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? - Error from server (ServiceUnavailable): the server is currently unable to handle the request - Error from server (NotFound): the server could not find the requested resource These errors can be resolved by retrying the kubectl command. This commit introduces a wrapper function in common.sh that runs kubectl up to 3 times with a 5-second interval. Initially, this change only covers gha-run.sh for Kubernetes. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-07 18:24:19 +02:00
Fabiano Fidêncio	81c221c1b4	ci: k8s: tdx: Re-enable volume tests It seems I was very loose on disabling some of the tests, and the issues I faced could be related to other instabilities in the CI. Let's re-enable this one, following what was done for the SEV, SNP, and coco-qemu-dev. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-07 18:13:36 +02:00
Fabiano Fidêncio	9db9d35198	ci: k8s: tdx: Re-enable projected-volume tests It seems I was very loose on disabling some of the tests, and the issues I faced could be related to other instabilities in the CI. Let's re-enable this one, following what was done for the SEV, SNP, and coco-qemu-dev. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-07 18:12:36 +02:00
Fabiano Fidêncio	f6a6cba8ca	ci: k8s: tdx: Re-enable nested-configmap-secret tests It seems I was very loose on disabling some of the tests, and the issues I faced could be related to other instabilities in the CI. Let's re-enable this one, following what was done for the SEV, SNP, and coco-qemu-dev. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-07 18:12:06 +02:00
Fabiano Fidêncio	957d0cccf6	ci: k8s: tdx: Re-enable inotify tests It seems I was very loose on disabling some of the tests, and the issues I faced could be related to other instabilities in the CI. Let's re-enable this one, following what was done for the SEV, SNP, and coco-qemu-dev. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-07 18:10:39 +02:00
Fabiano Fidêncio	fc6f662ae0	ci: k8s: tdx: Re-enable credentials-secrets tests It seems I was very loose on disabling some of the tests, and the issues I faced could be related to other instabilities in the CI. Let's re-enable this one, following what was done for the SEV, SNP, and coco-qemu-dev. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-07 18:08:29 +02:00
Fabiano Fidêncio	5741c6d3e6	Merge pull request #9768 from fidencio/topic/ci-tdx-enable-cdh-test ci: kbs: Enable CDH tests for TDX	2024-06-07 17:59:12 +02:00
Greg Kurz	afeb98d73f	Merge pull request #9782 from ldoktor/ci-centos-9 ci.ocp: Switch base to centos-9	2024-06-07 13:15:02 +02:00
Fabiano Fidêncio	fde457589e	ci: kbs: tdx: Enable basic attestation tests Let's stop skipping the CDH tests for TDX, as now we should have an environment where it can run and should pass. :-) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-07 12:18:50 +02:00
Fabiano Fidêncio	cac525059e	ci: kbs: tdx: Use the hostname ip instead of localhost for the PCCS We must ensure we use the host ip to connect to the PCCS running on the host side, instead of using localhost (which has a different meaning from inside the KBS pod). The reason we're using `hostname -i` instead of the helper functions is that the helper functions need the coco-kbs deployed for them to work, and what we do happens before the deployment. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-07 12:18:07 +02:00
Alex Lyn	27685c91e5	kata-types: fix bug in kata-types several test cases (1) Misuse of cap.set caused the previous Caps to be lost, which made assert! fail; replace cap.set with cap.add. (2) update_config_by_annotation returns an error if the runtime name is not set: { ... if config.runtime.name.is_empty() { return Err(io::Error::new( io::ErrorKind::InvalidData, "Runtime name is missing in the configuration", )); } ... } Fixes #9783 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-06-07 09:16:23 +08:00
David Esparza	822c641b58	Merge pull request #9760 from amshinde/kata-manager-link-runc kata-manager: Add symlinks for runc and slirp4netns	2024-06-06 12:55:57 -06:00
Lukáš Doktor	699376c535	ci.ocp: Switch base to centos-9 Centos8 is EOL and repos are not available anymore. Centos9 contains the same packages and should do well as a base for testing. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-06-06 09:03:17 +02:00
Chris Porter	4172ccb3a0	CI: describe pod on k8s-create-pod wait failure This is generally useful debug output on test failures, and specifically this has been useful for nydus-related issues recently. Signed-off-by: Chris Porter <porter@ibm.com>	2024-06-05 12:37:53 -04:00
Gabriela Cervantes	264c7e9473	tests: Fix indentation in common script This PR fixes the indentation in common script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-05 15:52:40 +00:00
Niteesh Dubey	1dbf5208ac	versions: Upgrade ovmf This is required to support SEV-SNP confidential container with kernel-hashes. Since this ovmf is the latest stable version, it is good to upgrade for tdx and vanilla builds too. Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-06-05 15:02:02 +00:00
Niteesh Dubey	62d3d7c58f	runtime: enable kernel-hashes for SNP confidential container This is required to provide the hashes of kernel, initrd and cmdline needed during the attestation of the coco. Fixes: #9150 Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-06-05 15:02:02 +00:00
Steve Horsman	b30d085271	Merge pull request #9702 from ildikov/blog-submission-guide docs: Adding blog submission guidelines	2024-06-05 09:03:19 +01:00
Amulya Meka	b323afeda9	Merge pull request #9214 from Amulyam24/oras kata-deploy: install oras using release artefacts on ppc64le	2024-06-05 11:40:55 +05:30
Fabiano Fidêncio	138ef2c55f	Merge pull request #9678 from AdithyaKrishnan/main TEEs: Skip a few CI tests for SEV/SNP	2024-06-04 23:42:51 +02:00
GabyCT	ba30f0804a	Merge pull request #9770 from GabyCT/topic/fixvad tests: Use variable definition for better uniformity	2024-06-04 15:23:34 -06:00
Wainer dos Santos Moschetta	af4f9afb71	kata-deploy: add PULL_TYPE handler for CRI-O A new PULL_TYPE environment variable is recognized by the kata-deploy's install script to allow it to configure CRI-O for guest-pull image pulling type. The tests/integration/kubernetes/gha-run.sh change allows for testing it: ``` export PULL_TYPE=guest-pull cd tests/integration/kubernetes ./gha-run.sh deploy-k8s ``` Fixes #9474 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-04 14:02:01 -03:00
GabyCT	6c2e8bed77	Merge pull request #9725 from 3u13r/feat/genpolicy/filter-by-runtime genpolicy: add ability to filter for runtimeClassName	2024-06-04 10:06:14 -06:00
Hyounggyu Choi	869f89c338	Merge pull request #9773 from BbolroC/use-qemu-coco-dev-s390x GHA: Use qemu-coco-dev for k8s nydus test on s390x	2024-06-04 17:49:38 +02:00
Gabriela Cervantes	cafba23f3e	tests: Use variable definition for better uniformity This PR replaces the name to use a variable that is already defined to have a better uniformity across the general script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-04 15:49:27 +00:00
Wainer Moschetta	2b8cdd9ff2	Merge pull request #9765 from wainersm/disable_failing_jobs CI: disable jobs that failed > 50% on nightly CI recently - part 1	2024-06-04 12:05:36 -03:00
Hyounggyu Choi	246ee83768	GHA: Use qemu-coco-dev for k8s nydus test on s390x In line with the changes for x86_64, the k8s nydus test for s390x should also use `qemu-coco-dev` for `KATA_HYPERVISOR`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-04 15:49:23 +02:00
Hyounggyu Choi	3aff6c5bd8	CI: Retry fetching node_start_time when it is empty It was observed that the `node_start_time` value is sometimes empty, leading to a test failure. This commit retries fetching the value up to 3 times. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-04 15:41:15 +02:00
Zvonko Kaiser	647560539f	Merge pull request #9769 from zvonkok/initrd-image-no-sudo ci: remove sudo and make sure artifacts is owned by user	2024-06-04 07:16:51 +02:00
Wainer Moschetta	b5561074c3	Merge pull request #9377 from beraldoleal/yqbump deps: bumping yq to v4.40.7	2024-06-03 14:34:58 -03:00
Ildiko Vancsa	5e03bec26b	docs: Adding blog submission guidelines The Kata blog was recently moved to the project's website. The content of the blog is stored together with the rest of the website source on GitHub. This patch adds a short guide that describes how to submit a new blog post as a PR, to appear on the project's website. Signed-off-by: Ildiko Vancsa <ildiko.vancsa@gmail.com>	2024-06-03 08:58:05 -07:00
GabyCT	6c7affbd85	Merge pull request #9741 from GabyCT/topic/staticcheck tests: Fix indentation in static checks script	2024-06-03 09:43:23 -06:00
Zvonko Kaiser	a48c084e13	ci: remove sudo and make sure image is owned by user The image build needs special handling since we're doing a lot of privileged operations. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-03 15:29:06 +00:00
Fabiano Fidêncio	34d45f0868	Merge pull request #9749 from mkulke/mkulke/configure-guest-components-spawning CoCo: introduce config for guest-components procs	2024-06-03 15:50:36 +02:00
Ryan Savino	72dc823059	tests: k8s: sev: snp: skip "setting sysctl" test This test fails when using `shared_fs=none` with the nydus snapshotter. Issue tracked here: #9666 Skipping for now. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-06-03 01:14:17 -05:00
Ryan Savino	3f3be54893	tests: k8s: sev: snp: skip initContainers shared vol test This test is failing due to the initContainers not being properly handled with the guest image pulling. Issue tracked here: #9668 Skipping for now. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-06-03 01:14:17 -05:00
Ryan Savino	35dfb730ce	tests: k8s: sev: snp: skip "kill all processes in container" test This test fails when using `shared_fs=none` with the nydus snapshotter. Issue tracked here: #9664 Skipping for now. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-06-03 01:14:16 -05:00
Ryan Savino	62cc1dec4c	tests: replace docker debug alpine image with ghcr docker alpine latest image is rate limited. Need to use ghcr.io image. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-06-03 01:14:16 -05:00
ChengyuZhu6	1820b02993	tests: replace busybox from docker with quay in guest pull To prevent download failures caused by high traffic to the Docker image, opt for quay.io/prometheus/busybox:latest over docker.io/library/busybox:latest . Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-06-03 01:14:16 -05:00
Ryan Savino	6c646dc96d	tests: k8s: sev: snp: add runtime annotation for sev and snp sev and snp cases added to the KATA_HYPERVISOR switch. Signed-off-by: Ryan Savino <ryan.savino@amd.com>	2024-06-03 01:14:16 -05:00
Ryan Savino	6db08ed620	runtime: sev: snp: Use shared_fs=none Disabling 9p for SEV and SNP TEEs. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-06-03 01:14:16 -05:00
Ryan Savino	668959408d	tests: ensure kata_deploy cleanup even if namespace deletion fails the test cluster namespace deletion failing causes kata_deploy to not get cleaned up. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-06-03 01:14:15 -05:00
Wainer dos Santos Moschetta	c9f93fc507	github: add actionlint configuration file Added configuration file with rules to exclude some self-hosted runners from the linter warnings. Related-with: #9646 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-31 19:46:09 -03:00
Wainer dos Santos Moschetta	5f5274e699	CI: disable run-basic-amd64-tests / run-vfio (clh) job The job has failed more than 50% on nightly CI. Remove it from the list of execution until we have a fix. Issue: 9764 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-31 19:34:45 -03:00
Wainer dos Santos Moschetta	9154ce9051	CI: disable run-basic-amd64-tests / run-tracing jobs These jobs have failed more than 50% on nightly CI. Remove them from the list of execution until we have a fix. Issue: 9763 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-31 19:26:58 -03:00
Wainer dos Santos Moschetta	ac4d48ad17	CI: disable run-kata-monitor-tests / run-monitor (qemu, containerd) job This job has failed more than 50% on nightly CI. Remove it from the list of execution until we have a fix. Issue: 9761 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-31 19:21:21 -03:00
Archana Shinde	7a3e13fae8	kata-manager: Add symlinks for runc and slirp4netns For nerdctl install, add symlinks for runc and slirp4netns in the binary install path. The runc link comes in handy for running runc containers with nerdctl for quick tests. slirp4netns allows for running containers with user mode networking, useful in case of rootless containers. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-05-31 13:53:42 -07:00
Markus Rudy	13310587ed	genpolicy: check requested devices CreateContainerRequest objects can specify devices to be created inside the guest VM. This change ensures that requested devices have a corresponding entry in the PodSpec. Devices that are added to the pod dynamically, for example via the Device Plugin architecture, can be allowlisted globally by adding their definition to the settings file. Fixes: #9651 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-05-31 22:05:49 +02:00
Wainer Moschetta	f093c4c190	Merge pull request #9754 from wainersm/qemu_coco_dev-enable_policy_tests tests/k8s: enable policy tests for qemu-coco-dev	2024-05-31 15:09:25 -03:00
Markus Rudy	ea578f0a80	genpolicy: add support for VolumeDevices This adds structs and fields required to parse PodSpecs with VolumeDevices and PVCs with non-default VolumeModes. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-05-31 19:34:14 +02:00
Beraldo Leal	d3a5eb299a	tools: bumping kernel config version Lets make ci happy. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Beraldo Leal	53b8158a81	tests: adding debug and skip to kata-deploy If a test is failing during setup, it makes little sense to run the suite. Let's skip and add some debug messages. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Beraldo Leal	9171821d57	tests: add debug message to check return code Lets add this message to make sure sh is starting properly. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Beraldo Leal	f91fbef184	tests: increase time after sh execution Increased sleep duration to ensure the shell process starts. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Beraldo Leal	ba5d2e54c2	tests: remove object separation mark from eof End of file should not end with --- mark. This will confuse tools like yq and kubectl that might think this is another object. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Beraldo Leal	3e8b4806b8	tests: increase debug messages for kata-deploy When the timeout happens we can't tell much information about the nodes. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Beraldo Leal	c99ba42d62	deps: bumping yq to v4.40.7 Since yq frequently updates, let's upgrade to a version from February to bypass potential issues with versions 4.41-4.43 for now. We can always upgrade to the newest version if necessary. Fixes #9354 Depends-on:github.com/kata-containers/tests#5818 Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Beraldo Leal	4f6732595d	ci: skip go version check golang.mk is not ready to deal with non GOPATH installs. This is breaking test on s390x. Since previous steps here are installing go and yq our way, we could skip this additional check. A full refactor to golang.mk would be needed to work with different paths. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Greg Kurz	7886ed6670	Merge pull request #9751 from wainersm/k8s_print_logs_on_fail tests/k8s: print logs on fail only (k8s-confidential-attestation.bats)	2024-05-31 14:47:27 +02:00
Fabiano Fidêncio	44df674232	Merge pull request #9757 from fidencio/topic/ci-tdx-skip-empty-dir-tests ci: k8s: Skip empty dir tests also for TDX	2024-05-31 13:18:35 +02:00
Magnus Kulke	9f04dc4c8b	agent: introduce config for coco attestation procs fixes #9748 A configuration option `guest_component_procs` has been introduced that indicates which guest component processes are supposed to be spawned by the agent. The default behaviour remains that all of those processes are actively spawned by the agent. At the moment this is based on presence of binaries in the rootfs and the guest_component_api_rest option. The new option is incremental: none -> attestation-agent -> confidential-data-hub -> api-server-rest, e.g. api-server-rest implies attestation-agent and confidential-data-hub. The `none` option has been removed from guest_component_api_rest, since this is addressed by the introduced option. To not change expected behaviour for non-coco guests we will still only attempt to spawn the processes if the requested attestation binaries are present on the rootfs, and issue a warning in those cases. Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>	2024-05-31 12:15:41 +02:00
Amulyam24	eadcb868f4	kata-deploy: install oras using release artefacts on ppc64le We are currently building Oras from source on ppc64le. Now that they officially release the artefacts for power, consume them to install Oras. Fixes: #9213 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-05-31 14:16:14 +05:30
Zvonko Kaiser	0321a3adcc	Merge pull request #8944 from zvonkok/update-threat-model threat-model: Add VFIO, ACPI and KVM/VMM threat-model descriptions	2024-05-31 10:38:27 +02:00
Fabiano Fidêncio	03a7cf4b02	ci: k8s: Skip empty dir tests also for TDX Wainer noticed this is failing for the coco-qemu-dev case, and decided to skip it, notifying me that he didn't fully understand why it was not failing on TDX. Turns out, though, this is also failing on TDX, and we need to skip it there as well. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-31 09:59:46 +02:00
Fabiano Fidêncio	72a71ff2bf	Merge pull request #9737 from zvonkok/kata-deploy-no-sudo ci: kata-deploy no sudo	2024-05-31 09:55:24 +02:00
Zvonko Kaiser	dd89d35b75	Merge pull request #9747 from zvonkok/remove-git-config ci: Remove all git config safe.directory	2024-05-31 07:25:28 +02:00
Leonard Cohnen	1d1690e2a4	genpolicy: add ability to filter for runtimeClassName Add the CLI flag --runtime-class-names, which is used during policy generation. For resources that can define a runtimeClassName (e.g., Pods, Deployments, ReplicaSets,...) the value must have any of the --runtime-class-names as prefix, otherwise the resource is ignored. This allows to run genpolicy on larger yaml files defining many different resources and only generating a policy for resources which will be deployed in a confidential context. Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2024-05-31 03:17:02 +02:00
Wainer dos Santos Moschetta	3333f8ddfd	tests/k8s: enable policy tests for qemu-coco-dev So qemu-coco-dev is on par with the TEE configurations. Fixes: #9753 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-30 21:51:15 -03:00
Wainer Moschetta	83fa813700	Merge pull request #9694 from wainersm/qemu_coco_dev-k8s-guest-pull tests: enable guest-pull on all k8s tests for the qemu-coco-dev configuration	2024-05-30 21:48:11 -03:00
Wainer dos Santos Moschetta	55ae98eb28	tests/k8s: print logs on fail only (k8s-confidential-attestation.bats) Use the variable BATS_TEST_COMPLETED which is defined by the bats framework when the test finishes. `BATS_TEST_COMPLETED=` (empty) means the test failed, so the node syslogs will be printed only at that condition. Fixes: #9750 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-30 17:19:33 -03:00
Wainer Moschetta	66e3b88694	Merge pull request #9746 from wainersm/nydus_snapshotter_pin ci: pin the nydus-snapshotter image version	2024-05-30 16:49:10 -03:00
Wainer dos Santos Moschetta	3e18fe7805	tests/k8s: skip file volume tests for qemu-coco-dev This test fails with qemu-coco-dev configuration and guest-pull image pull. Issue: #9667 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-30 14:50:59 -03:00
Zvonko Kaiser	063db516f2	ci: Remove all git config safe.directory Now with the sudo less build we should be good to remove those hacks. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-30 15:12:28 +00:00
Zvonko Kaiser	d8889684f0	ci: kata-deploy no sudo Build/push/manage artifacts without sudo Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-30 15:07:27 +00:00
Wainer dos Santos Moschetta	5faf9ca344	ci: pin the nydus-snapshotter image version It's cloning the nydus-snapshotter repo from the version specified in versions.yaml, however, the deployment files are set to pull in the latest version of the snapshotter image. With this version we are pinning the image version too. This is a temporary fix as it should be better worked out at nydus-snapshotter project side. Fixes: #9742 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-30 11:21:16 -03:00
Greg Kurz	b3cb19b6a7	Merge pull request #9639 from emanuellima1/rng-impl runtime-rs: Add RNG to QEMU cmdline	2024-05-30 12:00:11 +02:00
Zvonko Kaiser	7cc0ebe75e	Merge pull request #9743 from zvonkok/tools-fix ci: Fix tools builder images	2024-05-30 11:53:34 +02:00
Zvonko Kaiser	02a7f8c852	ci: Fix tools builder images We weren't considering changes to the tools script dir; add a fourth hash to accommodate this. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-30 08:10:42 +00:00
Fabiano Fidêncio	97806dbdaa	Merge pull request #9732 from zvonkok/shim-v2-no-sudo ci: shim-v2 no sudo	2024-05-30 07:01:04 +02:00
Wainer dos Santos Moschetta	37894923c1	tests/k8s: skip empty dir volumes tests for qemu-coco-dev This test fails with qemu-coco-dev configuration and guest-pull image pull. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta	79a8b31ec5	tests/k8s: skip shared volume tests for qemu-coco-dev This test fails with qemu-coco-dev configuration and guest-pull image pull. Issue: #9668 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta	aa1a37081e	tests/k8s: skip sysctls tests for qemu-coco-dev This test fails with qemu-coco-dev configuration and guest-pull image pull. Issue: #9666 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta	0e81ced9f1	tests/k8s: skip kill-all-process tests for qemu-coco-dev This test fails with qemu-coco-dev configuration and guest-pull image pull. Issue: #9664 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta	18896efa3c	tests/k8s: skip seccomp tests for qemu-coco-dev This test fails with qemu-coco-dev configuration and guest-pull image pull. Unlike other tests that I've seen failing on this scenario, k8s-seccomp.bats fails after a couple of consecutive executions, so it's that kind of failure that happens once in a while. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta	b62ad71c43	tests/k8s: add runtime handler annotation for qemu-coco-dev This will enable the k8s tests to leverage guest pulling when PULL_TYPE=guest-pull for qemu-coco-dev runtimeclass. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta	089c7ad84a	tests/k8s: add runtime handler annotation only for guest-pull The runtime handler annotation is required for Kubernetes <= 1.28 and guest-pull pull type. So leverage $PULL_TYPE (which is exported by CI jobs) to conditionally apply the annotation. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-29 18:37:24 -03:00
GabyCT	0eddfdc74f	Merge pull request #9731 from zvonkok/pause-no-sudo ci: pause-image no sudo	2024-05-29 11:48:41 -06:00
Zvonko Kaiser	7354c427f9	Merge pull request #9734 from zvonkok/virtiofsd-no-sudo ci: virtiofsd no sudo	2024-05-29 19:31:25 +02:00
GabyCT	3c91aa0475	Merge pull request #9739 from zvonkok/initramfs-no-sudo ci: initramfs no sudo	2024-05-29 11:28:59 -06:00
Hyounggyu Choi	40d2306f95	Merge pull request #9729 from zvonkok/agent-no-sudo-build ci: build agent without sudo	2024-05-29 19:27:56 +02:00
GabyCT	03be220482	Merge pull request #9730 from zvonkok/kernel-no-sudo ci: kernel no sudo	2024-05-29 10:23:31 -06:00
GabyCT	a32058913a	Merge pull request #9679 from amshinde/kata-manager-install-cni kata-manager: Copy cni files under /opt/cni	2024-05-29 10:20:34 -06:00
GabyCT	a5808a556d	Merge pull request #9733 from zvonkok/tools-no-sudo ci: tools no sudo	2024-05-29 10:19:17 -06:00
GabyCT	e94b09839d	Merge pull request #9736 from zvonkok/qemu-no-sudo ci: qemu no sudo	2024-05-29 10:18:34 -06:00
GabyCT	6d58fce4a9	Merge pull request #9677 from GabyCT/topic/memoryusags metrics: Improve variable definition in memory usage script	2024-05-29 10:16:56 -06:00
Emanuel Lima	138d985c64	runtime-rs: Add RNG to QEMU cmdline It creates this line, as the Golang runtime does: -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0 Signed-off-by: Emanuel Lima <emlima@redhat.com>	2024-05-29 13:11:00 -03:00
Hyounggyu Choi	6ba2461404	Merge pull request #9728 from zvonkok/coco-guest-comp-no-sudo ci: guest-components without sudo	2024-05-29 17:55:43 +02:00
Gabriela Cervantes	09c3e08f6a	tests: Fix indentation in static checks script This PR fixes the indentation in the static checks script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-29 15:43:44 +00:00
Xuewei Niu	c297a7891c	Merge pull request #9723 from zvonkok/hotunplug-fix vfio: Fix hot-unplug	2024-05-29 22:02:05 +08:00
Zvonko Kaiser	25c784c568	ci: shim-v2 no sudo Build shim-v2 without sudo docker this is not needed. This is part 6 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-29 09:24:54 +00:00
Zvonko Kaiser	84a9773cec	ci: initramfs no sudo Build initramfs without sudo docker this is not needed. This is part 10 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-29 09:20:39 +00:00
Zvonko Kaiser	7dc47c8150	ci: qemu no sudo Build qemu without sudo docker this is not needed. This is part 9 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 16:12:06 +00:00
Zvonko Kaiser	4a455bf24a	ci: virtiofsd no sudo build virtiofsd without sudo docker this is not needed. This is part 8 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 14:19:58 +00:00
Wainer Moschetta	9896f69827	Merge pull request #9414 from ldoktor/ci-bisection ci.ocp: Document openshift pipeline and manual bisection	2024-05-28 11:17:09 -03:00
Zvonko Kaiser	dd04d26cb0	ci: tools no sudo Build tools without sudo docker this is not needed. This is part 7 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 13:57:20 +00:00
Zvonko Kaiser	6c9c0306ac	ci: pause-image no sudo Build pause-image without sudo docker this is not needed. This is part 5 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 11:31:59 +00:00
Hyounggyu Choi	e8c06301d7	Merge pull request #9727 from zvonkok/ovmf-no-sudo ci: ovmf without sudo	2024-05-28 13:29:00 +02:00
Zvonko Kaiser	c95ae5a502	ci: kernel no sudo Build kernel without sudo docker this is not needed. This is part 4 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 11:19:08 +00:00
Zvonko Kaiser	8fab5dd584	ci: build agent without sudo Build agent without sudo docker this is not needed. This is part 3 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 09:55:32 +00:00
Zvonko Kaiser	1e4cbc4fcd	ci: guest-components without sudo Build guest-components without sudo docker this is not needed. This is part 2 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 09:03:14 +00:00
Zvonko Kaiser	b76938b922	ci: ovmf without sudo Build ovmf without sudo docker this is not needed. This is part 1 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 08:25:27 +00:00
Zvonko Kaiser	c6c20ac253	docs: Format the threat-model to 80 chars Truncate long lines to reasonable 80 characters Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 07:39:26 +00:00
Zvonko Kaiser	d4832b3b74	vfio: Fix hot-unplug We need to remove the device from the tracking map; otherwise a container restart will increment the bus index and we will run out of root ports and crash the machine. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 07:37:30 +00:00
Zvonko Kaiser	a7931115a0	Merge pull request #8861 from zvonkok/config-pcie-root-switch-port gpu: reintroduce pcie_root_port and add pcie_switch_port	2024-05-27 13:17:57 +02:00
Fabiano Fidêncio	3276bb52b6	Merge pull request #9721 from fidencio/topic/ci-kata-deploy-improvements-and-fixes kata-deploy / kata-cleanup / ci: Fixes and improvements to kata-deploy / kata-cleanup and its usage in the CI	2024-05-27 12:29:40 +02:00
Zvonko Kaiser	4c93bb2d61	qemu: Add CDI device handling for any container type We need special handling for pod_sandbox, pod_container and single_container how and when to inject CDI devices Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-27 10:13:01 +00:00
Zvonko Kaiser	c7b41361b2	gpu: reintroduce pcie_root_port and add pcie_switch_port In Kubernetes we still do not have proper VM sizing at sandbox creation level. This KEP tries to mitigate that: kubernetes/enhancements#4113 but this can take some time until Kube and containerd or other runtimes have those changes rolled out. Before we used a static config of VFIO ports, and we introduced CDI support which needs a patched containerd. We want to eliminate the patched containerd in the GPU case as well. Fixes: #8860 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-27 10:13:01 +00:00
Fupan Li	6f6a164451	Merge pull request #9268 from zvonkok/kata-agent-createcontainer kata-agent: CreateContainer Hook	2024-05-27 16:36:22 +08:00
Fabiano Fidêncio	e81e8a4527	tests: kata-deploy: Adjust timeout 10 minutes is waay too long. Let's give it 4 minutes only. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 06:23:00 +02:00
Fabiano Fidêncio	fba5793c0d	tests: kata-deploy: Run the tests from "${repo_root_dir}" Let's see if it helps with issues like: ``` error: must build at directory: not a valid directory: evalsymlink failure on '"/home/runner/actions-runner/_work/kata-containers/kata-containers/tests/functional/kata-deploy/../../..//tools/packaging/kata-deploy/kata-cleanup/overlays/k0s"' : lstat /home/runner/actions-runner/_work/kata-containers/kata-containers/tests/functional/kata-deploy/": no such file or directory ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 06:23:00 +02:00
Fabiano Fidêncio	8a8a7ea0e5	tests: kata-deploy: Show more logs in the setup() This will also help us to better understand possible failures with the CI. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 05:05:06 +02:00
Fabiano Fidêncio	47d9589e9b	tests: kata-deploy: Show output of passing tests This will help us to debug failures and compare passing and failures outputs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 05:05:06 +02:00
Fabiano Fidêncio	dbd0d4a090	gha: Only do preventive cleanups for baremetal This takes a few minutes that could be saved, so let's avoid doing this on all the platforms, but simply do this when it's needed (the baremetal use case). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 05:05:06 +02:00
Fabiano Fidêncio	ee2ef0641c	tests: k8s: Allow passing "all" to run all the tests Currently only "baremetal" runs all the tests, but we could easily run "all" locally or using the github provided runners, even when not using a "baremetal" system. The reason I'd like to have a differentiation between "all" and "baremetal" is because "baremetal" may require some cleanup, which "all" can simply skip if testing against a fresh created VM. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 05:05:06 +02:00
Fabiano Fidêncio	556227cb51	tests: Add the possibility to deploy k0s / rke2 For now we've only exposed the option to deploy kata-deploy for k3s and vanilla kubernetes when using containerd. However, I do need to also deploy k0s and rke2 for an internal CI, and having those exposed here do not hurt, and allow us to easily expand the CI at any time in the future. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 05:05:06 +02:00
Fabiano Fidêncio	e3c2f0b0f1	kata-cleanup: Add k0s kustomization k0s was added to kata-deploy, but its kata-cleanup counterpart was never added. Let's fix it. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 05:05:06 +02:00
Fabiano Fidêncio	f15d40f8fb	kata-deploy: Fix k0s deployment k0s deployment has been broken since we moved to using `tomlq` in our scripts. The reason is that before using `tomlq` our script would, involuntarily, end up creating the file. Now, in order to fix the situation, we need to explicitly create the file and let `tomlq` add the needed content. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 05:05:06 +02:00
Alex Lyn	713c929a64	Merge pull request #9656 from pmores/document-qemu-rs-conventions runtime-rs: document architecture & implementation conventions in qem…	2024-05-27 10:38:58 +08:00
Xuewei Niu	bb7a1c56e9	Merge pull request #9693 from sidneychang/9690/Adjust-indentation	2024-05-27 00:20:34 +08:00
Alex Lyn	55dbf6121a	Merge pull request #9604 from Apokleos/qmp-cmdline01 runtime-rs: add QMP support for Qemu(part I)	2024-05-26 20:22:59 +08:00
Alex Lyn	028b10ce7a	Merge pull request #9687 from l8huang/vfio-pci-gk agent: collect PCI address mapping for both vfio-pci-gk and vfio-pci device	2024-05-26 17:48:25 +08:00
Steve Horsman	b89c3e35dd	Merge pull request #9583 from cncal/update_check_error_message runtime: make kata-runtime check error more understandable when /dev/kvm doesn't exist	2024-05-24 17:49:43 +01:00
Alex Lyn	41fb7aeb89	runtime-rs: add QMP params support in cmdline Fixes: #9603 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-05-24 22:16:24 +08:00
Alex Lyn	7ed6c6896b	runtime-rs: add an option dbg_monitor_socket for HMP support This option allows to add a debug monitor socket when `enable_debug = true` to control QEMU within debugging case. Fixes: #9603 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-05-24 22:16:17 +08:00
Lei Huang	3624573b12	agent: collect PCI address mapping for both vfio-pci-gk and vfio-pci device The `update_env_pci()` function needs the PCI address mapping to translate the host PCI address to the guest PCI address in the environment variables below: - PCIDEVICE_<prefix>_<resource-name>_INFO - PCIDEVICE_<prefix>_<resource-name> So collect PCI address mapping for both vfio-pci-gk and vfio-pci devices. Fixes #9614 Signed-off-by: Lei Huang <leih@nvidia.com>	2024-05-23 21:20:01 -07:00
Fupan Li	d73876252e	Merge pull request #9690 from justxuewei/agent-timeout runtime-rs: Remove obsoleted dial_timeout config	2024-05-24 10:31:12 +08:00
Zvonko Kaiser	3affd83e14	Merge pull request #9605 from l8huang/skip-env kata-agent: update env PCIDEVICE_<prefix>_<resource-name>_INFO	2024-05-23 18:45:00 +02:00
Fabiano Fidêncio	44d6cb7791	Merge pull request #9698 from wainersm/k8s_tests_disable_fail_fast tests/k8s: disable "fail-fast" behavior by default	2024-05-23 18:28:00 +02:00
Fabiano Fidêncio	d83cf39ba1	Merge pull request #9680 from kata-containers/dependabot/go_modules/src/runtime/go_modules-5e29427af7 build(deps): bump golang.org/x/net from 0.24.0 to 0.25.0 in /src/runtime in the go_modules group across 1 directory	2024-05-23 12:55:29 +02:00
Fabiano Fidêncio	d9ee950d8f	Merge pull request #9696 from wainersm/skip_custom_dns_test tests/k8s: skip custom DNS tests on confidential jobs	2024-05-22 23:57:21 +02:00
GabyCT	e08ad8d1b7	Merge pull request #9686 from GabyCT/topic/fixbootclh metrics: Fix minvalue for boot time	2024-05-22 15:46:50 -06:00
Wainer dos Santos Moschetta	76735df427	tests/k8s: disable "fail-fast" behavior by default The k8s test suite halts on the first failure, i.e., failing-fast. This isn't the behavior that we used to see when running tests on Jenkins and it seems that running the entire test suite is still the most productive way. So this disables fail-fast by default. However, if you still wish to run in fail-fast mode then just export K8S_TEST_FAIL_FAST=yes in your environment. Fixes: #9697 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-22 18:27:44 -03:00
Fabiano Fidêncio	8eb061cd5b	Merge pull request #9681 from GabyCT/topic/etdx gha: Enable install kbs and coco components for TDX, but still skip the CDH test	2024-05-22 23:18:42 +02:00
Wainer dos Santos Moschetta	43766cdb96	tests/k8s: skip custom DNS tests on confidential jobs This test has failed in confidential runtime jobs. Skip it until we have a fix. Fixes: #9663 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-22 17:08:22 -03:00
Fabiano Fidêncio	904370ecd6	tests: attestation: tdx: Skip test for now Skipping the test will allow us to have the TDX CI running while we debug the test. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-22 20:04:13 +02:00
Fabiano Fidêncio	414d716eef	tests: kbs: Enable cli installation also on CentOS One of our machines is running CentOS 9 Stream, and we could easily verify that we can build and install the kbs client there, thus we're expanding the installation script to also support CentOS 9 Stream. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-22 20:01:57 +02:00
Fabiano Fidêncio	27d7f4c5b8	tests: kbs: Fix rust installation `externals.coco-kbs.toolchain` is not defined, get the rust_version from `externals.coco-trustee.toolchain` instead. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-22 20:01:57 +02:00
Fabiano Fidêncio	fa8b5c76b8	tests: kbs: Add more info for the TDX deployment Ditto in the commit shortlog. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-22 20:01:57 +02:00
Fabiano Fidêncio	6ffd7b8425	versions: trustee: Bump version to 6adb8383309cbb7 We're bumping the version in order to bring in the customisation needed for setting up a custom pccs, which is needed for the KBS integration tests with Kata Containers + TDX. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-22 20:01:57 +02:00
Fabiano Fidêncio	dbd1fa51cd	tests: kbs: Don't assume /tmp/trustee exists in the machine Instead, check if the directory exists before pushd'ing into it. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-22 20:01:57 +02:00
Gabriela Cervantes	f698caccc0	gha: Enable install kbs and coco components for TDX This PR enables the installation and uninstallation of the kbs client as well as general coco components needed for the TDX GHA CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-22 20:01:57 +02:00
GabyCT	eaaab19763	Merge pull request #9685 from GabyCT/topic/fixic tests: Fix indentation in confidential common script	2024-05-22 11:53:33 -06:00
Gabriela Cervantes	29a10f1373	metrics: Fix minvalue for boot time This PR fixes the minvalue for boot time to avoid the random failures of the GHA CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-22 17:52:51 +00:00
GabyCT	0b32360ab4	Merge pull request #9684 from stevenhorsman/add-arch-to-component-cache-tags ci: cache: Add arch suffix to all cache tags	2024-05-22 09:24:28 -06:00
Fabiano Fidêncio	0e33ecf7fc	Merge pull request #9653 from JakubLedworowski/fixes-9497-ensure-quote-generation-service-is-added-to-qemu-cmd-2 runtime: Enable connection to Quote Generation Service (QGS)	2024-05-22 15:49:23 +02:00
sidneychang	8938f35627	runtime-rs: Adjust indentation in ifneq statements within Makefile. Replace tab indentation with spaces for the three lines within the ifneq statements, aligning them with the surrounding code. Fixes:#9692 Signed-off-by: sidneychang <2190206983@qq.com>	2024-05-22 20:24:35 +08:00
Fabiano Fidêncio	94f7bbf253	Merge pull request #9682 from fidencio/topic/allow-increasing-cpus-and-memory-via-annotation-for-tdx runtime: tdx: Allow default_{cpu,memory} annotations	2024-05-22 12:07:28 +02:00
Xuewei Niu	d31616cec3	runtime-rs: Remove obsoleted dial_timeout config The `dial_timeout` works fine for Runtime-go, but is obsoleted in Runtime-rs. When the pod cannot connect to the Agent upon starting, we need to adjust the `reconnect_timeout_ms` to increase the number of connection attempts to the Agent. Fixes: #9688 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-05-22 17:57:05 +08:00
Jakub Ledworowski	fc680139e5	runtime: Enable connection to Quote Generation Service (QGS) For the TD attestation to work the connection to QGS on the host is needed. By default QGS runs on vsock port 4050, but can be modified by the host owner. Format of the qemu object follows the SocketAddress structure, so it needs to be provided in the JSON format, as in the example below: -object '{"qom-type":"tdx-guest","id":"tdx","quote-generation-socket":{"type":"vsock","cid":"2","port":"4050"}}' Fixes: #9497 Signed-off-by: Jakub Ledworowski <jakub.ledworowski@intel.com>	2024-05-22 11:16:24 +02:00
Alex Lyn	0331859740	Merge pull request #9642 from gkurz/drop-unused-knobs-qemu-rs runtime-rs: Drop some useless QEMU arguments	2024-05-22 16:13:14 +08:00
Alex Lyn	ce030d1804	Merge pull request #9641 from cmaf/runtime-resize-mem-1 runtime: Add missing check in ResizeMemory for CH	2024-05-22 14:05:30 +08:00
Alex Lyn	b7af00be2a	Merge pull request #9624 from cncal/bugfix_duplicated_devices runtime: fix duplicated devices requested to the agent	2024-05-22 12:45:46 +08:00
Steve Horsman	f41f642b90	Merge pull request #9635 from kata-containers/dependabot/go_modules/src/runtime/go_modules-f0df977846 build(deps): bump github.com/containerd/containerd from 1.7.11 to 1.7.16 in /src/runtime in the go_modules group across 1 directory	2024-05-21 21:19:32 +01:00
Steve Horsman	9b0ed3dfa7	Merge pull request #9657 from ajaypvictor/remote-hyp-annotations runtime: Disable number of cpu comparison on remote hypervisor scenario	2024-05-21 21:19:12 +01:00
Hyounggyu Choi	92101fc61f	Merge pull request #9658 from BbolroC/migrate-vfio-ap-test CI: Migrate vfio-ap test files from tests repo	2024-05-21 20:21:09 +02:00
Lei Huang	b0a91b0d13	kata-agent: update env PCIDEVICE_<prefix>_<resource-name>_INFO The new version of sriov-network-device-plugin adds an env `PCIDEVICE_<prefix>_<resource-name>_INFO`, which has a json value; kata-agent can't parse it as env `PCIDEVICE_<prefix>_<resource-name>` which has value in format "DDDD:BB:SS.F". This change updates env `PCIDEVICE_<prefix>_<resource-name>_INFO`. Signed-off-by: Lei Huang <leih@nvidia.com>	2024-05-21 10:46:41 -07:00
stevenhorsman	db4818fe1d	ci: cache: Enforce tag length limit Container tags can be a maximum of 128 characters long so calculate the length of the arch suffix and then restrict the tag to this length subtracted from 128 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-21 18:03:45 +01:00
Gabriela Cervantes	c9e91db16f	tests: Fix indentation in confidential common script This PR fixes the indentation in the confidential common script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-21 16:33:46 +00:00
stevenhorsman	d6afd77eae	ci: cache: Update agent cache to use the full commit hash - Previously I copied the logic that abbreviated the commit hash from the versioning, but looking at our versions.yaml the clear pattern is that when pointing at commits of dependencies we use the full commit hash, not the abbreviated one, so for consistency I think we should do the same with the components that we make available Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-21 16:51:16 +01:00
stevenhorsman	d46b6a3879	ci: cache: Add arch suffix to all cache tags As we have multi-arch builds for nearly all components, we want to ensure that all the cache tags we set have the architecture suffix, not just the `TARGET_BRANCH` one. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-21 11:25:07 +01:00
stevenhorsman	865fa9da15	runtime: Resolve go static-checks failure Remove `rand.Seed` call to resolve the following failure: ``` rand.Seed is deprecated: As of Go 1.20 there is no reason to call Seed with a random value. ``` The go rand.Seed docs: https://pkg.go.dev/math/rand@go1.20#Seed back this up and state: > If Seed is not called, the generator is seeded randomly at program startup. so I believe we can just delete the call. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-21 11:08:59 +01:00
Fabiano Fidêncio	abf52420a4	runtime: tdx: Allow default_{cpu,memory} annotations For now, let's allow the users to set the default_cpu and default_memory when using TDX, as they may hit issues related to the size of the container image that must be pulled and unpacked inside the guest. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-21 10:26:39 +02:00
stevenhorsman	75a201389d	runtime: update go version in go.mod - Make due to us bumping the golang version used in our CI but `make vendor` fails without the go version in the runtime go.mod being increased, so update this and run go mod tidy Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-21 09:11:46 +01:00
dependabot[bot]	735185b15c	build(deps): bump github.com/containerd/containerd Bumps the go_modules group with 1 update in the /src/runtime directory: [github.com/containerd/containerd](https://github.com/containerd/containerd). Updates `github.com/containerd/containerd` from 1.7.11 to 1.7.16 - [Release notes](https://github.com/containerd/containerd/releases) - [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md) - [Commits](https://github.com/containerd/containerd/compare/v1.7.11...v1.7.16) --- updated-dependencies: - dependency-name: github.com/containerd/containerd dependency-type: direct:production dependency-group: go_modules ... Signed-off-by: dependabot[bot] <support@github.com>	2024-05-21 09:11:46 +01:00
Ajay Victor	abe607b0c7	runtime: Disable number of cpu comparison on remote hypervisor scenario Fixes https://github.com/kata-containers/kata-containers/issues/9238 Signed-off-by: Ajay Victor <ajvictor@in.ibm.com>	2024-05-21 13:34:21 +05:30
dependabot[bot]	01868b2849	--- updated-dependencies: - dependency-name: golang.org/x/net dependency-type: indirect dependency-group: go_modules ... Signed-off-by: dependabot[bot] <support@github.com>	2024-05-20 22:06:41 +00:00
Fabiano Fidêncio	8879e3bc45	Merge pull request #9452 from GabyCT/topic/tdxcoco gha: Add support to install KBS to k8s TDX GHA workflow	2024-05-20 23:28:52 +02:00
Fabiano Fidêncio	072b929b6f	Merge pull request #9660 from malt3/fix/genpolicy/namespace_empty_string genpolicy: detect empty string in ns as default	2024-05-20 21:34:13 +02:00
Gabriela Cervantes	cfdef7ed5f	tests/k8s: Use custom intel DCAP configuration This PR adds the use of custom Intel DCAP configuration when deploying the KBS. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-20 18:44:57 +00:00
Gabriela Cervantes	cace2fd340	metrics: Improve variable definition in memory usage script This PR improves general format like variable definition to have uniformity across the memory usage script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-20 16:14:59 +00:00
Fabiano Fidêncio	97056b017d	Merge pull request #9675 from stevenhorsman/release-build-tarballs-inherit-secrets gha: release: Set inherit secrets on tarball builds	2024-05-20 18:06:38 +02:00
Fabiano Fidêncio	b8b3bcc492	Merge pull request #9671 from bikesheddev/fix/kata-deploy-unbound-variable fix: kata-deploy.sh VERSION_ID unbound-variable	2024-05-20 17:22:55 +02:00
Fabiano Fidêncio	94cff3f74e	Merge pull request #9315 from fidencio/topic/adapt-TEEs-for-shared_fs-none TEEs: Use `shared_fs=none` for TDX	2024-05-20 17:17:36 +02:00
Fabiano Fidêncio	cffeb0ffb8	Merge pull request #9673 from fidencio/topic/revert-aks-workaround Revert "ci: azure: Workaround azure cli installation script"	2024-05-20 16:16:55 +02:00
stevenhorsman	f271983aeb	gha: release: Set inherit secrets on tarball builds Now we have updated the release builds to push artefacts to our registry for the release, so we can cache the images, we need to set `secrets: inherit` for all architecture's tarball builds so that we can log into quay.io and ghcr in those steps Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-20 14:19:17 +01:00
Fabiano Fidêncio	25c9cf32ff	Revert "ci: azure: Workaround azure cli installation script" This reverts commit `5ff53e4d1c`, as the script was fixed by MSFT, at least according to: https://github.com/Azure/azure-cli/issues/28984 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-20 14:38:46 +02:00
vac (Brendan)	d812007b99	kata-deploy: Fix unbound VERSION_ID VERSION_ID is not guaranteed to be specified in os-release, which makes kata-deploy break on rolling distros like Arch Linux and Void Linux. Note that operating system vendors may choose not to provide version information, for example to accommodate for rolling releases. In this case, VERSION and VERSION_ID may be unset. Applications should not rely on these fields to be set. Signed-off-by: vac <dot.fun@protonmail.com>	2024-05-20 19:48:31 +08:00
Tim Zhang	857d2bbc8e	agent: Fix ctr exec stuck problem Fixes: #9532 Close stdin when write_stdin receives data of length 0. Stop call notify_term_close() in close_stdin, because it could discard stdout unexpectedly. Signed-off-by: Tim Zhang <tim@hyper.sh>	2024-05-20 14:52:14 +08:00
Fabiano Fidêncio	e8ebe18868	tests: k8s: tdx: Skip liveness probe test This test doesn't fail with the guest image pulling, but it for sure should. :-) We can see in the bats logs, something like: ``` Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 31s default-scheduler Successfully assigned kata-containers-k8s-tests/liveness-exec to 984fee00bd70.jf.intel.com Normal Pulled 23s kubelet Successfully pulled image "quay.io/prometheus/busybox:latest" in 345ms (345ms including waiting) Normal Started 21s kubelet Started container liveness Warning Unhealthy 7s (x3 over 13s) kubelet Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory Normal Killing 7s kubelet Container liveness failed liveness probe, will be restarted Normal Pulled 7s kubelet Successfully pulled image "quay.io/prometheus/busybox:latest" in 389ms (389ms including waiting) Warning Failed 5s kubelet Error: failed to create containerd task: failed to create shim task: the file /bin/sh was not found: unknown Normal Pulling 5s (x3 over 23s) kubelet Pulling image "quay.io/prometheus/busybox:latest" Normal Pulled 4s kubelet Successfully pulled image "quay.io/prometheus/busybox:latest" in 342ms (342ms including waiting) Normal Created 4s (x3 over 23s) kubelet Created container liveness Warning Failed 3s kubelet Error: failed to create containerd task: failed to create shim task: failed to mount /run/kata-containers/f0ec86fb156a578964007f7773a3ccbdaf60023106634fe030f039e2e154cd11/rootfs to /run/kata-containers/liveness/rootfs, with error: ENOENT: No such file or directory: unknown Warning BackOff 1s (x3 over 3s) kubelet Back-off restarting failed container liveness in pod liveness-exec_kata-containers-k8s-tests(b1a980bf-a5b3-479d-97c2-ebdb45773eff) ``` Let's skip it for now as we have an issue opened to track it down: https://github.com/kata-containers/kata-containers/issues/9665 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 21:59:29 +02:00
Fabiano Fidêncio	a2c70222a8	tests: k8s: tdx: Skip initContainerd shared vol test This is another one that is related to initContainers not being properly handled with the guest image pulling. Let's skip it for now as we have https://github.com/kata-containers/kata-containers/issues/9668 to track it down. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 20:58:45 +02:00
Fabiano Fidêncio	9d56145499	tests: k8s: tdx: Skip volume related tests Similarly to firecracker, which doesn't have support for virtio-fs / virtio-9p, TDX used with `shared_fs=none` will face the very same limitations. The tests affected are: * k8s-credentials-secrets.bats * k8s-file-volume.bats * k8s-inotify.bats * k8s-nested-configmap-secret.bats * k8s-projected-volume.bats * k8s-volume.bats Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 19:38:49 +02:00
Fabiano Fidêncio	606a62a0a7	tests: k8s: tdx: Skip "Setting sysctl" test This test fails when using `shared_fs=none` with the nydus-snapshotter, and we're tracking the issue here: https://github.com/kata-containers/kata-containers/issues/9666 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 19:38:38 +02:00
Fabiano Fidêncio	937b2d5806	tests: k8s: tdx: Skip "Kill all processes in container" test This test fails when using `shared_fs=none` with the nydus snapshotter, and we're tracking the issue here: https://github.com/kata-containers/kata-containers/issues/9664 For now, let's have it skipped. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 18:51:14 +02:00
Fabiano Fidêncio	03ce41b743	tests: k8s: tdx: Skip "Check custom dns" test The test has been failing on TDX for a while, and an issue has been created to track it down, see: https://github.com/kata-containers/kata-containers/issues/9663 For now, let's have it skipped. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 18:51:14 +02:00
Fabiano Fidêncio	1a8a4d046d	tests: k8s: setup: Improve / Fix logs Let's make sure the logs will print the correct annotation and its value, instead of always mentioning "kernel" and "initrd". Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 18:51:14 +02:00
Fabiano Fidêncio	3f38309c39	tests: k8s: tdx: Stop running `k8s-guest-pull-image.bats` We're doing that as all tests are going to be running with `shared_fs=none`, meaning that we don't need any specific test for this case anymore. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 18:51:00 +02:00
Fabiano Fidêncio	e84619d54b	tests: k8s: tdx: Add `add_runtime_handler_annotations` function This function will set the needed annotation for enforcing that the image pull will be handled by the snapshotter set for the runtime handler, instead of using the default one. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 18:49:07 +02:00
Fabiano Fidêncio	f2de259387	runtime: tdx: Use shared_fs=none We shouldn't be using 9p, at all, with TEEs, as of right now we have no way to ensure the channels are encrypted. The way to work around this for now is using guest pull, either with containerd + nydus snapshotter or with CRI-O; or even tardev snapshotter for pulling on the host (which is the approach used by MSFT). This is only done for TDX for now, leaving the generic, AMD, and IBM related stuff for the folks working on those to switch and debug possible issues on their environment. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 18:47:09 +02:00
Fabiano Fidêncio	5b257685d9	Merge pull request #9662 from dborquez/fix_launchtimes_timestamp_generation Fix launch times timestamp generation.	2024-05-18 21:11:09 +02:00
Fabiano Fidêncio	94786dc939	Merge pull request #9659 from stevenhorsman/remove-non-printable-tag-characters ci: cache: Filter out non-printable characters from tag	2024-05-18 14:47:07 +02:00
Fabiano Fidêncio	874cda0e51	Merge pull request #9655 from BbolroC/add-arch-to-initramfs CI: Append arch type to initramfs-cryptsetup image	2024-05-18 14:31:57 +02:00
Malte Poll	babdab9078	genpolicy: detect empty string in ns as default In Kubernetes, the following values for namespace are equivalent and all refer to the default namespace: - ` ` (namespace field missing) - `namespace: ""` (namespace field is the empty string) - `namespace: "default"`(namespace field has the explicit value `default`) Genpolicy currently does not handle the empty string case correctly. Signed-Off-By: Malte Poll <1780588+malt3@users.noreply.github.com>	2024-05-18 12:44:59 +02:00
Fabiano Fidêncio	cbfdc70a55	Merge pull request #9613 from fidencio/topic/skip-pull-image-tests-on-tees-part-II tests: pull-image: Only skip tests for TEEs	2024-05-18 03:31:38 +02:00
Archana Shinde	0e28e904e0	kata-manager: Install cni for containerd When just containerd is installed without installing nerdctl, cni plugins are missing from the installation. containerd tarball does not include cni plugin files. Hence install cni plugins separately for containerd. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-05-18 00:19:57 +00:00
Archana Shinde	d23d58a484	kata-manager: Copy cni files under /opt/cni nerdctl requires cni plugins to be installed in /opt/cni/bin Without bridge plugin installed, it is not possible to run a container with nerdctl. The downloaded nerdctl tarball contains cni plugin files, but are extracted under /usr/local/libexec. Copy extracted tarball cni files under /usr/local/libexec to /opt/cni/bin Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-05-18 00:16:48 +00:00
David Esparza	938d3dc430	metrics: fix timestamps generation from launch times test. Use `eval` to process the `date` command along with its parameters, thus avoiding misinterpreting the parameters as commands. Fixes: #9661 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-05-17 14:44:41 -06:00
David Esparza	bae377b42a	metrics: determine the realpath of kata-shim component. Determine the realpath of kata-shim so that the check does not fail when kata-shim is not a symlink, as was happening prior to this commit. Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-05-17 14:40:02 -06:00
Fabiano Fidêncio	5ff53e4d1c	ci: azure: Workaround azure cli installation script This is done in order to work around https://github.com/Azure/azure-cli/issues/28984, following a suggestion on the very same issue. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-17 20:28:24 +02:00
stevenhorsman	42fddb5530	ci: cache: Filter out non-printable characters from tag - The tags have a trailing non-printable character, which results in our cache tags having a trailing underscore e.g. `ghcr.io/kata-containers/cached-artefacts/agent:ce24e9835_` For ease of use of these cached components, we should strip off the trailing underscore. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-17 14:16:40 +01:00
Hyounggyu Choi	961735a181	CI: Migrate vfio-ap test files from tests repo An e2e test for `vfio-ap` has been conducted internally in IBM due to the lack of publicly available test machines equipped with a required crypto device. The test is performed by the `tests` repository: (i.e. `772105b560/Makefile (L144)`) The community is working to integrate all tests into the `kata-containers` repository, so the `vfio-ap` test should be part of that effort. This commit moves a test script and Dockerfile for a test image from the `tests` repository. We do not rename the script to `gha-run.sh` because it is not executed by Github Actions' workflow. You can check the test results from the s390x nightly test with the migrated files here: https://github.com/kata-containers/kata-containers/actions/runs/9123170010/job/25100026025 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-05-17 14:59:16 +02:00
stevenhorsman	a92defdffe	tests: pull-image: Remove skips Given that we think the containerd -> snapshotter image cache problems have been resolved by bumping to nydus-snapshotter v0.13.13, we can try removing the skips to test this out Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-17 12:39:57 +02:00
stevenhorsman	7ac302e2d8	tests: Slacken guest pull rootfs count assert - We previously had an expectation for the pause rootfs to be pulled on the host when we did a guest pull. We weren't really clear why, but it is plausibly related to the issues we had with containerd and nydus caching. Now that this is fixed we can begin to address it by setting shared_fs=none, but let's start by updating the rootfs host check to be no higher than expected Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-17 12:39:56 +02:00
Fabiano Fidêncio	67ff58251d	tests: confidential_common: Remove unneeded `ensure_yq` call This test is called from `tests/integration/run_kubernetes_tests.sh`, which already ensures that yq is installed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-17 12:39:56 +02:00
Fabiano Fidêncio	cc874ad5e1	tests: confidential: Ensure those only run on TEEs Running those with the non-TEE runtime classes will simply fail. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-17 12:39:56 +02:00
Fabiano Fidêncio	2bc5b1bba2	tests: pull-image: Only skip tests for TEEs On `1423420`, I've mistakenly disabled the tests entirely, for both non-TEEs and TEEs. This happened as I didn't realise that `confidential_setup` would take non-TEEs into consideration. :-/ Now, let me follow-up on that and make sure that the tests will be running on non-TEEs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-17 12:39:56 +02:00
Fabiano Fidêncio	d875f89fa2	tests: Add is_confidential_hardware() This function is a helper to check whether the KATA_HYPERVISOR being used is a confidential hardware (TEE) or not, and we can use it to skip or only run tests on those platforms when needed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-17 12:39:56 +02:00
Fabiano Fidêncio	4a04a1f2ae	tests: Re-work confidential_setup() Let's rename it to `is_confidential_runtime_class`, and adapt all the places where it's called. The new name provides a better description, leading to a better understanding of what the function really does. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-17 12:39:56 +02:00
Pavel Mores	b9febc4458	runtime-rs: document architecture & implementation conventions in qemu-rs Implementation of QemuCmdLine has a fairly uniform and repetitive structure that's guided by a set of conventions. These conventions have however been mostly implicit so far, leading to a superfluous and annoying request/force-push churn during qemu-rs PR reviews. This commit aims to make things explicit so that contributors can take them into account before an initial PR submission. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-05-17 12:21:44 +02:00
Hyounggyu Choi	3917930a76	CI: Append arch type to initramfs-cryptsetup image This commit is to append an arch type to the initramfs-cryptsetup image to prevent a wrong arch image from being pulled on a different arch host. Fixes: #9654 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-05-17 11:42:49 +02:00
Steve Horsman	9a6d8d8330	Merge pull request #9650 from stevenhorsman/caching-tagging-update-partIII Caching tagging update part iii	2024-05-17 09:09:15 +01:00
stevenhorsman	ce24e98358	ci: cache: Add tag character filtering - Container image tags can only contain alphanumeric, period, hyphen and underscore characters, so convert characters outside of these to be underscores, to avoid having invalid tag failures Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-16 21:38:07 +01:00
stevenhorsman	a98b1e3afb	ci: cache: Integrate tagging updates with recent changes Recently the extra gpu caching was added, unfortunately when I rebased I ended up with both the new tagging logic and old logic. Let's try and integrate them properly to avoid doing the push twice. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-16 21:38:07 +01:00
Lukáš Doktor	f994f79078	ci.ocp: Add steps to reproduce/bisect CI runs in case the upstream CI fails it's useful to pin-point the PR that caused the regression. Currently openshift-ci does not allow doing that from their setup but we can mimic the setup on our infrastructure and use the available kata-deploy-ci images to find the first failing one. To help with that add a few helper scripts and a howto. Fixes: #9228 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-05-16 20:20:05 +02:00
Lukáš Doktor	a556ad7e01	ci.ocp: Document how to run openshift-tests with kata document the ocp pipeline. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-05-16 20:15:32 +02:00
Lukáš Doktor	ea081bd882	ci.ocp: Add webhook cleanup cleanup the webhook resources as well. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-05-16 20:15:31 +02:00
David Esparza	029a6de52b	Merge pull request #9615 from GabyCT/topic/fixlaunchtime metrics: Update launch times script	2024-05-16 11:28:44 -06:00
Steve Horsman	33e6b241ba	Merge pull request #9647 from stevenhorsman/fix-artefact-tags-unbound-variable ci: cache: Fix unbound variable	2024-05-16 16:22:47 +01:00
stevenhorsman	9d9487b17f	ci: cache: Fix unbound variable Now we have the workflow updated and can test the changes in caching we've hit an error: ``` line 1180: artefact_tag: unbound variable ``` so we need to fix that up. Sorry for missing this before. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-16 14:30:32 +01:00
Steve Horsman	03c08583c3	Merge pull request #9644 from stevenhorsman/fix-broken-workflow workflow: Remove if from env conditional	2024-05-16 14:13:25 +01:00
stevenhorsman	f7fd2f9a5d	workflow: Fix problems with build-asset workflows - It appears like the `if` isn't required when setting env as a conditional - `inputs.stage` over input.stage - Swap matrix.component to matrix.asset Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-16 11:51:46 +01:00
Steve Horsman	d8468cb178	Merge pull request #9550 from stevenhorsman/tag-component-caches Tag component caches	2024-05-16 11:05:18 +01:00
Steve Horsman	b31ff09b8d	Merge pull request #9617 from zvonkok/artefact-repository deploy: Add artefact repository	2024-05-16 10:41:23 +01:00
Fabiano Fidêncio	4d073c837d	Merge pull request #9636 from ChengyuZhu6/snapshotter version: Bump nydus snapshotter to v0.13.13	2024-05-16 02:54:53 +02:00
GabyCT	05cc8fae5e	Merge pull request #9610 from GabyCT/topic/fixrwfio metrics: Fix random write value for FIO	2024-05-15 17:44:41 -06:00
Gabriela Cervantes	793a02600a	metrics: Fix random write value for clh for FIO This PR decreases the random write value for clh for FIO. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-15 22:13:10 +00:00
Chelsea Mafrica	5d2af555da	runtime: Add missing check in ResizeMemory for CH ResizeMemory for Cloud Hypervisor is missing a check for the new requested memory being greater than the max hotplug size after alignment. Add the check, and since an earlier check for this sets requested memory to the max hotplug size, do the same in the post-alignment check. Fixes #9640 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2024-05-15 11:29:18 -07:00
GabyCT	d752f0aa4f	Merge pull request #9627 from GabyCT/topic/ghacomk8s gha: Fix indentation in gha run k8s common	2024-05-15 11:55:14 -06:00
Greg Kurz	bd6420e0cc	runtime-rs: Drop some useless QEMU arguments All these settings are hardcoded as `false` and result in no extra options on the QEMU command line, like the go runtime does. They're actually not needed: - we're never going to ask QEMU to survive a guest shutdown - we're never going to run QEMU daemonized since it prevents log collection - we're never going to ask QEMU to start with the guest stopped No need to keep this code around then. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-05-15 18:33:43 +02:00
stevenhorsman	7f41329010	ci: cache: Optional tag components with tags - CoCo wants to use the agent and coco-guest-components cached artifacts so tag them with a helpful version, so make these easier to get Signed-off-by: stevenhorsman <steven@uk.ibm.com> No commands remaining.	2024-05-15 16:56:40 +01:00
stevenhorsman	9999971656	release: Move component's don't ship logic - We don't want to ship certain components (agent, coco-guest-components) as part of the release, but for other consumers it's useful to be able to pull in the components from oras, so rather than not building them, just don't upload it as part of the release. - Also make the archs all consistent on not shipping the agent Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-15 16:55:55 +01:00
stevenhorsman	040e6cdf12	gha: release: Set RELEASE env - Set RELEASE env to 'yes', or 'no', based on if the stage passed in was 'release', so we can use it in the build scripts Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-15 16:55:55 +01:00
stevenhorsman	d93156d84d	gha: release: Push artifacts to registry on release For other projects (e.g. CoCo projects) being able to access the released versions of components is helpful, so push these during the release process Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-15 16:55:55 +01:00
Steve Horsman	19ca1a6656	Merge pull request #9638 from BbolroC/use-fixed-len-git-hash-explicitly CI: Use `--abbrev=9` explicitly for abbreviated commit hash	2024-05-15 16:55:07 +01:00
GabyCT	64b915b86e	Merge pull request #9438 from GabyCT/topic/addnegativetest tests: Add k8s negative policy test	2024-05-15 08:52:57 -06:00
Hyounggyu Choi	e075150fbe	CI: Use `--abbrev=9` explicitly for abbreviated commit hash A length of the result of `git log -1 --pretty=format:%h` could vary over different CI systems, highly likely messing up their caching mechanisms. This commit is to use an option `--abbrev=9` to standardize the length to 9 characters for CI. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-05-15 14:22:07 +02:00
Zvonko Kaiser	117e2f2ecc	Merge pull request #9618 from zvonkok/nvidia-rootfs-#1 gpu: Add build targets for GPU rootfs initrd/image	2024-05-15 13:30:42 +02:00
Hyounggyu Choi	6a4ff08156	Merge pull request #9632 from BbolroC/do-not-build-agent-policy-for-s390x local-build: Ensure the default rootfs is built with AGENT_POLICY=yes	2024-05-15 06:56:22 +02:00
ChengyuZhu6	d48c7ec979	version: Bump nydus snapshotter to v0.13.13 Bump nydus snapshotter to v0.13.13 to fix the gap when switching different snapshotters in guest pull. Fixes: #8407 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-05-15 12:21:01 +08:00
Fabiano Fidêncio	92bb235723	osbuilder: Log when the default policy is installed This will help us to debug issues in the future (and would have helped in the past as well). :-) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-14 20:45:49 +02:00
Fabiano Fidêncio	75bd97e8df	build: Ensure the default rootfs is built with AGENT_POLICY=yes This is needed, as `b1710ee2c0` made the default agent shipped the one with policy support. However, we simply didn't update the rootfs to reflect that, which then caused an issue when starting the agent, as shown by the strace below: ``` open("/etc/kata-opa/default-policy.rego", O_RDONLY\|O_LARGEFILE\|O_CLOEXEC) = -1 ENOENT (No such file or directory) futex(0x7f401eba0c28, FUTEX_WAKE_PRIVATE, 1) = 1 rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1 RT_2], [], 8) = 0 tkill(553681, SIGABRT) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=553681, si_uid=1000} --- +++ killed by SIGABRT (core dumped) +++ ``` This happens as the default policy must be set when the agent is built with policy support, but the code path that copies that into the rootfs is only triggered if the rootfs itself is built with AGENT_POLICY=yes, which we're now doing for both confidential and non-confidential cases. Sadly this was not caught by CI until the cache was not used for rootfs, which should be solved by the previous commit. Fixes: #9630, #9631 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-14 20:39:15 +02:00
Hyounggyu Choi	37060a7d2e	local-build: Stop using cached artifacts when local-build/* is updated This adds information about the files at `tools/packaging/kata-deploy/local-build/*` to the version of the components, ensuring that the cached artefacts are not used when the files of interest are updated. Fixes: #9630 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-05-14 19:47:33 +02:00
Fabiano Fidêncio	9a3392993d	Merge pull request #9629 from ldoktor/tdx_not_supported_warning kata-deploy: Fix tdx_not_supported call	2024-05-14 17:27:56 +02:00
Greg Kurz	f14a1330d4	Merge pull request #9585 from littlejawa/debugging_the_runtime debugging: adding a script and instructions for debugging the GO shim	2024-05-14 15:31:07 +02:00
Lukáš Doktor	d9ae130031	kata-deploy: Fix tdx_not_supported call The `tdx_not_supported_warning` function does not exist; `tdx_not_supported` should be called instead. Fixes: #9628 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-05-14 13:26:07 +02:00
Julien Ropé	e7cfc0865a	debugging: adding a script and instructions for debugging the GO shim Using a debugger with the kata runtime is complicated, but it can be done and can be very useful. This commit provides a helper script that simplifies it, and updates the developer's documentation to explain how to use it. Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-05-14 11:12:31 +02:00
Greg Kurz	e2117d3b71	Merge pull request #9571 from emanuellima1/fix-impl-rtc runtime-rs: Fix constructing the RTC struct	2024-05-14 09:17:27 +02:00
Gabriela Cervantes	f20a44bba3	gha: Fix indentation in gha run k8s common This PR fixes the indentation in gha run k8s common script to have uniformity across the script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-13 20:07:47 +00:00
Fabiano Fidêncio	4d5e90038c	Merge pull request #9626 from fidencio/topic/prepare-for-3.5.0-release release: Bump VERSIONS file to 3.5.0	2024-05-13 12:52:12 +02:00
Fabiano Fidêncio	0e385452e5	release: Bump VERSIONS file to 3.5.0 Let's bump the VERSIONS file and start preparing for a new release of the project. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-13 10:49:09 +02:00
Fabiano Fidêncio	c64b07f981	Merge pull request #9622 from fidencio/topic/unbreak-nvidia-gpu-build build: nvidia-gpu: Fix cache usage of the headers tarball	2024-05-12 14:40:22 +02:00
cncal	232db2d906	runtime: fix duplicated devices requested to the agent By default, when a container is created with the `--privileged` flag, all devices in `/dev` from the host are mounted into the guest. If there is a block device (e.g. `/dev/dm`) followed by a generic device (e.g. `/dev/null`), two identical block devices (`/dev/dm`) would be requested to the kata agent, causing the agent to exit with error: > Conflicting device updates for /dev/dm-2 As the generic device type does not hit any cases defined in `switch`, the variable `kataDevice` which is defined outside of the loop is still the value of the previous block device rather than `nil`. Defining `kataDevice` in the loop fixes this bug. Signed-off-by: cncal <flycalvin@qq.com>	2024-05-12 16:38:37 +08:00
Fabiano Fidêncio	9713558477	k0s: Use a different port for kube-router's metrics kube-router decided to use :8080 for its metrics, and this seems to be a change that affected k0s 1.30.0+, leading to the kube-router pod crashing all the time, so that nothing can actually be started after that. Due to this issue, let's simply use a different port (:9999) and move on with our tests. Fixes: #9623 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-11 23:18:20 +02:00
Fabiano Fidêncio	4cd048444d	build: nvidia-gpu: Fix cache usage of the headers tarball Whenever we count on having the headers tarball, we must unpack the cached content into the expected directory, otherwise we'd simply fail, as we've been failing in our CI, at the end of the process where we generate the tarball from the cached components. It's weird to me, sincerely, that the headers tarball ends up in such a weird place (build/kernel-nvidia-gpu/builddir/), but I'll leave that to Zvonko to figure out whether something better can be done, as the intent of this PR is simply to unblock Kata Containers CI. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-11 17:59:53 +02:00
Zvonko Kaiser	693e307f72	deploy: Add artefact repository New env var so everyone can test the PUSH_TO_REGISTRY feature export PUSH_TO_REGISTRY=yes export ARTEFACT_REGISTRY=quay.io export ARTEFACT_REPOSITORY=my-fancy-kata-containers export ARTEFACT_REGISTRY_USERNAME=zvonkok export ARTEFACT_REGISTRY_PASSWORD=<super-secret> make ...-tarball Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-10 16:41:52 +00:00
Zvonko Kaiser	4dea73b433	Merge pull request #9616 from zvonkok/nv-kernel-hotfix deploy: Fix wrong pushing of artifacts	2024-05-10 18:38:09 +02:00
Zvonko Kaiser	4d0f42a145	deploy: Fix wrong pushing of artifacts Added explicit case statements for nvidia-gpu and nvidia-gpu-confidential Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-10 14:08:32 +00:00
Zvonko Kaiser	85374f55d2	gpu: Add build targets for GPU rootfs initrd/image Preparation for complete GPU rootfs build step #1/#N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-10 09:47:21 +00:00
Zvonko Kaiser	8ec2cc9c0d	threat-model: Add VFIO, ACPI and KVM/VMM threat-model descriptions We're missing several topics in the current threat model, so let's update it. Fixes: #8943 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-10 07:18:44 +00:00
Fabiano Fidêncio	20515fed70	Merge pull request #9484 from zvonkok/nvidia-runtimeclasses deploy: Add runtimeClasses relating to the NVIDIA GPU	2024-05-10 03:52:12 +02:00
Gabriela Cervantes	80e551ea74	metrics: Update launch times script This PR updates the launch times scripts by improving the variable definition as well as trying to use the same format across all the script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-09 21:29:32 +00:00
Emanuel Lima	59c1567f80	runtime-rs: Fix constructing the RTC struct RTC was being built in a wrong fashion on commit #2bc5e3c6e2ab0145fa9e8be95df0d5086c07a517 RTC was being constructed inside the QemuCmdLine struct, but it should've been built inside the devices vector. Signed-off-by: Emanuel Lima <emlima@redhat.com>	2024-05-09 15:00:47 -03:00
Fabiano Fidêncio	2f686b1179	Merge pull request #9608 from fidencio/topic/tdx-depend-on-distro-host-stack-part-II tdx: Adapt kata-deploy to use QEMU / OVMF from the distros	2024-05-09 10:25:19 +02:00
Zvonko Kaiser	da7e6a0f07	deploy: Add runtimeClasses relating to the NVIDIA GPU Fixes: #9483 For the added configurations we need to provide runtimeClasses. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 10:00:59 +02:00
Fabiano Fidêncio	96a100f910	Merge pull request #9482 from zvonkok/kernel-headers-tarball kernel: Add caching of kernel-headers	2024-05-09 09:58:30 +02:00
Fabiano Fidêncio	aba56a8adb	tests: measured-rootfs: Skip policy addition Let's skip the policy addition for now, in order to get the TDX CI back up and running, and then we can re-enable it as soon as we get https://github.com/kata-containers/kata-containers/issues/9612 fixed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	77f457c0e1	runtime: tdx: Drop sept-ve-disable=on This was needed when we were using an old (and not maintained anymore) host stack. Considering what we have as part of the distros today, this can simply be dropped, as I cannot find any reference to this one being needed in any up-to-date documentation. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	416d00228c	Revert "qemu: tdx: Adapt command line" (partially) This reverts commit `b7cccfa019`. The `private=on` bit has never made its way upstream, and was removed from the latest iteration that we're using. With that in mind, let's revert its usage in the code. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	1c3037fd25	Revert "govmm: tdx: Expose the private=on|off knob" This reverts commit `582b5b6b19`. The `private=on` bit has never made its way upstream, and was removed from the latest iteration that we're using. With that in mind, let's revert its addition, and later on its usage in the code. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	a9720495de	kata-deploy: Ensure the distro QEMU and OVMF are used for TDX Here we're checking the distro's `/etc/os-release` or `/usr/lib/os-release` in order to get which distro we're deploying the Kata Containers artefacts to, and then to properly adjust the QEMU and OVMF with TDX support that's been shipped with the distros. Together with that, we're also printing the instructions provided by the distro on how to enable and use TDX. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	f48450b360	runtime: config: tdx: Add QEMU / OVMF placeholder var Let's add the PLACEHOLDER_FOR_DISTRO_{QEMU,OVMF}_WITH_TDX_SUPPORT variables instead of actually setting a path, so we can easily replace those as part of our deployment scripts. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	84b94dc2b1	kata-deploy: Expose /host to the daemon-set We'll need to have access to the host os-release file (either under `/etc/os-release` or under `/usr/lib/os-release`), and the simplest approach that comes to my mind is doing what a debug pod would do: mounting `/` as `/host`, which gives us access to those files and lets us correctly set the TDX specific QEMU and OVMF (TDVF) paths for the available tdx configurations. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	f2d40da8e4	versions: build: Remove unused td-shim entry We haven't been using nor testing with td-shim, as Cloud Hypervisor does not officially support TDX yet, and TDVF is supposed to be used with QEMU, instead of td-shim. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	ea82740b19	versions: build: Remove TDX specific QEMU Let's remove everything related to the TDX specific QEMU building / shipping from our repo, as we'll be relying on the one coming from the distros. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	4292c4c3b1	versions: build: Remove TDX specific OVMF (TDVF) Let's remove everything related to the TDVF building / shipping from our repo, as we'll be relying on the one coming from the distro. Later on, we may need to re-add TDVF logic, as we're already using upstream edk2 repo / content, but when that's needed we'll simply revert this commit. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Alex Lyn	946f0bdfff	Merge pull request #9609 from fidencio/topic/skip-pull-image-tests-on-tees tests: pull-image: Don't run on TEEs	2024-05-09 08:22:55 +08:00
GabyCT	3b8a910393	Merge pull request #9596 from lifupan/main db: fix the issue of failed to init pci root bus	2024-05-08 13:14:20 -06:00
Gabriela Cervantes	2fb406ed3a	metrics: Fix random write value for FIO This PR fixes the random write value for FIO for qemu by decreasing it to avoid the random failures of the GHA CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-08 18:54:41 +00:00
Fabiano Fidêncio	142342012c	tests: pull-image: Don't run on TEEs Let's skip those tests on TEEs as we've been facing a reasonable amount of issues, most likely on the containerd side, related to pulling the image on the guest. Once we're able to fix the issues on containerd, we can get back and re-enable those by reverting this commit. The decision of disabling the tests for TEEs is because the machines may end up in a state where human intervention is necessary to get them back to a functional state, and that's really not optimal for our CI. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-08 18:40:22 +02:00
Fabiano Fidêncio	c0bf9e9bc6	Merge pull request #9607 from fidencio/topic/tdx-depend-on-distro-host-stack-part-I ci: Stop building TDX specific QEMU and OVMF	2024-05-08 15:53:15 +02:00
Zvonko Kaiser	fb0b821771	kernel: Add caching of kernel-headers Fixes: #9481 We need to cache the kernel-headers for the NVIDIA GPU initrd/image build. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-08 11:30:39 +00:00
Fabiano Fidêncio	12dc9f83df	ci: Stop building TDX specific QEMU and OVMF This is the first step of the work to start relying on the artefacts coming from the distros (CentOS 9 Stream, and Ubuntu) themselves. Let's have this first one merged, as this will not run the CI due to the changes being on the yaml itself, and then follow-up with the changes needed on other parts of the project (kata-deploy, runtime, etc). Fixes: #9590 -- part I Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-08 11:39:32 +02:00
Alex Lyn	875e6e3815	Merge pull request #9601 from cncal/fix_redundant_log qemu: the error is logged only when it occurs	2024-05-08 08:59:01 +08:00
GabyCT	22087f9db9	Merge pull request #9598 from lifupan/main_shim runtime-rs: fix the issue of the leak of dead shim	2024-05-07 10:14:11 -06:00
GabyCT	a564422b7b	Merge pull request #9582 from cncal/main build: fix the confusing build message if yq doesn't exist in GOPATH/bin	2024-05-07 09:34:27 -06:00
Fabiano Fidêncio	cd84414c63	Merge pull request #9600 from GabyCT/topic/deleteoci versions: Remove oci information from versions file	2024-05-07 13:15:35 +02:00
Fabiano Fidêncio	ddf6b367c7	Merge pull request #9568 from kata-containers/dependabot/go_modules/src/runtime/go_modules-22ef55fa20 build(deps): bump the go_modules group across 5 directories with 8 updates	2024-05-07 13:14:48 +02:00
Steve Horsman	e967db60ab	Merge pull request #9592 from sprt/mariner-before-ch39 tests: adapt Mariner CI to unblock CH v39 upgrade	2024-05-07 11:52:55 +01:00
cncal	15d511af97	qemu: the error is logged only when it occurs Every time I create a container on an arm64 machine, containerd/kata logs a redundant warning as follows: ``` shell time="2024-05-07" level=warning msg="<nil>" arch=arm64 name=containerd-shim-v2 pid=xxx sandbox=fdd1f05 source=virtcontainers/hypervisor ``` I added an error statement so that the error would be logged when it occurs. Signed-off-by: cncal <flycalvin@qq.com>	2024-05-07 14:28:04 +08:00
Gabriela Cervantes	aecede11fc	versions: Remove oci information from versions file This PR removes oci information from the versions file as it is no longer being used in the kata containers repository. Fixes #9599 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-06 20:14:00 +00:00
Gabriela Cervantes	b54dc26073	gha: Enable uninstall kbs client function for coco gha workflow This PR enables the uninstall kbs client function for coco gha tdx workflow. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-06 15:55:24 +00:00
Gabriela Cervantes	aaf9b54d97	gha: Add support to install KBS to k8s TDX GHA workflow This PR adds support to install KBS to k8s TDX GHA workflow in order to run confidential attestation tests. Fixes #9451 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-06 15:42:17 +00:00
Gabriela Cervantes	506e17a60d	tests: Add k8s negative policy test This PR adds a k8s negative policy test to the confidential attestation bats test. Fixes #9437 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-06 15:28:54 +00:00
Fupan Li	3694f3d9fe	runtime-rs: fix the issue of the leak of dead shim We should init and assign the runtime instance to the runtime handler; otherwise, if the pause container fails to start, which means the runtime instance failed to start, the following delete & shutdown requests would not be run, and the dead shim would be left behind. Fixes: #9597 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-05-06 17:31:31 +08:00
Fupan Li	26bee78e8d	db: fix the issue of failed to init pci root bus dragonball reserves 2048G of mmio space for the pci root bus by default on physical addresses greater than 4G. However, for some machines with smaller physical address widths, such as 39-bit wide physical addresses, dragonball reserves the mmio space when initializing the memory. It is less than 2048G, so this commit dynamically calculates and allocates the mmio size of each pci root bus. Fixes: #9509 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-05-06 11:34:18 +08:00
Aurélien Bombo	0cc2b07a8c	tests: adapt Mariner CI to unblock CH v39 upgrade The CH v39 upgrade in #9575 is currently blocked because of a bug in the Mariner host kernel. To address this, we temporarily tweak the Mariner CI to use an Ubuntu host and the Kata guest kernel, while retaining the Mariner initrd. This is tracked in #9594. Importantly, this allows us to preserve CI for genpolicy. We had to tweak the default rules.rego however, as the OCI version is now different in the Ubuntu host. This is tracked in #9593. This change has been tested together with CH v39 in #9588. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-05-03 16:29:12 +00:00
cncal	48d873b52b	build: fix the confusing build message if yq doesn't exist in GOPATH/bin The build message shows that yq was not found when I tried to build runtime binaries, but I've actually installed yq by yum install. Signed-off-by: cncal <flycalvin@qq.com>	2024-05-03 08:34:45 +08:00
cncal	9caa7beb1f	runtime: make kata-runtime check error more understandable If device /dev/kvm does not exist, kata-runtime check would fail with an ambiguous error message 'no such file or directory'. I added a few more details to make it understandable, and it will look like: ``` ERRO[0000] cannot open kvm device: no such file or directory arch=arm64 check-type=full device=/dev/kvm name=kata-runtime pid=2849085 source=runtime ERRO[0000] no such file or directory arch=arm64 name=kata-runtime pid=2849085 source=runtime no such file or directory ``` Signed-off-by: cncal <flycalvin@qq.com>	2024-05-03 08:29:08 +08:00
Zvonko Kaiser	e5e0983b56	Merge pull request #9476 from zvonkok/nvidia-config-tomls config: Add NVIDIA GPU SNP, TDX configuration files	2024-05-02 10:27:10 +02:00
Fabiano Fidêncio	f04a7a55ed	Merge pull request #9563 from fidencio/topic/agent-use-policy-by-default build: Build the shipped agent with policy enabled	2024-05-01 12:22:05 +02:00
Fabiano Fidêncio	33a8701904	Merge pull request #9573 from littlejawa/kata_deploy_crio_conf kata-deploy: configure debugging for crio	2024-05-01 12:19:10 +02:00
Julien Ropé	c2aed995b7	kata-deploy: configure debugging for crio Fix the configuration for crio's log_level Fixes: #9556 Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-04-30 17:48:43 +02:00
stevenhorsman	3c2232d898	runtime: fix testVersionString logic - The testVersionString logic use regex to check that the ociVersion is displayed correctly, but with the new go module that version has a `+` in, so we need to quote this to escape special characters Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-30 10:54:49 +01:00
dependabot[bot]	391bc35805	build(deps): bump the go_modules group across 5 directories with 8 updates Bumps the go_modules group with 2 updates in the /src/runtime directory: [github.com/containerd/containerd](https://github.com/containerd/containerd) and [github.com/containers/podman/v4](https://github.com/containers/podman). Bumps the go_modules group with 4 updates in the /src/tools/csi-kata-directvolume directory: [golang.org/x/sys](https://github.com/golang/sys), google.golang.org/protobuf, [golang.org/x/net](https://github.com/golang/net) and [google.golang.org/grpc](https://github.com/grpc/grpc-go). Bumps the go_modules group with 2 updates in the /src/tools/log-parser directory: [golang.org/x/sys](https://github.com/golang/sys) and gopkg.in/yaml.v3. Bumps the go_modules group with 2 updates in the /tests directory: [golang.org/x/sys](https://github.com/golang/sys) and gopkg.in/yaml.v3. Bumps the go_modules group with 2 updates in the /tools/testing/kata-webhook directory: [golang.org/x/sys](https://github.com/golang/sys) and [golang.org/x/net](https://github.com/golang/net). 
Updates `github.com/containerd/containerd` from 1.7.2 to 1.7.11 - [Release notes](https://github.com/containerd/containerd/releases) - [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md) - [Commits](https://github.com/containerd/containerd/compare/v1.7.2...v1.7.11) Updates `github.com/containers/podman/v4` from 4.2.0 to 4.9.4 - [Release notes](https://github.com/containers/podman/releases) - [Changelog](https://github.com/containers/podman/blob/v4.9.4/RELEASE_NOTES.md) - [Commits](https://github.com/containers/podman/compare/v4.2.0...v4.9.4) Updates `google.golang.org/protobuf` from 1.29.1 to 1.33.0 Updates `github.com/cyphar/filepath-securejoin` from 0.2.3 to 0.2.4 - [Release notes](https://github.com/cyphar/filepath-securejoin/releases) - [Commits](https://github.com/cyphar/filepath-securejoin/compare/v0.2.3...v0.2.4) Updates `golang.org/x/sys` from 0.15.0 to 0.19.0 - [Commits](https://github.com/golang/sys/compare/v0.15.0...v0.19.0) Updates `google.golang.org/protobuf` from 1.31.0 to 1.33.0 Updates `golang.org/x/net` from 0.19.0 to 0.23.0 - [Commits](https://github.com/golang/net/compare/v0.19.0...v0.23.0) Updates `google.golang.org/grpc` from 1.59.0 to 1.63.2 - [Release notes](https://github.com/grpc/grpc-go/releases) - [Commits](https://github.com/grpc/grpc-go/compare/v1.59.0...v1.63.2) Updates `golang.org/x/sys` from 0.0.0-20191026070338-33540a1f6037 to 0.1.0 - [Commits](https://github.com/golang/sys/compare/v0.15.0...v0.19.0) Updates `gopkg.in/yaml.v3` from 3.0.0-20200313102051-9f266ea9e77c to 3.0.0 Updates `golang.org/x/sys` from 0.0.0-20220429233432-b5fbb4746d32 to 0.19.0 - [Commits](https://github.com/golang/sys/compare/v0.15.0...v0.19.0) Updates `gopkg.in/yaml.v3` from 3.0.0-20210107192922-496545a6307b to 3.0.0 Updates `golang.org/x/sys` from 0.15.0 to 0.19.0 - [Commits](https://github.com/golang/sys/compare/v0.15.0...v0.19.0) Updates `golang.org/x/net` from 0.19.0 to 0.23.0 - 
[Commits](https://github.com/golang/net/compare/v0.19.0...v0.23.0) --- updated-dependencies: - dependency-name: github.com/containerd/containerd dependency-type: direct:production dependency-group: go_modules - dependency-name: github.com/containers/podman/v4 dependency-type: direct:production dependency-group: go_modules - dependency-name: google.golang.org/protobuf dependency-type: direct:production dependency-group: go_modules - dependency-name: github.com/cyphar/filepath-securejoin dependency-type: indirect dependency-group: go_modules - dependency-name: golang.org/x/sys dependency-type: indirect dependency-group: go_modules - dependency-name: google.golang.org/protobuf dependency-type: indirect dependency-group: go_modules - dependency-name: golang.org/x/net dependency-type: direct:production dependency-group: go_modules - dependency-name: google.golang.org/grpc dependency-type: direct:production dependency-group: go_modules - dependency-name: golang.org/x/sys dependency-type: indirect dependency-group: go_modules - dependency-name: gopkg.in/yaml.v3 dependency-type: indirect dependency-group: go_modules - dependency-name: golang.org/x/sys dependency-type: indirect dependency-group: go_modules - dependency-name: gopkg.in/yaml.v3 dependency-type: indirect dependency-group: go_modules - dependency-name: golang.org/x/sys dependency-type: indirect dependency-group: go_modules - dependency-name: golang.org/x/net dependency-type: indirect dependency-group: go_modules ... Signed-off-by: dependabot[bot] <support@github.com>	2024-04-30 09:46:13 +01:00
Wainer Moschetta	eae429a39b	Merge pull request #9552 from wainersm/kata_cc_dev runtime: new qemu-coco-dev configuration	2024-04-30 05:21:49 -03:00
Zvonko Kaiser	28078ded84	Merge pull request #9570 from stevenhorsman/dependabot-commit-check-skip workflow: static-checks: Skip commit checks for dependabout	2024-04-29 23:00:35 +02:00
Pavel Mores	1dd06cf40d	Merge pull request #9551 from pmores/support-iommu runtime-rs: support IOMMU in qemu VMs	2024-04-29 15:26:11 +02:00
stevenhorsman	0bec8721cc	workflow: Skip commit checks for dependabout Dependabot doesn't follow all our commit format guidelines, so add a check and skip these if the author is `dependabot[bot]` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-29 13:45:51 +01:00
Wainer dos Santos Moschetta	631f6f6ed6	gha: switch CoCo tests on non-TEE to use qemu-coco-dev With the addition of the 'qemu-coco-dev' runtimeClass we no longer need to run CoCo tests on non-TEE environments with 'qemu'. As a result the tests also no longer need to set the "io.katacontainers.config.hypervisor.image" annotation to pods. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-29 05:45:11 -03:00
Wainer dos Santos Moschetta	c6708726ff	kata-deploy: install the new kata-qemu-coco-dev runtimeclass Created the runtimeclasses/kata-qemu-coco-dev.yaml file and updated the list of SHIMS. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-29 05:45:11 -03:00
Wainer dos Santos Moschetta	42fb5d7760	runtime: new qemu-coco-dev configuration Created a new configuration to configure Kata for CoCo without requiring TEE hardware, so as to allow developers to implement/test/debug platform agnostic code on their workstations. It will also ease testing of CoCo features on CI with non-TEE supported VMs. This is based off the qemu configuration. The following differences apply: - switched to confidential guest image/initrd - switched to confidential kernel - switched to 9p shared_fs Fixes #9487 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-29 05:45:10 -03:00
Fabiano Fidêncio	d3b300ff95	build: tests: Remove agent-opa Now that the `kata-agent` is being built with policy support, let's stop building the `kata-opa-agent`, reducing the amount of things we need to test and maintain. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-28 12:52:54 +02:00
Fabiano Fidêncio	b1710ee2c0	build: Build the shipped agent with policy enabled Now that the OPA binary is not required anymore, let's start shipping the agent with the policy enabled by default. The agent without policy enabled has 30MB, while it's 34MB with the policy enabled. This 4MB (~10%) increase is, IMHO, worth it in order to reduce the amount of components we have to maintain and test, including the possibility to also reduce the amount of possible rootfs / initrd images. Whoever wants to use the agent without policy enabled can simply do that by building their own agent. :-) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-28 12:52:54 +02:00
Fabiano Fidêncio	7b039eb1b9	Merge pull request #9559 from fidencio/topic/remove-opa-stuff rootfs: Stop building and shipping OPA	2024-04-28 12:52:07 +02:00
Fabiano Fidêncio	fe21d7a58b	rootfs: Stop building and shipping OPA Since OPA binary was replaced by the regorus crate, we can finally stop building and shipping the binary. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-26 18:51:28 +02:00
Fabiano Fidêncio	7dd2fde22d	Revert "rootfs: Make OPA build working in docker for s390x and ppc64le" This reverts commit `d523e865c0`, as we will not depend on the OPA binary anymore. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-26 18:51:27 +02:00
Hyounggyu Choi	62bad976e0	Merge pull request #9562 from BbolroC/bump-golang build: Update golang version to 1.22.2	2024-04-26 17:58:04 +02:00
Steve Horsman	34a1cdc5c7	Merge pull request #9528 from cncal/patch-1 doc: fix missing document link	2024-04-26 15:22:15 +01:00
Hyounggyu Choi	80cb4a6c18	build: Update golang version to 1.22.2 As we have an issue with a golang version for `run-cri-containerd`, it is required to bump the language. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-26 15:50:29 +02:00
Pavel Mores	908ec31d9b	runtime-rs: fix iommu_platform support for qemu vhost-user-fs device iommu_platform support was already added on initial DeviceVhostUserFs introduction, however it incorrectly enabled iommu_platform also on non-CCW (e.g. PCI) systems. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-04-26 14:48:00 +02:00
Pavel Mores	174fc8f44b	runtime-rs: support iommu_platform for qemu virtio-net device Note that it's only supported on CCW systems. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-04-26 14:48:00 +02:00
Pavel Mores	0d038f20cc	runtime-rs: support iommu_platform for qemu virtio-serial device iommu_platform is only turned on for CCW systems. PartialEq is added to VirtioBusType to enable the '==' operator. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-04-26 14:48:00 +02:00
Pavel Mores	66a2dc48ae	runtime-rs: support iommu_platform for qemu vhost-vsock device iommu_platform addition is controlled solely by the configuration file. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-04-26 14:48:00 +02:00
Pavel Mores	d1e6f9cc4e	runtime-rs: add IOMMU to qemu VM if configured The adding itself is done by a new function add_iommu() that conforms with the add_() convention. Note though that this function is called internally, by the QemuCmdLine constructor, simply because there's nothing to trigger its invocation from QemuInner (unlike the other add_() functions so far). Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-04-26 14:48:00 +02:00
Pavel Mores	0859f47a17	runtime-rs: add representation of '-device intel-iommu' to qemu-rs Following the golang shim example, the values are hardcoded. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-04-26 14:47:51 +02:00
Pavel Mores	702bf0d35e	runtime-rs: support qemu machine's 'kernel_irqchip' param We will want to set kernel_irqchip when enabling IOMMU and this commit adds the requisite support. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-04-26 14:42:54 +02:00
Alex Lyn	f72c6ba814	Merge pull request #9519 from emanuellima1/impl-rtc runtime-rs: Add RTC to QEMU cmdline	2024-04-26 17:44:47 +08:00
Dan Mihai	b42ddaf15f	Merge pull request #9530 from microsoft/saulparedes/improve_caching genpolicy: changing caching so the tool can run concurrently with itself	2024-04-25 13:06:23 -07:00
David Esparza	ae317a319f	Merge pull request #9549 from JakubLedworowski/fix-tarball-dockerfile build: Fix tarball not building correctly in docker	2024-04-25 09:40:20 -06:00
James O. D. Hunt	5bd614530f	Merge pull request #9525 from jodh-intel/gha-k8s-ch-dm gha: Enable k8s tests for cloud hypervisor with devicemapper	2024-04-25 09:28:09 +01:00
Fabiano Fidêncio	b4360e7e37	Merge pull request #9510 from microsoft/danmihai1/regorus-policy2 agent: use regorus instead of opa	2024-04-24 21:40:29 +02:00
James O. D. Hunt	ff7349b6f0	gha: Enable k8s tests for cloud hypervisor with devicemapper Enable the k8s tests for cloud hypervisor with devicemapper. Fixes: #9221. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Co-authored-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-24 16:32:51 +01:00
Dan Mihai	2400a4d249	Merge pull request #9428 from arc9693/archana1/genplicyfixes genpolicy: implement default methods for K8sResource trait	2024-04-24 08:04:19 -07:00
Dan Mihai	ff385eac41	agent: remove unnecessary comment Remove reminder to initialize Policy earlier, because currently there are no plans to initialize earlier. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-24 14:53:51 +00:00
Jakub Ledworowski	73366da9f9	build: Fix tarball not building correctly in docker When docker is installed on the host system using script from https://get.docker.com/ it automatically creates a docker group with gid=999. Then during docker build process of tarball, eg. make qemu-tdx-experimental-tarball docker is also installed inside the image with the same script, which also automatically adds docker group with gid=999. Then, the build tries to add a new group docker_on_host with gid=999, which already exists, which breaks the build. Signed-off-by: Jakub Ledworowski <jakub.ledworowski@intel.com>	2024-04-24 15:35:36 +02:00
Calvin Liu	56a73ee704	doc: fix missing document link The hardware-requirements document section is located in /README.md for now. Signed-off-by: Calvin Liu <flycalvin@qq.com>	2024-04-24 17:34:30 +08:00
Fabiano Fidêncio	4e35f11a3d	Merge pull request #9535 from fidencio/topic/fix-crio-debug-drop-in kata-deploy: Stop append `log_level = "debug"` for CRI-O	2024-04-24 10:03:36 +02:00
Dan Mihai	89c85dfe84	Merge pull request #9432 from UiPath/fix-clh-wait clh: isClhRunning waits for full timeout when clh exits	2024-04-23 13:02:45 -07:00
Hyounggyu Choi	608df9b7df	Merge pull request #9494 from BbolroC/guest-pull-gha-s390x CC: Enable guest-pull tests on non-TEE for s390x	2024-04-23 21:22:37 +02:00
Dan Mihai	e5c3f5fa9b	tests: no generated policy for untested platforms Avoid auto-generating Policy on platforms that haven't been tested yet with auto-generated Policy. Support for auto-generated Policy on these additional platforms is coming up in future PRs, so the tests being fixed here were prematurely enabled. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-23 16:07:03 +00:00
Emanuel Lima	2bc5e3c6e2	runtime-rs: Add RTC to QEMU cmdline Add RTC by hardcoding the options base=utc,driftfix=slew,clock=host Signed-off-by: Emanuel Lima <emlima@redhat.com>	2024-04-23 10:46:30 -03:00
Fabiano Fidêncio	d190c9d4d9	kata-deploy: Stop append `log_level = "debug"` for CRI-O This should only be done once, and if CRI-O restarts, there's a big chance kata-deploy will also restart and the user would end up with a file that looks like: ``` [crio] log_level = "debug" [crio] log_level = "debug" [crio] log_level = "debug" ... ``` And that would simply cause CRI-O to not start. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-23 14:51:35 +02:00
Greg Kurz	42a79801f3	Merge pull request #9524 from littlejawa/fix_createruntime_hook_not_called runtime: Call CreateRuntime hooks at container creation time	2024-04-23 13:43:36 +02:00
Fupan Li	469c4e4f44	Merge pull request #9335 from Tim-Zhang/fix-passfd-fifo-open passfd-io: fix FIFO opening and vsock handling	2024-04-23 09:04:45 +08:00
Alex Lyn	bc2cf95e7a	Merge pull request #9517 from amshinde/update-storage-source-pciblock runtime-rs: Update storage source for pci block devices	2024-04-23 07:32:36 +08:00
Dan Mihai	5d31eb4847	agent: use regorus 0.1.4 Use regorus 0.1.4 from crates.io, instead of its source code repository. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-22 23:21:17 +00:00
Dan Mihai	ed6412b63c	tests: k8s: reduce the policy tests output noise Hide some of the kubectl output, to reduce the size and redundancy of this output. Fixes: #9388 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-22 19:59:33 +00:00
Dan Mihai	df23eb09a6	agent: use regorus instead of opa Implement Agent Policy using the regorus crate instead of the OPA daemon. The OPA daemon will be removed from the Guest rootfs in a future PR. Fixes: #9388 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-22 19:58:30 +00:00
Dan Mihai	58e608d61a	tests: remove k8s-policy-set-keys.bats Remove k8s-policy-set-keys.bats in preparation for using the regorus crate instead of the OPA daemon for evaluating the Agent Policy. This test depended on sending HTTP requests to OPA. Fixes: #9388 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-22 19:49:38 +00:00
Dan Mihai	b509c1beee	agent: lock anyhow version to 1.0.58 Lock anyhow version to 1.0.58 because: - Versions between 1.0.59 - 1.0.76 have not been tested yet using Kata CI. However, those versions pass "make test" for the Kata Agent. - Versions 1.0.77 or newer fail during "make test" - see https://github.com/kata-containers/kata-containers/issues/9538. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-22 19:49:15 +00:00
Archana Shinde	cc6b671101	runtime-rs: Update storage source for pci block devices In case of block devices using virtio-block, we need to pass the pci-path as the storage source field to the agent. Currently the virt-path is being passed, which works only for mmio block devices. In the future when support is added for scsi, block-ccw and pmem devices, the storage source would need to be handled accordingly. Fixes: #9034 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-04-22 11:36:58 -07:00
Hyounggyu Choi	f10744df99	CC: Enable guest-pull tests on non-TEE for s390x This commit is to add a new CI job to run-k8s-tests-on-zvsi.yaml. Why the job is not configured in run-kata-coco-tests.yaml by having it integrated with `run-k8s-tests-coco-nontee` is: - It uses k3s instead of AKS - It runs on a self-hosted runner These differences make the integrated job not easy to read and maintain when it comes to incorporating other platforms in the near future. Fixes: #9467 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-22 17:15:20 +02:00
Greg Kurz	6ca0f09710	Merge pull request #9518 from microsoft/danmihai1/agent-cargo-lock agent: update cargo.lock	2024-04-22 13:36:06 +02:00
Tim Zhang	aeba483ec8	agent: avoid fd leakage of passfd-io In do_create_container and do_exec_process, we should create the proc_io first, so that if an error occurs later on, we can make sure the io stream is closed. Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-04-22 17:39:33 +08:00
Tim Zhang	8441187d5e	runtime-rs: fix FIFO handling Fixes: #9334 On Linux, when a FIFO is opened and there are no writers, the reader will continuously receive the HUP event. This can be problematic. To avoid this problem, we open stdin in write mode and keep the stdin writer. We also need to open stdout/stderr in read mode and keep the open endpoint until the process is deleted. Otherwise, the process could exit before the containerd side opens and reads the stdout fifo: runD would write all of the stdout contents into the stdout fifo and then close the write endpoint. Containerd would then open the stdout fifo and try to read, and since the write side had already been closed, containerd would block on the read forever. Here we keep the stdout/stderr read endpoint File in the common_process, which is destroyed when containerd sends the delete rpc call. By that time containerd has already waited for the stdout read to return, which ensures the contents of the stdout/stderr fifo are not lost. Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-04-22 17:39:33 +08:00
Tim Zhang	d68eb7f0ad	agent: Fix close_stdin for passfd-io In scenario passfd-io, we should wait for stdin to close itself instead of manually intervening in it. Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-04-22 17:39:32 +08:00
Steve Horsman	ff9985fc50	Merge pull request #9490 from wainersm/port_attestation_nontee_job gha: move attestation tests to run-k8s-tests-coco-nontee	2024-04-22 10:23:11 +01:00
Archana Choudhary	4a010cf71b	genpolicy: add default implementations for K8sResource trait This commit adds default implementations for following methods of K8sResource trait: - generate_policy - serialize Fixes: #8960 Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-04-21 12:59:02 +00:00
Archana Choudhary	6edc3b6b0a	genpolicy: add default implementation for use_sandbox_pidns This patch adds a default implementation for the use_sandbox_pidns and updates the structs that implement the K8sResource trait to use the default. Fixes: #8960 Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-04-21 12:59:02 +00:00
Archana Choudhary	d5d3f9cda7	genpolicy: add default implementation for use_host_network - Provide default implementation for use_host_network - Remove default implementation from structs implementing the trait K8sResource Fixes: #8960 Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-04-21 12:59:02 +00:00
Archana Choudhary	9a3eac5306	genpolicy: add default impl for get_containers - Provide default impl for get_containers - Remove default impl from structs implementing the trait K8sResource Fixes: #8960 Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-04-21 12:59:02 +00:00
Archana Choudhary	2db3470602	genpolicy: add default impl for get_container_mounts_and_storages - Provide default impl for get_container_mounts_and_storages - Remove default impl from structs implementing the trait K8sResource Fixes: #8960 Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-04-21 12:59:02 +00:00
Archana Choudhary	09b0b4c11d	genpolicy: add default implementation for get_sandbox_name - Provide default implementation for get_sandbox_name in K8sResource trait - Remove default implementation from structs implementing the trait K8sResource Fixes: #8960 Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-04-21 12:55:32 +00:00
Archana Choudhary	43e9de8125	genpolicy: add default implementation for get_annotations - Provide default implementation for get_annotations. - Remove default implementation from structs implementing the trait K8sResource Fixes: #8960 Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-04-21 12:55:32 +00:00
Saul Paredes	2149cb6502	genpolicy: changing caching so the tool can run concurrently with itself Based on 3a1461b0a5186a92afedaaea33ff2bd120d1cea0 Previously the tool would use the layers_cache folder for all instances and hence delete the cache when it was done, interfering with other instances. This change makes it so that each instance of the tool will have its own temp folder to use. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-04-19 15:46:30 -07:00
Wainer dos Santos Moschetta	1e35291fd5	gha: move attestation tests to run-k8s-tests-coco-nontee The new run-k8s-tests-coco-nontee job should be the home of attestation tests. Changed run-k8s-tests-coco-nontee to get KBS installed and by the time the KBS variable is exported in the environment then the attestation tests will kick in (likewise they will skip in run-k8s-tests-on-aks). Fixes #9455 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-19 14:51:30 -03:00
Steve Horsman	7e12d588c0	Merge pull request #9485 from sparky005/update_golang.org/x/net update golang.org/x/net	2024-04-19 11:26:13 +01:00
Amulya Meka	12964256a4	Merge pull request #9521 from Amulyam24/gha gha: tag k8s tests on ppc64le to ppc64le-runner-01	2024-04-19 15:08:08 +05:30
Julien Ropé	70e798ed35	runtime: Call CreateRuntime hooks at container creation time CreateRuntime hooks are called at the CreateSandbox time, but not after CreateContainer. Fixes: #9523 Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-04-19 10:25:02 +02:00
Alex Lyn	3456483df9	Merge pull request #9513 from stevenhorsman/bump-stale-version gha: stale: Bump stalebot version	2024-04-19 15:15:10 +08:00
Alex Lyn	c147f0f4ed	Merge pull request #9516 from sprt/rlz-340 release: bump version for 3.4.0 release	2024-04-19 15:12:26 +08:00
Amulyam24	8255ed248a	gha: tag k8s tests on ppc64le to ppc64le-runner-01 This PR aims at running the k8s tests to one runner on ppc64le. Fixes: #9520 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-04-19 12:04:25 +05:30
Hyounggyu Choi	304dc1e4da	doc: Update how-to-run-kata-containers-with-SE-VMs.md This is to update a document `how-to-run-kata-containers-with-SE-VMs` on using confidential artifacts to build a secure image. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-19 08:31:12 +02:00
Hyounggyu Choi	8fbed9f6a4	local-build: Use confidential kernel and initrd for boot-image-se This is to make `boot-image-se-tarball` use confidential kernel and initrd instead of vanilla version of artifacts. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-19 07:09:04 +02:00
Dan Mihai	4242801b1c	agent: update cargo.lock Update Kata Agent's Cargo.lock after the recent changes to Cargo.toml. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-18 17:12:48 +00:00
Aurélien Bombo	95971e4a42	release: bump version for 3.4.0 release Release v3.4.0. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-04-18 17:08:06 +00:00
Steve Horsman	6dd038fd58	Merge pull request #9501 from zvonkok/check-fixes kata: Remove check for "Fixes" in PR	2024-04-18 17:48:50 +01:00
Hyounggyu Choi	2b9c439fcf	Merge pull request #9508 from BbolroC/gha-s390x-k8s-label gha: Make integration tests for s390x run on s390x-large runners	2024-04-18 18:05:01 +02:00
Adil Sadik	1c5ca0c915	runtime: update golang.org/x/net updates golang.org/x/net to newer version that closes some reported vulnerabilities and security issues Fixes #9486 Signed-off-by: Adil Sadik <sparky.005@gmail.com>	2024-04-18 10:55:02 -04:00
Tim Zhang	221c5b51fe	dragonball: fix EPOLLHUP/EPOLLERR events handling in vsock 1. EPOLLHUP events also need to be read, and the read will return len 0. 2. We should kill the connection when EPOLLERR events are received. Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-04-18 20:47:02 +08:00
Hyounggyu Choi	49a0d57f66	gha: Make integration tests for s390x run on s390x-large runners This is to make a workflow `run-k8s-tests` and `run-cri-containerd` (s390x and zvsi) run only on the runners labeled by `s390x-large`. Fixes: #9507 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-18 14:35:24 +02:00
stevenhorsman	cf5c3dc155	gha: stale: Bump stalebot version - Bump the stalebot action version to v9 as that fixes the ``` Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/stale@v8. ``` warning. Fixes: #9512 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-18 11:41:09 +01:00
Steve Horsman	bf16b18180	Merge pull request #9503 from stevenhorsman/stale-pr-remove-date gha: stale: Remove the start-date	2024-04-18 09:36:27 +01:00
Hyounggyu Choi	566a6de594	Merge pull request #9505 from BbolroC/remove-crio-nightly-test-s390x gha: Remove k8s-cri-containerd-rhel9-e2e-tests for s390x	2024-04-18 09:31:07 +02:00
Hyounggyu Choi	cc22dc33f2	Merge pull request #9489 from BbolroC/install-opa-in-docker rootfs: Make OPA build working in docker for s390x and pp…	2024-04-18 00:26:11 +02:00
Dan Mihai	5ceed689eb	Merge pull request #9492 from microsoft/danmihai1/pod-tests tests: k8s: inject agent policy failures (part 3)	2024-04-17 14:01:11 -07:00
Hyounggyu Choi	e046f5e652	gha: Remove k8s-cri-containerd-rhel9-e2e-tests for s390x This commit is simply to remove a CI workflow `k8s-cri-containerd-rhel9-e2e-tests`. Fixes: #9504 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-17 15:36:42 +02:00
Zvonko Kaiser	eda3bfe2ef	config: Add NVIDIA GPU SNP, TDX configuration files Fixes: #9475 For TDX and SNP add NVIDIA specific configuration files Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-04-17 12:49:13 +00:00
Wainer Moschetta	2d8e7933c5	Merge pull request #9461 from GabyCT/topic/uninstallkbs tests/k8s: Add uninstall kbs client command function	2024-04-17 09:36:37 -03:00
Zvonko Kaiser	d7b24c04e5	Merge pull request #9473 from zvonkok/gpu-image-initrd-versions version: add initrd, image NVIDIA sections	2024-04-17 13:22:05 +02:00
stevenhorsman	7235988605	gha: stale: Remove the start-date As documented in https://github.com/actions/stale?tab=readme-ov-file#start-date > The start date is used to ignore the issues and pull requests created before the start date. > Particularly useful when you wish to add this stale workflow on an existing repository > and only wish to stale the new issues and pull requests. As we don't need to treat PRs older than May 2023 as a special case, remove this option. Fixes: #9502 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-17 11:19:56 +01:00
Zvonko Kaiser	395e93acd5	kata: Remove Issue - PR dependency We've discussed this over and over. Let's try to get to an agreement here. I will use this issue to remove the mandatory Issue - PR dependency. Fixes: #9500 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-04-17 09:53:08 +00:00
Archana Shinde	af3b19ed18	Merge pull request #9084 from amshinde/document-intel-gpu-vfio docs: Document Intel Discrete GPUs usage with Kata	2024-04-16 16:17:03 -07:00
Archana Shinde	973a15332a	spell-check: Add missing words to spell-check Add missing words to spell-check dictionaries Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-04-16 11:50:02 -07:00
Archana Shinde	6f97dc1f60	static-checks: Rename file in doc to make static checks happy Configuration file for qemu with runtime-rs was recently renamed. Doc contains name for old file. This was somehow not caught in the CI earlier. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-04-16 11:50:02 -07:00
Archana Shinde	87f0097b18	docs: Document Intel Discrete GPUs usage with Kata Document describes the steps needed to pass an entire Intel Discrete GPU as well as a GPU SR-IOV interface to a Kata Container. Fixes: #9083 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-04-16 11:50:02 -07:00
Dan Mihai	2c4d1ef76b	tests: k8s: inject agent policy failures (part 3) Auto-generate the policy and then simulate attacks from the K8s control plane by modifying the test yaml files. The policy then detects and blocks those changes. These test cases are using K8s Pods. Additional policy failures are injected during CI using other types of K8s resources - e.g., using Jobs and Replication Controllers - from separate PRs. Fixes: #9491 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-16 18:15:12 +00:00
Dan Mihai	c26dad8fe5	Merge pull request #9294 from burgerdev/burgerdev/genpolicy-configurable-pause genpolicy: support insecure registries and custom pause containers	2024-04-16 09:39:33 -07:00
GabyCT	9238daf729	Merge pull request #9464 from microsoft/danmihai1/rc-tests tests: k8s: inject agent policy failures (part2)	2024-04-16 10:01:39 -06:00
Hyounggyu Choi	d523e865c0	rootfs: Make OPA build working in docker for s390x and ppc64le The commit is to make the OPA build from source working in `ubuntu-rootfs-osbuilder`. To achieve the goal, the configuration is changed as follows: - Switch the make target to `ci-build-linux-static` not triggering docker-in-docker build - Install go in the builder image for s390x and ppc64le Fixes: #9466 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-16 16:49:12 +02:00
Greg Kurz	aca6a1bcb5	Merge pull request #9353 from pmores/pr-8866-follow-up runtime-rs: refactor qemu driver	2024-04-16 16:07:36 +02:00
Fabiano Fidêncio	7bb5490676	Merge pull request #9479 from wainersm/fix_coco_nontee_jobs gha: make run-kata-coco-tests inherit secrets	2024-04-16 13:46:52 +02:00
Hyounggyu Choi	7b11fd2546	Merge pull request #9471 from BbolroC/coco-kernel-version-s390x version: Add coco name and version for {image,initrd} for s390x	2024-04-15 16:03:20 +02:00
Wainer dos Santos Moschetta	77541008fc	gha: make run-kata-coco-tests inherit secrets The new CoCo non-tee job introduced on commit `0d5399ba92` needs to read secrets like AZ_TENANT_ID, so run-kata-coco-tests workflow should inherit the secrets from the caller workflow. Fixes #9477 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-15 10:53:44 -03:00
Zvonko Kaiser	78e3ebb011	version: add initrd, image NVIDIA sections Fixes: #9472 For initrd and image, the related NVIDIA will not use the default targets and we will pin them to a specific release. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-04-15 13:31:35 +00:00
Wainer Moschetta	c85e1ca674	Merge pull request #9404 from ldoktor/ci-mcp-timeout ci.ocp: Increase the MCP update time	2024-04-15 09:42:14 -03:00
Hyounggyu Choi	3ec209dcf1	Merge pull request #9469 from BbolroC/coco-kernel-config-s390x kernel: Adjust s390x config for confidential containers	2024-04-15 13:55:28 +02:00
Hyounggyu Choi	8fce600493	version: Add coco name and version for {image,initrd} for s390x In order to build a coco {image,initrd}, it is required to specify its name and version in versions.yaml. This commit is to add the configuration for them, respectively. Fixes: #9470 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-15 12:53:00 +02:00
Hyounggyu Choi	a792dc3e2b	kernel: Adjust s390x config for confidential containers `CONFIG_TN3270_TTY` and `CONFIG_S390_AP_IOMMU` are dropped for s390x in 6.7.x which is used for a confidential kernel. But they are still used for a vanilla kernel. So we need to add them to the whitelist. Fixes: #9465 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-15 10:28:59 +02:00
Hyounggyu Choi	32f58abfde	Merge pull request #9403 from BbolroC/runtime-rs-ci-qemu CI: Enable GHA cri-containerd workflow for runtime-rs with QEMU	2024-04-15 09:31:25 +02:00
Xuewei Niu	402d8a968e	Merge pull request #9430 from UiPath/fix-agent-shutdown agent: shutdown vm on exit when agent is used as init process	2024-04-15 10:47:07 +08:00
Wainer Moschetta	0a04f54a8e	Merge pull request #9454 from GabyCT/topic/pulltype gha: Define unbound PULL TYPE variable	2024-04-12 14:48:56 -03:00
Wainer Moschetta	a0b21d0e14	Merge pull request #9424 from wainersm/cc_guest_pull-encrypted CC: run guest-pull tests on non-TEE jobs	2024-04-12 09:34:35 -03:00
Hyounggyu Choi	cf20a6a4ae	gha: Add qemu-runtime-rs to VMM matrix for run-cri-containerd This commit expands the VMM matrix for run-cri-containerd, adding a new item `qemu-runtime-rs` for a test scenario where the VMM is QEMU and runtime-rs is employed. This expansion affects the workflows for both x86_64 and s390x platforms. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-12 12:25:53 +02:00
Hyounggyu Choi	606f8e1ab2	runtime-rs: Adjust configuration for qemu-runtime-rs To make `qemu-runtime-rs` working for CI, we have to rename a configuration template file and `CONFIG_FILE_QEMU` in Makefile. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-12 12:25:53 +02:00
Hyounggyu Choi	3c217c6c15	ci\|cri-containerd: Introduce qemu-runtime-rs for KATA_HYPERVISOR `qemu-runtime-rs` will be utilized to handle a test scenario where the VMM is QEMU and runtime-rs is employed. Note: Some of the tests are skipped. They are going to be reintegrated in the follow-up PR (Check out #9375). Fixes: #9371 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-12 12:25:53 +02:00
Alexandru Matei	9e01732f7a	agent: shutdown vm on exit when agent is used as init process Linux kernel generates a panic when the init process exits. The kernel is booted with panic=1, hence this leads to a vm reboot. When used as a service the kata-agent service has an ExecStop option which does a full sync and shuts down the vm. This patch mimics this behavior when kata-agent is used as the init process. Fixes: #9429 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2024-04-12 11:32:31 +03:00
Alexandru Matei	54923164b5	clh: isClhRunning waits for full timeout when clh exits isClhRunning uses signal 0 to test whether the process is still alive or not. This doesn't work because the process is a direct child of the shim. Once it is dead the process becomes zombie. Since no one waits for it the process lingers until its parent dies and init reaps it. Hence sending signal 0 in isClhRunning will always return success whether the process is dead or not. This patch calls wait to reap the process, if it succeeds that means it is our child process, if not we send the signal. Fixes: #9431 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2024-04-12 11:31:53 +03:00
Dan Mihai	e51cbdcff9	tests: k8s: inject agent policy failures (part2) Auto-generate the policy and then simulate attacks from the K8s control plane by modifying the test yaml files. The policy then detects and blocks those changes. These test cases are using K8s Replication Controllers. Additional policy failures will be injected using other types of K8s resources - e.g., using Pods and/or Jobs - in separate PRs. Fixes: #9463 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-11 21:08:53 +00:00
Markus Rudy	77540503f9	genpolicy: add support for insecure registries genpolicy is a handy tool to use in CI systems, to prepare workloads before applying them to the Kubernetes API server. However, many modern build systems like Bazel or Nix restrict network access, and rightfully so, so any registry interaction must take place on localhost. Configuring certificates for localhost is tricky at best, and since there are no privacy concerns for localhost traffic, genpolicy should allow to contact some registries insecurely. As this is a runtime environment detail, not a target environment detail, configuring insecure registries does not belong into the JSON settings, so it's implemented as command line flags. Fixes: #9008 Signed-off-by: Markus Rudy <webmaster@burgerdev.de>	2024-04-11 22:29:03 +02:00
Wainer dos Santos Moschetta	4f74617897	tests: pass --overwrite-existing to aks get-credentials By passing --overwrite-existing to `aks get-credentials` it will stop asking if I want to overwrite the existing credentials. This is handy for running the scripts locally. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-11 15:31:40 -03:00
Wainer dos Santos Moschetta	3508f3a43a	tests/k8s: use CoCo image on guest-pull when non-TEE When running on non-TEE environments (e.g. KATA_HYPERVISOR=qemu) the tests should be stressing the CoCo image (/opt/kata/share/kata-containers/kata-containers-confidential.img) although currently the default image/initrd is built to be able to do guest-pull as well. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-11 15:31:40 -03:00
Wainer dos Santos Moschetta	c24f13431d	tests/k8s: enable guest-pull tests on non-TEE Enabled guest-pull tests on non-TEE environment. It now requires the SNAPSHOTTER environment variable to avoid it running on jobs where nydus-snapshotter is not installed Fixes: #9410 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-11 15:31:40 -03:00
Wainer dos Santos Moschetta	0d5399ba92	gha: Create CoCo tests jobs on non-TEE Created the new run-k8s-tests-coco-nontee jobs for running CoCo tests on non-TEE. It currently generates the run-k8s-tests-coco-nontee(qemu, nydus, guest-pull) job only to run the guest-pull tests. Fixes: #9410 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-11 15:31:40 -03:00
Gabriela Cervantes	5420595d03	tests/k8s: Add uninstall kbs client command function This PR adds a function to uninstall the kbs client command, especially when we are running with baremetal devices. Fixes #9460 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-11 17:06:11 +00:00
Steve Horsman	6b2d655857	Merge pull request #9457 from justxuewei/fs_manager_tests agent: Fix the issue with the "test_new_fs_manager" test	2024-04-11 17:02:58 +01:00
Fabiano Fidêncio	5611233ed8	Merge pull request #9439 from microsoft/danmihai1/job-tests tests: k8s: inject agent policy failures	2024-04-11 17:21:54 +02:00
Markus Rudy	bc2292bc27	genpolicy: make pause container image configurable CRIs don't always use a pause container, but even if they do the concrete container choice is not specified. Even if the CRI config can be tweaked, it's not guaranteed that registries in the public internet can be reached. To be portable across CRI implementations and configurations, the genpolicy user needs to be able to configure the container the tool should append to the policy. Signed-off-by: Markus Rudy <webmaster@burgerdev.de>	2024-04-11 16:26:35 +02:00
Markus Rudy	8b30fa103f	genpolicy: parse json settings during config init Decouple initialization of the Settings struct from creating the AgentPolicy struct, so that the settings are available for evaluating, extending or overriding command line arguments. Signed-off-by: Markus Rudy <webmaster@burgerdev.de>	2024-04-11 16:17:33 +02:00
Xuewei Niu	50f78ec52c	agent: Fix the issue with the "test_new_fs_manager" test This patch introduces a one-time cpath to mitigate the cgroup residuals. It might break the device cgroup merging rules when the cgroup has children. Fixes: #9456 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-04-11 18:06:05 +08:00
GabyCT	08dcdc62de	Merge pull request #9423 from GabyCT/topic/improvecleanup tests: Improve the kbs_k8s_delete function	2024-04-10 14:28:21 -06:00
Gabriela Cervantes	4a2ee3670f	gha: Define unbound PULL TYPE variable This PR defines the PULL_TYPE variable to avoid unbound variable failures when this is being tested locally. Fixes #9453 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-10 17:16:19 +00:00
GabyCT	dab837d71d	Merge pull request #9450 from GabyCT/topic/fixinnydus gha: Fix indentation in gha run script	2024-04-10 11:07:56 -06:00
David Esparza	9e1368dbc5	Merge pull request #9391 from dborquez/add-onednn-openvino-ml-benchs add onednn and openvino ml-benchmarks	2024-04-09 19:03:00 -06:00
Dan Mihai	ea31df8bff	Merge pull request #9185 from microsoft/saulparedes/genpolicy_add_containerd_pull genpolicy: Add optional toggle to pull images using containerd	2024-04-09 12:29:19 -07:00
Gabriela Cervantes	6ebdcf8974	gha: Fix indentation in gha run script This PR fixes an indentation in gha run script. Fixes #9449 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-09 16:37:17 +00:00
Greg Kurz	89353249fc	Merge pull request #8988 from beraldoleal/ci-docs docs: adding an initial CI documentation	2024-04-09 18:26:15 +02:00
Dan Mihai	2252490a96	tests: k8s: inject agent policy failures Auto-generate the policy and then simulate attacks from the K8s control plane by modifying the test yaml files. The policy then detects and blocks those changes. These test cases are using K8s Jobs. Additional policy failures will be injected using other types of K8s resources - e.g., using Pods and/or Replication Controllers - in future PRs. Fixes: #9406 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-09 15:36:57 +00:00
David Esparza	facf3c9364	metrics: Add onednn benchmark. This PR adds onednn test to exercise additional ML benchmarks. Onednn is an Intel-optimized library for Deep Neural Networks. Fixes: #9390 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-04-09 09:05:51 -06:00
David Esparza	3bde511d0d	metrics: Add openvino benchmark. This PR adds openvino test in order to exercise additional ML benchmarks. OpenVINO is used to optimize and deploy deep learning models. Fixes: #9389 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-04-09 09:05:51 -06:00
David Esparza	b37c5f8ba1	metrics:libs: Add HTTPS and HTTP vars to docker build. Include HTTP and HTTPS env variables in the building docker images because they are required to download packages such as Phoronix. Added a restriction that verifies that docker building images is performed as root. Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-04-09 09:05:51 -06:00
David Esparza	3355dd9e2b	metrics:libs: Adds a function to set new kata configuration. Adds a function that receives as a single parameter the name of a valid Kata configuration file which will be established as the default kata configuration to start kata containers. Adds a second function that returns the path to the current kata configuration file. Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-04-09 09:05:51 -06:00
David Esparza	cb4380d1c9	metrics: common: Add function to clean the cache. The function clears the Page Cache only. Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-04-09 09:05:51 -06:00
David Esparza	3a419ba3b1	metrics: common: Add function to update kata config. Add an extra function that updates kata config to use the max num. of vcpus available and to use the available memory in the system. Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-04-09 09:05:51 -06:00
Beraldo Leal	959e56525c	docs: adding an initial CI documentation This is actually a first attempt to document our CI, and all this content was based on the document created by Fabiano Fidencio (kudos to him). We are just moving the content and discussion from Google Docs to here. I used the "poetic license" to add some notes on what I believe our CI will look like in the future. Fixes #9006 Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-04-09 09:21:47 -04:00
Saul Paredes	51498ba99a	genpolicy: toggle containerd pull in tests - Add v1 image test case - Install protobuf-compiler in build check - Reset containerd config to default in kubernetes test if we are testing genpolicy - Update docker_credential crate - Add test that uses default pull method - Use GENPOLICY_PULL_METHOD in test Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-04-08 19:28:29 -07:00
Dan Mihai	f60c9eaec3	Merge pull request #9398 from microsoft/danmihai1/policy-test-cleanup tests: k8s: improve the Agent Policy tests	2024-04-08 15:37:07 -07:00
Gabriela Cervantes	fb4c359cc2	tests: Improve the kbs_k8s_delete function This PR improves the kbs_k8s_delete function to verify that the resources were properly deleted for baremetal environments. Fixes #9379 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-08 18:03:07 +00:00
Saul Paredes	c96ebf237c	genpolicy: add containerd pull method Add optional toggle to use existing containerd installation to pull and manage container images. This adds support to a wider set of images that are currently not supported by standard pull method, such as those that use v1 manifest. Fixes: #9144 Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-04-08 09:56:59 -07:00
Greg Kurz	8b996b9307	Merge pull request #9331 from egernst/foobar katautils: check number of cores on the system instead of go runtime	2024-04-08 18:38:49 +02:00
Greg Kurz	934beb5ae4	Merge pull request #9421 from gkurz/bump-node-js-20 gha: Bump various actions to use Node.js 20	2024-04-08 18:22:28 +02:00
Wainer Moschetta	fba1d394d7	Merge pull request #9369 from ChengyuZhu6/sandbox-image agent:image: Support different pause image in the guest for guest pull	2024-04-08 11:06:21 -03:00
Steve Horsman	3242f55691	Merge pull request #8870 from LindaYu17/aa2main port attestation agent from CCv0 branch to main branch	2024-04-08 15:01:07 +01:00
James O. D. Hunt	42936cb92c	Merge pull request #9372 from jodh-intel/docs-kata-manager-update docs: kata-manager: Update with latest details	2024-04-08 13:23:23 +01:00
stevenhorsman	864e9c22ba	agent: doc: Add new config doc Document the new guest_components_rest_api config parameter Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-08 11:38:53 +01:00
stevenhorsman	29a5652e31	packaging: guest-components, set new environment variables - Set KBC_PROVIDER and ATTESTER rather than TEE_PLATFORM to avoid tss build issues for vTPM attester(s) - There are future plans to make a matching TEE_PLATFORM, so this can be simplified once that is available Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-08 11:38:53 +01:00
stevenhorsman	a284a20a14	tests: Filter CoCo tests on ppc64le/arm - At the moment we aren't supporting ppc64le or aarch64 for CoCo, so filter out these tests from running Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-08 11:38:53 +01:00
stevenhorsman	a0c03966c2	versions: Bump guest-components - Bump guest-components to try and test compatibility with the latest version Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-08 11:38:53 +01:00
stevenhorsman	101a5bf273	packaging: Update guest-components Dockerfile - Switch to Ubuntu 20.04 for building guest-components as the rootfs is based on 20.04, so we need matching GLIBC versions. See #8955 - Add dependencies needed by TDX verifier as we want to build for all platforms Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-08 11:38:53 +01:00
Gabriela Cervantes	6d85025e59	test/k8s: Add basic attestation test - Add basic test case to check that a running pod can use the api-server-rest (and attestation-agent and confidential-data-hub indirectly) to get a resource from a remote KBS Fixes #9057 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Co-authored-by: Linda Yu <linda.yu@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-08 11:38:53 +01:00
Biao Lu	f0edec84f6	agent: Launch api-server-rest If 'rest_api' is configured, let's start the api-server-rest after the attestation-agent and the confidential-data-hub have been started. Fixes: #7555 Signed-off-by: Biao Lu <biao.lu@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com> Co-authored-by: Alex Carter <alex.carter@ibm.com> Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com> Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-04-08 11:38:53 +01:00
Biao lu	4d752e6350	agent: Add config for api-server-rest Add configuration for 'rest api server'. Optional configurations are 'agent.rest_api=attestation' will enable attestation api 'agent.rest_api=resource' will enable resource api 'agent.rest_api=all' will enable all (attestation and resource) api Fixes: #7555 Signed-off-by: Biao Lu <biao.lu@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com> Co-authored-by: Alex Carter <alex.carter@ibm.com> Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com> Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-04-08 11:06:14 +01:00
Biao Lu	f476d671ed	agent: Launch the confidential data hub Let's introduce a new method to start the confidential data hub and the attestation agent. The former depends on the latter, and it needs to be started before the RPC server. Starting the attestation components is based on whether the confidential containers guest components binaries are found in the rootfs. Fixes: #7544 Signed-off-by: Biao Lu <biao.lu@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com> Co-authored-by: Alex Carter <alex.carter@ibm.com> Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com> Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-04-08 11:06:14 +01:00
Greg Kurz	be8f0cb520	Merge pull request #9402 from deagon/feat/debug-threads qemu: show the thread name when enable the hypervisor.debug option	2024-04-08 11:04:36 +02:00
Hyounggyu Choi	e39be7a45e	Merge pull request #9415 from BbolroC/fix-dir-removal-error GHA: Implement secondary GITHUB_WORKSPACE cleanup on 1st failure	2024-04-08 10:44:44 +02:00
ChengyuZhu6	8c897f822c	agent:image: Support different pause image in the guest for guest pull Support different pause images in the guest for guest-pull, such as k8s pause image (registry.k8s.io/pause) and openshift pause image (quay.io/bpradipt/okd-pause). Fixes: #9225 -- part III Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-04-07 09:00:10 +08:00
GabyCT	9d2c5b180e	Merge pull request #9419 from GabyCT/topic/fxlatency metrics: Improve latency test cleanup	2024-04-05 16:31:00 -06:00
Wainer Moschetta	aae7048d4f	Merge pull request #9273 from ldoktor/kcli-coco-kbs tests: Support for kbs setup on kcli	2024-04-05 18:55:58 -03:00
Fabiano Fidêncio	f09bb98f51	Merge pull request #8840 from fidencio/topic/update-tdx-artefacts-to-the-new-host-os tdx: Update TDX artefacts to be used with the Ubuntu 23.10 / CentOS 9 stream OSVs.	2024-04-05 22:36:03 +02:00
Fabiano Fidêncio	cdb8531302	hypervisor: Simplify TDX protection detection Let's rely on the kvm module 'tdx' parameter to do so. This aligns with both OSVs (Canonical, Red Hat, SUSE) and the TDX adoption (https://github.com/intel/tdx-linux) stacks. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-05 19:51:27 +02:00
Fabiano Fidêncio	2ee03b5dc3	tdvf: Adapt the build command This is done in order to match the example from: https://github.com/intel/tdx-linux/wiki/Instruction-to-set-up-TDX-host-and-guest#build-tdvf-image Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-05 19:51:27 +02:00
Fabiano Fidêncio	b7cccfa019	qemu: tdx: Adapt command line This commit is a mess, but I'm not exactly sure what's the best way to make it less messy, as we're getting QEMU TDX to work while partially reverting `1e34220c41`. With that said, let me cover the content of this commit. Firstly, we're reverting all the changes related to "memory-backend-memfd-private", as that's what was used with the previous host stack, but it seems it didn't fly upstream. Secondly, in order to get QEMU to properly work with TDX, we need to enforce the 'private=on' knob and use the "memory-backend-ram", and we're doing so, and also making sure to test the `private=on` newly added knob. I'm sorry for the confusion, I understand this is not optimal, I just don't see an easy path to do changes without leaving the code broken during those changes. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-05 19:51:27 +02:00
Greg Kurz	424a5e243f	gha: Bump to `actions/[down|up]load-artifact@v4` (all the rest) `Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`. This fixes all remaining sites. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:51 +02:00
Greg Kurz	dbc5dc7806	gha: Bump to `actions/[down|up]load-artifact@v4` (k8s tests on garm) `Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`. As explained at [1] : > The contents of an Artifact are uploaded together into an immutable > archive. They cannot be altered by subsequent jobs. Both of these > factors help reduce the possibility of accidentally corrupting > Artifact files. This means that artifacts cannot have the same name. Adapt the `run-k8s-tests-on-garm` workflow accordingly by embedding all the other `${{ vmm.* }}` fields and `${{ inputs.tag }}` in the artifact names that would otherwise collide. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:51 +02:00
Greg Kurz	62a54ffa70	gha: Bump to `actions/[down|up]load-artifact@v4` (kata static tarball) `Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`. As explained at [1] : > The contents of an Artifact are uploaded together into an immutable > archive. They cannot be altered by subsequent jobs. Both of these > factors help reduce the possibility of accidentally corrupting > Artifact files. This means that artifacts cannot have the same name. Adapt all `build-kata-static-tarball` workflows accordingly by embedding `${{ matrix.asset }}` in the artifact names that would otherwise collide. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:51 +02:00
Greg Kurz	7f2ce914a1	gha: Bump to `actions/checkout@v4` `Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:50 +02:00
Greg Kurz	0a43d26c94	gha: Bump to `docker/login-action@v3` `Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:50 +02:00
Greg Kurz	06c9c0d7db	gha: Bump to `docker/build-push-action@v5` `Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:50 +02:00
Greg Kurz	8c21844aef	gha: Bump to `docker/setup-buildx-action@v3` `Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:50 +02:00
Greg Kurz	03cbe6a011	gha: Bump to `docker/setup-qemu-action@v3` `Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:50 +02:00
Hyounggyu Choi	4493459937	GHA: Implement secondary GITHUB_WORKSPACE cleanup on 1st failure Occasionally, the removal of GITHUB_WORKSPACE fails for self-hosted runners because one of the subdirectories is not empty. This is likely due to another process occupying the directory at the time. Implementing a secondary cleanup resolves this issue. This commit focuses on the implementation for the secondary cleanup. Fixes: #9317 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-05 11:41:51 +02:00
Fabiano Fidêncio	6b4cc5ea6a	Revert "qemu: tdx: Workaround SMP issue with TDX 1.5" This reverts commit `d1b54ede29`. Conflicts: src/runtime/virtcontainers/qemu.go This commit was a hack that was needed in order to get QEMU + TDX to work atop of the stack our CI was running on. As we're moving to "the officially supported by distros" host OS, we need to get rid of this. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-05 10:23:52 +02:00
Fabiano Fidêncio	582b5b6b19	govmm: tdx: Expose the private=on|off knob The private=on|off knob is required in order to properly launch a TDX guest VM. This is a brand new property that is part of the still in-flight patches adding TDX support on QEMU. Please, see: `3fdd8072da` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-05 10:23:52 +02:00
Fabiano Fidêncio	fe5adae5d9	qemu-tdx: Update to v8.1.0 + TDX patches Let's update the QEMU to the one that's officially maintained by Intel till all the TDX patches make their way upstream. We've had to also update python to explicitly use python3 and add python3-venv as part of the dependencies. Fixes: #8810 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-05 10:23:51 +02:00
Alex Lyn	0e0a361f0e	Merge pull request #8782 from Apokleos/device-increate-count bugfix and refactor device increate count	2024-04-05 13:43:49 +08:00
Dan Mihai	6f9f8ae285	Merge pull request #9413 from microsoft/saulparedes/ensure_unique_rg_in_gha gha: ensure unique resource group name	2024-04-04 17:13:09 -07:00
GabyCT	80d926c357	Merge pull request #9411 from microsoft/danmihai1/k8s-job tests: k8s-job: wait for job successful create	2024-04-04 15:14:56 -06:00
Gabriela Cervantes	8e5d401be0	metrics: Improve latency test cleanup This PR improves the latency test cleanup in order to avoid random failures of leaving the pods. Fixes #9418 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-04 20:43:53 +00:00
Saul Paredes	f20caac1c0	gha: ensure unique resource group name There's an rg name duplication situation that got introduced by #9385 where 2 different test runs might have same rg name. Add back uniqueness by including the first letter of GENPOLICY_PULL_METHOD to cluster name. Fixes: #9412 Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-04-04 13:13:32 -07:00
GabyCT	aae2679f09	Merge pull request #9409 from GabyCT/topic/ghrunset gha: Define GH_PR_NUMBER variable in gha run k8s common script	2024-04-04 09:46:48 -06:00
Eric Ernst	da01bccd36	katautils: check number of cores on the system instead of go runtime We used to utilize the Go runtime's "NumCPU()", which will give the number of cores available to the Go runtime, which may be a subset of physical cores if the shim is started from within a cpuset. From the function's description: "NumCPU returns the number of logical CPUs usable by the current process." As an example, if containerd is run from within a smaller CPUset, the maximum size of a pod will be dictated by this CPUset, instead of what will be available on the rest of the system. Since the shim will be moved into its own cgroup that may have a different CPUset, let's stick with checking physical cores. This also aligns with what we have documented for maxVCPU handling. In the event we fail to read /proc/cpuinfo, let's use the Go runtime. Fixes: #9327 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2024-04-03 16:09:16 -07:00
Dan Mihai	3e72b3f360	tests: k8s-job: wait for job successful create Don't just verify SuccessfulCreate - wait for it if needed. Fixes: #9138 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 22:11:15 +00:00
Gabriela Cervantes	73f27e28d1	gha: Define GH_PR_NUMBER variable in gha run k8s common script This PR defines the GH_PR_NUMBER variable in gha run k8s common script to avoid failures like unbound variable when running locally the scripts just like the GHA CI. Fixes #9408 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-03 18:25:00 +00:00
GabyCT	c5c229b330	Merge pull request #9397 from GabyCT/topic/removeconmon versions: Remove conmon information from versions.yaml	2024-04-03 11:14:43 -06:00
GabyCT	12947b1ba6	Merge pull request #9344 from GabyCT/topic/kerneldoc docs: Remove stale kernel information	2024-04-03 11:13:54 -06:00
Dan Mihai	07c23a05f2	Merge pull request #9385 from microsoft/saulparedes/add_genpolicy_yaml_params gha: add GENPOLICY_PULL_METHOD	2024-04-03 09:20:16 -07:00
Lukáš Doktor	b8382cea88	ci.ocp: Increase the MCP update time updating the machine config takes even longer than 1200s, use 60m to be sure everything is updated. Fixes: #9338 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-04-03 15:01:29 +02:00
Alex Lyn	935a1a3b40	runtime-rs: refactor decrease_attach_count with do_decrease_count Try to reduce duplicated code in decrease_attach_count with public new function do_decrease_count. Fixes: #8738 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-04-03 17:19:19 +08:00
Alex Lyn	4f0fab938d	runtime-rs: refactor increase_attach_count with do_increase_count Try to reduce duplicated code in increase_attach_count with public new function do_increase_count. Fixes: #8738 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-04-03 17:19:19 +08:00
Alex Lyn	fff64f1c3e	runtime-rs: introduce dedicated function do_decrease_count Introduce a dedicated public function do_decrease_count to reduce duplicated code in drivers' decrease_attach_count. Fixes: #8738 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-04-03 17:19:08 +08:00
Alex Lyn	5750faaf31	runtime-rs: introduce dedicated function do_increase_count Since there are many implementations of reference counting in the drivers, all of which have the same implementation, we should try to reduce such duplicated code as much as possible. Therefore, a new function is introduced to solve the problem of duplicated code. Fixes: #8738 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-04-03 17:09:17 +08:00
Dan Mihai	f800bd86f6	tests: k8s-sandbox-vcpus-allocation.bats policy Use the "allow all" policy for k8s-sandbox-vcpus-allocation.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:01:33 +00:00
Dan Mihai	4211d93b87	tests: k8s-nginx-connectivity.bats policy Use the "allow all" policy for k8s-nginx-connectivity.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:01:26 +00:00
Dan Mihai	5dcf64ef34	tests: k8s-volume.bats allow all policy Use the "allow all" policy for k8s-volume.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:01:18 +00:00
Dan Mihai	04085d8442	tests: k8s-sysctls.bats allow all policy Use the "allow all" policy for k8s-sysctls.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:01:10 +00:00
Dan Mihai	839993f245	tests: k8s-security-context.bats allow all policy Use the "allow all" policy for k8s-security-context.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:01:03 +00:00
Dan Mihai	02a050b47e	tests: k8s-seccomp.bats allow all policy Use the "allow all" policy for k8s-seccomp.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:56 +00:00
Dan Mihai	543e40b80c	tests: k8s-projected-volume.bats allow all policy Use the "allow all" policy for k8s-projected-volume.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:47 +00:00
Dan Mihai	3f94e2ee1b	tests: k8s-pod-quota.bats allow all policy Use the "allow all" policy for k8s-pod-quota.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:37 +00:00
Dan Mihai	ba23758a42	tests: k8s-optional-empty-secret.bats policy Use the "allow all" policy for k8s-optional-empty-secret.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:30 +00:00
Dan Mihai	e4ff6b1d91	tests: k8s-measured-rootfs.bats allow all policy Use the "allow all" policy for k8s-measured-rootfs.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:23 +00:00
Dan Mihai	2821326a7e	tests: k8s-liveness-probes.bats allow all policy Use the "allow all" policy for k8s-liveness-probes.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:15 +00:00
Dan Mihai	9af3e4cc4a	tests: k8s-inotify.bats allow all policy Use the "allow all" policy for k8s-inotify.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:08 +00:00
Dan Mihai	bd45e948cc	tests: k8s-guest-pull-image.bats policy Use the "allow all" policy for k8s-guest-pull-image.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:00 +00:00
Dan Mihai	be3797ef7c	tests: k8s-footloose.bats allow all policy Use the "allow all" policy for k8s-footloose.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 02:59:50 +00:00
Dan Mihai	18f5e55667	tests: k8s-empty-dirs.bats allow all policy Use the "allow all" policy for k8s-empty-dirs.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 02:59:44 +00:00
Dan Mihai	ef22bd8a2b	tests: k8s: replace run_policy_specific_tests Check from: - k8s-exec-rejected.bats - k8s-policy-set-keys.bats if policy testing is enabled or not, to reduce the complexity of run_kubernetes_tests.sh. After these changes, there are no policy specific commands left in run_kubernetes_tests.sh. add_allow_all_policy_to_yaml() is moving out of run_kubernetes_tests.sh too, but it is not used yet. It will be used in future commits. Fixes: #9395 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 02:59:28 +00:00
Guoqiang Ding	cd0c31e185	qemu: show the thread name when enable the hypervisor.debug option Add debug-threads=on in the name argument if debug enabled. Fixes: #9400 Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>	2024-04-03 10:36:52 +08:00
Saul Paredes	8a92e81f98	gha: add GENPOLICY_PULL_METHOD Add GENPOLICY_PULL_METHOD that will be used to test pulling container images in genpolicy using the oci-distribution crate and/or the containerd interface. GENPOLICY_PULL_METHOD will start being used in a future PR. Fixes: #9384 Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-04-02 19:03:28 -07:00
Gabriela Cervantes	f3957352f0	versions: Remove conmon information from versions.yaml This PR removes conmon information from versions.yaml as this is no longer being used in kata containers repository. Fixes #9396 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-02 16:25:45 +00:00
Dan Mihai	39805822fc	tests: k8s: reduce policy testing complexity Don't add the "allow all" policy to all the test YAML files anymore. After this change, the k8s tests assume that all the Kata CI Guest rootfs image files either: - Don't support Agent Policy at all, or - Include an "allow all" default policy. This reliance/assumption will be addressed in a future commit. Fixes: #9395 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-02 16:18:31 +00:00
Alex Lyn	7795f9c016	Merge pull request #9365 from GabyCT/topic/removerunc versions: Remove runc version information	2024-04-02 09:21:56 +08:00
Alex Lyn	fa8049af6c	Merge pull request #9383 from Apokleos/unified-cgrp-cmdline kata-agent: enabling cgroups-v2 by systemd.unified_cgroup_hierarchy	2024-04-02 09:08:04 +08:00
Alex Lyn	07bfdf4a22	Merge pull request #9275 from Apokleos/swap-hooks-bindmnt kata-agent: Change order of guest hook and bind mount processing	2024-04-02 07:40:10 +08:00
Alex Lyn	c88014834b	kata-agent: enabling cgroups-v2 by systemd.unified_cgroup_hierarchy To have systemd mount cgroups-v2 by default during system boot, we must add the systemd.unified_cgroup_hierarchy=1 parameter to the kernel cmdline, which will be passed by kernel_params in configuration.toml. To enable cgroup-v2, just add systemd.unified_cgroup_hierarchy=true[1] to kernel_params. Fixes: #9336 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-04-01 18:45:12 +08:00
alex.lyn	548f252bc4	runtime-rs: bugfix incorrect use of refcount before vfio attach When a pod has multiple containers, there may be more than 2 attach points. In that case we should not return Err when doing attach ops, but just return Ok. Fixes: #8738 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-04-01 11:28:57 +08:00
Alex Lyn	aa9cd232cd	Merge pull request #9358 from GabyCT/topic/nerdrandom gha: Update journal log names for nerdctl artifacts	2024-04-01 09:50:16 +08:00
Alex Lyn	dfa8832406	Merge pull request #9345 from c3d/bug/9342-agent-test-errors agent: Fix errors in `make check`	2024-04-01 09:48:44 +08:00
Dan Mihai	3a7dbcfc17	Merge pull request #9367 from microsoft/danmihai1/infinite-io-stream-copy-loop runtime: remove stream copy infinite loop	2024-03-29 09:37:44 -07:00
Dan Mihai	600f9266f3	runtime: remove stream copy infinite loop This reverts commit `1c5693be86`. Avoid apparent infinite loop when ReadStreamRequest is blocked by policy - for some of the pods. When running the k8s-limit-range.bats test with Policy enabled, the Shim + VMM never get terminated on my cluster. Not sure why the sandbox clean-up works better for other tests, but the k8s-limit-range test pod gets stuck in an infinite loop: stdout io stream copy error happens: error = %wrpc error: code = PermissionDenied desc = \"ReadStreamRequest is blocked by policy ... policy check: ReadStreamRequest ... stdout io stream copy error happens: error = %wrpc error: code = PermissionDenied desc = \"ReadStreamRequest is blocked by policy ... policy check: ReadStreamRequest ... Fixes: #9380 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-28 22:43:28 +00:00
James O. D. Hunt	13966f4d1d	docs: kata-manager: Add help for permissions issue The 3.3.0 release installs the `kata-manager` script with overly restrictive permissions (see #9373), so add details to help users handle the situation. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-28 16:22:10 +00:00
James O. D. Hunt	5589e4e291	docs: kata-manager: Update with latest details Now that v3.3.0 has been released, simplify the `kata-manager` documentation. Fixes: #9227. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-28 16:22:10 +00:00
James O. D. Hunt	52fe60c94b	docs: kata-manager: Fix heading levels Add an extra heading indent so that there is only a single top-level heading. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-28 16:21:31 +00:00
Dan Mihai	ebb26edf42	Merge pull request #9347 from microsoft/danmihai1/reduce-exec-test-policy-prints genpolicy: reduce policy debug prints	2024-03-27 15:12:10 -07:00
Gabriela Cervantes	a32418bf32	versions: Remove runc version information This PR removes the runc version information as this is no longer being used in the kata containers scripts. Fixes #9364 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-27 20:32:38 +00:00
Steve Horsman	b3acbe0b7f	Merge pull request #8046 from fitzthum/clean-config runtime: remove unimplemented CoCo configurations	2024-03-27 19:39:48 +00:00
Tobin Feldman-Fitzthum	04d021bd12	packaging: remove SERVICEOFFLOAD option Since we're removing the unused service_offload parameter, don't set it in any of the packaging scripts. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-03-27 12:21:13 -05:00
Tobin Feldman-Fitzthum	9856fe5bea	runtime: remove ServiceOffload parameter Since we no longer use the service_offload configuration, remove the ServiceOffload field from the image struct. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-03-27 12:21:13 -05:00
Tobin Feldman-Fitzthum	a18c7ca307	runtime: remove unimplemented CoCo configurations These experimental options were added 2 years ago in anticipation of features that would be added in CoCo. These do not match the features that were eventually added and will soon be ported to main. Fixes: #8047 Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-03-27 12:21:06 -05:00
Steve Horsman	53fa1fd82d	Merge pull request #9349 from fidencio/topic/ci-k8s-update-cpuid k8s: confidential: Update cpuid to its latest release	2024-03-27 16:57:36 +00:00
Chengyu Zhu	e66a5cb54d	Merge pull request #9332 from ChengyuZhu6/guest-pull-timeout Support to set timeout to pull large image in guest	2024-03-28 00:34:08 +08:00
Christophe de Dinechin	82c4079fd0	agent: Remove useless loop This is the report from `make check`: ``` error: this loop never actually loops --> src/signal.rs:147:9 | 147 | / loop { 148 | | select! { 149 | | _ = handle => { 150 | | println!("INFO: task completed"); ... | 156 | | } 157 | | } | |_________^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#never_loop = note: `#[deny(clippy::never_loop)]` on by default ``` There is only one option: you get something or a timeout. You never retry, so the report is correct. Fixes: #9342 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2024-03-27 17:03:44 +01:00
Christophe de Dinechin	df5c88cdf0	agent: Remove lint error about `.flatten` running forever The lint report is the following: ``` error: `flatten()` will run forever if the iterator repeatedly produces an `Err` --> src/rpc.rs:1754:10 | 1754 | .flatten() | ^^^^^^^^^ help: replace with: `map_while(Result::ok)` | note: this expression returning a `std::io::Lines` may produce an infinite number of `Err` in case of a read error --> src/rpc.rs:1752:5 | 1752 | / reader 1753 | | .lines() | |________________^ = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#lines_filter_map_ok = note: `-D clippy::lines-filter-map-ok` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::lines_filter_map_ok)]` ``` This commit simply applies the suggestion. Fixes: #9342 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2024-03-27 17:03:44 +01:00
Christophe de Dinechin	bfb55312be	agent: Fix `.enumerate` errors during `make check` Running `make check` in the `src/agent` directory gives: ``` error: you seem to use `.enumerate()` and immediately discard the index --> rustjail/src/mount.rs:572:27 | 572 | for (_index, line) in reader.lines().enumerate() { | ^^^^^^^^^^^^^^^^^^^^^^^^^^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unused_enumerate_index = note: `-D clippy::unused-enumerate-index` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::unused_enumerate_index)]` help: remove the `.enumerate()` call | 572 | for line in reader.lines() { | ~~~~ ~~~~~~~~~~~~~~ Checking tokio-native-tls v0.3.1 Checking hyper-tls v0.5.0 Checking reqwest v0.11.18 error: could not compile `rustjail` (lib) due to 1 previous error warning: build failed, waiting for other jobs to finish... make: *** [../../utils.mk:177: standard_rust_check] Error 101 ``` Fixes: #9342 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2024-03-27 17:03:44 +01:00
Greg Kurz	e1068da1a0	Merge pull request #9326 from gkurz/draft-release Only tag and publish the release when it is fully ready	2024-03-27 15:59:59 +01:00
ChengyuZhu6	c50d3ebacc	tests:k8s: Add a test to pull large images in the guest Add a test to pull large images in the guest. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 21:58:44 +08:00
ChengyuZhu6	8551ee9533	how-to: add createcontainer timeout to sandbox config documentation add createcontainer timeout annotation to sandbox config documentation. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 21:58:44 +08:00
ChengyuZhu6	c2dc13ebaa	runtime: support to configure CreateContainer Timeout in configurations support to configure CreateContainerRequestTimeout in the configurations. e.g.: [runtime] ... create_container_timeout = 300 Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout. In essence, the timeout used for guest pull=runtime-request-timeout<create_container_timeout?runtime-request-timeout:create_container_timeout. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 21:58:41 +08:00
Chengyu Zhu	87fc17d4d2	Merge pull request #9341 from ChengyuZhu6/guest-pull-doc docs: Add documents for kata guest image management	2024-03-27 21:20:22 +08:00
ChengyuZhu6	95b2f7f129	how-to: Add a document for kata guest image management usage Add a document for kata guest image management usage. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 20:09:37 +08:00
Greg Kurz	693c9487d4	docs: Adjust release documentation Most of the content of `docs/Stable-Branch-Strategy.md` got de-facto deprecated by the re-design of the release process described in #9064. Remove this file and all its references in the repo. The `## Versioning` section has some useful information though. It is moved to `docs/Release-Process.md`. The documentation of the `PATCH` field is adapted according to new workflow. Fixes #9064 - part VI Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-27 12:41:48 +01:00
Steve Horsman	45aba769c0	Merge pull request #9346 from cmaf/ci-remove-repo-docs Remove additional links to tests directory	2024-03-27 11:13:32 +00:00
Steve Horsman	a1a615a7c8	Merge pull request #9356 from stevenhorsman/agent-opa-ppc64le-s390x workflows: Build agent-opa for more archs	2024-03-27 08:53:28 +00:00
ChengyuZhu6	2224f6d63f	runtime: support to configure CreateContainer timeout in annotation Support to configure CreateContainerRequestTimeout in the annotations. e.g.: annotations: "io.katacontainers.config.runtime.create_container_timeout": "300" Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout. In essence, the timeout used for guest pull=runtime-request-timeout<create_container_timeout?runtime-request-timeout:create_container_timeout. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 15:44:29 +08:00
ChengyuZhu6	39bd462431	runtime: support to set timeout for CreateContainerRequest In the situation to pull images in the guest #8484, it’s important to account for pulling large images. Presently, the image pull process in the guest hinges on `CreateContainerRequest`, which defaults to a 60-second timeout. However, this duration may prove insufficient for pulling larger images, such as those containing AI models. Consequently, we must devise a method to extend the timeout period for large image pull. Fixes: #8141 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 15:44:29 +08:00
Gabriela Cervantes	a997e282be	gha: Update journal log names for nerdctl artifacts This PR updates the journal log name for nerdctl artifacts to make sure that we have different names in case we add a parallel GHA job. Fixes #9357 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-26 20:03:54 +00:00
GabyCT	c163d9f114	Merge pull request #9329 from GabyCT/topic/seun scripts: Fix unbound variables in k8s setup script	2024-03-26 11:19:33 -06:00
stevenhorsman	9aa675abb9	workflows: Build agent-opa for more archs Since https://github.com/kata-containers/kata-containers/pull/7769, we support building the OPA binary into the ppc64le and s390x arch versions of the rootfs, so build the policy enabled agent to match for those architectures too. Fixes: #9355 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-03-26 17:02:14 +00:00
Lukáš Doktor	a671b3fc6e	tests: Use full svc address to check kbs service the service might not listen on the default port, use the full service address to ensure we are talking to the right resource. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-26 16:59:02 +01:00
Lukáš Doktor	6b0eaca4d4	tests: Add support for nodeport ingress for the kbs setup this can be used on kcli or other systems where cluster nodes are accessible from all places where the tests are running. Fixes: #9272 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-26 16:59:00 +01:00
Greg Kurz	5009fabde4	release: Keep it draft until all artifacts have been published The automated release workflow starts with the creation of the release in GitHub. This is followed by the build and upload of the various artifacts, which can be very long (like hours). During this period, the release appears to be fully available in https://github.com/kata-containers/kata-containers/ even though it lacks all the artifacts. This might be confusing for users or automation consuming the release. Create the release as draft and clear the draft flag when all jobs are done. This ensures that the release will only be tagged and made public when it is fully usable. If some job fails because of network timeout or any other transient error, the correct action is to restart the failed jobs until they eventually all succeed. This is by far the quicker path to complete the release process. If the workflow is canceled for some reason, the draft release is left behind. A new run of the workflow will create a brand new draft release with the same name (not an issue with GitHub). The draft release from the previous run should be manually deleted. This step won't be automated as it looks safer to leave the decision to a human. [1] https://github.com/kata-containers/kata-containers/releases Fixes #9064 - part VI Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-26 14:48:05 +01:00
Pavel Mores	4c72b02e53	runtime-rs: remove the now-unused code of NetDevice The remaining code in network.rs was mostly moved to utils.rs which seems better home for these utility functions anyway (and a closely related function open_named_tuntap() has already lived there). ToString implementation for Address was removed after some consideration. Address should probably ideally implement Display (as per RFC 565) which would also supply a ToString implementation, however it implements Debug instead, probably to enable automatic implementation of Debug for anything that Address is a member of, if for no other reason. Rather than having two identical functions this commit simply switches to using the Debug implementation for printing Address on qemu command line. Fixes #9352 Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:52:40 +01:00
Pavel Mores	c94e55d45a	runtime-rs: make QemuCmdLine own vsock file descriptor Make file descriptors to be passed to qemu owned by QemuCmdLine. See commit 52958f17cd for more explanation. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	0cf0e923fc	runtime-rs: refactor QemuCmdLine::add_network_device() signature add_network_device() doesn't need to be passed NetworkInfo since it already has access to the full HypervisorConfig. Also, one of the goals of QemuCmdLine interface's design is to avoid coupling between QemuCmdLine and the hypervisor crate's device module, if at all possible. That's why add_network_device() shouldn't take device module's NetworkConfig but just parts that are useful in add_network_device()'s implementation. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	a4f033f864	runtime-rs: add should_disable_modern() utility function is_running_in_vm() is enough to figure out whether to disable_modern but it's clumsy and verbose to use. should_disable_modern() streamlines the usage by encapsulating the verbosity. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	12e40ede97	runtime-rs: reimplement add_network_device() using Netdev & DeviceVirtioNet This commit replaces the existing NetDevice-based implementation with one using Netdev and DeviceVirtioNet. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	0a57e2bb32	runtime-rs: refactor NetDevice in qemu driver In keeping with architecture of QemuCmdLine implementation we split the functionality into two objects: Netdev to represent and generate the -netdev part and DeviceVirtioNet for the -device virtio-net-<transport> part. This change is a pure refactor, existing functionality does not change. However, we do remove some stub generalizations and govmm-isms, notably: - we remove the NetDev enum since the only network interface types that kata seems to use with qemu are tuntap and macvtap, both of which are implemented by the same -netdev tap - enum DeviceDriver is also left out since it doesn't seem reasonable to try to represent VFIO NICs (which are completely different from virtio-net ones) with the same struct as virtio-net - we also remove VirtioTransport because there's no use for it so far, but with the expectation that it will be added soon. We also make struct Netdev the owner of any vhost-net and queue file descriptors so that their lifetime is tied ultimately to the lifetime of QemuCmdLine automatically, instead of returning the fds to the caller and forcing it to achieve the equivalent functionality but manually. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	7f23734172	runtime-rs: reduce generate_netdev_fds() dependencies generate_netdev_fds() takes NetworkConfig from which it however only needs a host-side network device name. This commit makes it take the device name directly, making the function useful to callers who don't have the whole NetworkConfig but do have the requisite device name. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	d4ac45d840	runtime-rs: refactor clear_fd_flags() The idea of this function is to make sure O_CLOEXEC is not set on file descriptors that should be inherited by a child (=hypervisor) process. The approach so far is however rather heavy-handed - clearing all flags is unjustifiably aggressive for a low-level function with no knowledge of context whatsoever. This commit refactors the function so that it only does what's expected and renames it accordingly. It also clarifies some of its call sites. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:14 +01:00
Fabiano Fidêncio	cfe75f9422	k8s: confidential: Update cpuid to its latest release Since v2.2.6 it can detect TDX guests on Azure, so let's bump it even if Azure peer-pods are not currently used as part of our CI. Fixes: #9348 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-26 10:21:12 +01:00
Chengyu Zhu	d16971e37e	Merge pull request #9325 from ChengyuZhu6/image_service agent:image: Refactor code to improve memory efficiency of image service	2024-03-26 10:38:37 +08:00
Dan Mihai	6c72c29535	genpolicy: reduce policy debug prints Kata CI has full debug output enabled for the cbl-mariner k8s tests, and the test AKS node is relatively slow. So debug prints from policy are expensive during CI. Fixes: #9296 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-26 02:21:26 +00:00
Alex Lyn	cec943fc26	Merge pull request #9244 from Apokleos/dgb-gpu runtime-rs/dragonball: add support building kernel with upcall and GPU hotplug	2024-03-26 08:53:54 +08:00
Chelsea Mafrica	4e3deb5a3b	tools: Fix path for installing yq in packaging script The lib.sh script uses the right directory but the wrong path for the script that installs yq; fix it. Fixes #9165 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2024-03-25 15:09:52 -07:00
Chelsea Mafrica	cfb977625e	docs: Remove links to tests repo Remove links to tests repo and update with corresponding location in the current repo. Fixes #9165 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2024-03-25 15:09:52 -07:00
Chelsea Mafrica	d69514766e	src: Remove references to files in tests repo Change scripts and source that uses files in the tests repo to use the corresponding file in the current repo. Fixes #9165 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2024-03-25 15:09:52 -07:00
Gabriela Cervantes	ddef2be4f1	docs: Remove stale kernel information This PR removes stale kernel information from the README document. Fixes #9343 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-25 15:57:00 +00:00
Greg Kurz	e9e94d2dbd	release: Give a pretty name to all steps For a prettier rendering in the web UI. Fixes #9064 - part VI Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-25 15:50:35 +01:00
Greg Kurz	dce6ea57b2	release: Simplify the `create-new-release` action of `release.sh` Now that the version is an invariant for the entire workflow, it isn't required to obtain it with an environment variable. Just rely on the content of the `VERSION` file like other actions. Fixes #9064 - part VI Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-25 15:50:35 +01:00
Alex Lyn	5c54315a87	dragonball: fix CI failure due to poor UT adaptation. Fixes: #9144 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-25 20:25:27 +08:00
Alex Lyn	079d894496	kernel: bump version in kata config version Fixes: #9140 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-25 20:25:27 +08:00
Alex Lyn	070c3fa657	docs: add doc about building kernel with upcall and GPU hotplug We need some docs about how to build a guest kernel to support both Upcall and Nvidia GPU Passthrough (hotplug) at the same time. This patch adds such a document to help users build a guest kernel that supports both Upcall and Nvidia GPU hotplug/unplug. Fixes: #9140 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-25 20:25:17 +08:00
ChengyuZhu6	06b9935402	docs: Add a document for kata guest image management design Add a document for kata guest image management design. Related feature: #8484 Fixes: #9225 -- part I Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-25 18:17:23 +08:00
Chengyu Zhu	4029d154ba	Merge pull request #9313 from ChengyuZhu6/rtest agent: Refactor unit tests to leverage rstest for parameterization	2024-03-25 10:31:45 +08:00
Alex Lyn	bc309b9865	kernel: add CONFIG_CRYPTO_ECDSA into whitelist CONFIG_CRYPTO_ECDSA is not supported in older kernels such as 5.10.x, which may break the build if we build such a kernel with NVIDIA GPU support. So this patch adds CONFIG_CRYPTO_ECDSA to whitelist.conf to avoid breaking the guest kernel build with NVIDIA GPU. Fixes: #9140 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-25 08:05:31 +08:00
ChengyuZhu6	f47408fdf4	agent:image: Refactor code to improve memory efficiency of image service Currently, `.lock().await.clone()` results in `Option<ImageService>` being duplicated in memory with each call to `singleton()`. Consequently, if kata-agent receives numerous image pulling requests simultaneously, it will lead to the allocation of multiple `Option<ImageService>` instances in memory, thereby consuming additional memory resources. In image.rs, we introduce two public functions: `merge_bundle_oci()` and `init_image_service()`. These functions will encapsulate the operations on `IMAGE_SERVICE`, ensuring that its internal details remain hidden from external modules such as `rpc.rs`. Fixes: #9225 -- part II Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com> Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-25 07:46:50 +08:00
ChengyuZhu6	7a49ec1c80	agent:util: Refactor the unit tests to leverage rstest Refactor the unit tests in util.rs to leverage rstest for parameterization. Fixes: #9314 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-23 10:49:53 +08:00
ChengyuZhu6	2df2b4d30d	agent:namespace: Refactor unit tests to leverage rstest Refactor the unit tests in `namespace.rs` to leverage rstest for parameterization. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-23 10:49:48 +08:00
Hyounggyu Choi	d915a79e2d	Merge pull request #9280 from BbolroC/enable-qemu-on-s390x runtime-rs: Enable qemu on s390x	2024-03-22 23:58:42 +01:00
Fabiano Fidêncio	25cd28a32b	Merge pull request #9337 from fidencio/topic/bump-nydus-snapshotter versions: Update nydus-snapshotter to v0.13.11	2024-03-22 22:18:18 +01:00
Hyounggyu Choi	81aaa34bd6	runtime-rs: Add DeviceVirtioSerial and DeviceVirtconsole It is observed that virtiofsd exits immediately on s390x if there is no attached console devices. This commit resolves the issue by migrating `appendConsole()` from runtime and being triggered in `start_vm()`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-22 19:27:13 +01:00
Hyounggyu Choi	2cfe745efb	runtime-rs: Enable memory backend option for Machine for s390x For s390x, it requires an additional option `memory-backend` for `-machine`. Otherwise, virtiofsd exits with HandleRequest(InvalidParam). This commit is to add a field `memory_backend` to `struct Machine` and turn it on for s390x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-22 19:27:13 +01:00
Hyounggyu Choi	9bcfaad625	runtime-rs: Add ccw block device for rootfs Like nvdimm for x86_64, a block device for s390x should be treated differently with `virtio-blk-ccw`. This is to generate a QEMU command line parameter for a block device by using `-blockdev` and `-device` if the `vm_rootfs_driver` is set to `virtio-blk-ccw`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-22 19:27:13 +01:00
David Esparza	3e40051634	Merge pull request #9255 from dborquez/thread_pid_function runtime-rs: ch: Implement full thread/tid/pid handling	2024-03-22 10:05:02 -06:00
Fabiano Fidêncio	d0949759ec	versions: Update nydus-snapshotter to v0.13.11 This version brings in a fix for cleaning up k3s/rke2 environments, which directly impacts the TDX machine that's part of our CI. Fixes: #9318 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-22 14:56:18 +01:00
Greg Kurz	e4f6a778a8	Merge pull request #9321 from fidencio/topic/releases-follow-up-VI Revert "release: Skip --generate-notes for this release"	2024-03-22 10:44:40 +01:00
GabyCT	a67382fd00	Merge pull request #9324 from GabyCT/topic/udevguide docs: Update libseccomp instructions in Developers Guide	2024-03-21 14:25:41 -06:00
Gabriela Cervantes	d54cdd3f0c	scripts: Fix unbound variables in k8s setup script This PR fixes the unbound variables error when trying to run the setup script locally in order to avoid errors. Fixes #9328 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-21 19:10:16 +00:00
Chengyu Zhu	9a4cb96262	Merge pull request #9312 from ChengyuZhu6/show-feature agent: Add guest-pull to the list of agent features in announce()	2024-03-21 23:35:29 +08:00
David Esparza	b498e140a1	runtime-rs: ch: Implement full thread/tid/pid handling Add in the full details once cloud-hypervisor/cloud-hypervisor#6103 has been implemented, and the feature is available in a Cloud Hypervisor release. Fixes: #8799 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-03-21 08:24:53 -06:00
James O. D. Hunt	1e684f5848	Merge pull request #9259 from jodh-intel/tests-add-static-checks-announce tests: static checker: Add announce message	2024-03-21 13:59:36 +00:00
ChengyuZhu6	754399d909	agent: Add guest-pull to the list of agent features in announce() Add guest-pull to the list of agent features in announce(). Fixes: #9225 -- part IV Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-21 20:01:52 +08:00
Xuewei Niu	9c4f9dcb35	Merge pull request #9311 from studychao/chao/fix_mtrr Dragonballl: introduce MTRR regs support	2024-03-21 17:24:27 +08:00
Hyounggyu Choi	9b2c08935b	runtime-rs: Pass different device argument based on bus type Currently, `*-pci` is used as an argument for the device config. It is not true for a case where a different type of bus is used. s390x uses `ccw`. This commit is to make it flexible to generate the device argument based on the bus type. A structure `DeviceVhostUserFsPci` and `VhostVsockPci` is renamed to `DeviceVhostUserFs` and `VhostVsock` because the structure name is not bound to a certain bus type any more. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-21 09:25:37 +01:00
GabyCT	03f3d3491d	Merge pull request #9265 from GabyCT/topic/fixnydusclean gha: Fix nydus namespace clean up	2024-03-20 16:17:38 -06:00
GabyCT	702a8a440f	Merge pull request #9309 from GabyCT/topic/fixlograndom gha: Update journal log names for kubernetes artifacts	2024-03-20 16:17:17 -06:00
Gabriela Cervantes	05f4dc1902	docs: Update libseccomp instructions in Developers Guide This PR updates the libseccomp instructions in the Developers Guide. Fixes #9323 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-20 20:44:24 +00:00
GabyCT	163103d59e	Merge pull request #9307 from GabyCT/topic/fixdocreq docs: Update links in the Documentation Requirements document	2024-03-20 14:29:04 -06:00
Gabriela Cervantes	af18221ab7	docs: Update links in the Documentation Requirements document This PR updates the url links in the Documentation Requirements document. Fixes #9306 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-20 15:45:49 +00:00
Gabriela Cervantes	a855ecf21b	gha: Update journal log names for kubernetes artifacts This PR updates the journal log names for kubernetes artifacts in order to make sure that we have different names when we are running parallel GHA jobs. Fixes #9308 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-20 15:44:20 +00:00
Gabriela Cervantes	4fb8f8705f	gha: Fix nydus namespace clean up This PR terminates the nydus namespace to avoid the error that the flag needs an argument. Fixes #9264 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-20 15:41:39 +00:00
Fabiano Fidêncio	0278fc8a91	Revert "release: Skip --generate-notes for this release" This reverts commit `0fa59ff94b`, as now we'll be able to use the `--generate-notes`, hopefully, without blowing the allowed limit. Fixes: #9064 - part VI Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-20 15:48:22 +01:00
James O. D. Hunt	577abd014b	tests: static checker: Add announce message Added an announcement message to the `static-checks.sh` script. It runs platform / architecture specific code so it would be useful to display details of the platform the checker is running on to help with debugging. Fixes: #9258. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-20 13:41:26 +00:00
James O. D. Hunt	4af4a8ad2b	tests: static checker: Create setup function Move some of the common code into a setup function. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-20 11:58:28 +00:00
Fabiano Fidêncio	1aec4f737a	Merge pull request #9316 from fidencio/topic/releases-follow-up-V release: Skip --generate-notes for this release	2024-03-20 10:50:14 +01:00
Fabiano Fidêncio	0fa59ff94b	release: Skip --generate-notes for this release This release is a special case, as we've slacked for 6 months and the release content is way too long ... long enough to exceed the allowed limit for the release notes. With this in mind we'll just remove the `--generate-notes` for now, and then revert this commit as soon as the release is out, as releases should be happening every month and, ideally, we won't reach this situation never ever again. Fixes: #9064 - part V Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-20 10:32:11 +01:00
Hyounggyu Choi	7b3d1adb8c	libs: Bump sysinfo to v0.30.5 It has been observed that the runtime stops running around `sysinfo::total_memory()` while adjusting a config on s390x. This is to update the crate to the latest version which happened to resolve the issue. (No explicit release note for this) Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-20 09:27:13 +01:00
Chao Wu	5a4b858ece	Dragonballl: introduce MTRR regs support MTRR, or Memory-Type Range Registers are a group of x86 MSRs providing a way to control access and cache ability of physical memory regions. During our test in runtime-rs + Dragonball, we found out that this register support is a must for passthrough GPU running CUDA application, GPU needs that information to properly use GPU memory. fixes: #9310 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2024-03-20 14:18:16 +08:00
Fabiano Fidêncio	19eb45a27d	Merge pull request #8484 from ChengyuZhu6/guest-pull Merge basic guest pull image code to main	2024-03-19 23:15:39 +01:00
Hyounggyu Choi	6e782826c7	Merge pull request #9305 from BbolroC/handle-comment-for-skipped-tests CI|k8s: Handle skipped tests with a comment for filter_out_per_arch	2024-03-19 22:54:03 +01:00
Fabiano Fidêncio	8911d3565f	gha: tests: Filter out confidential tests for aarch64 / ppc64le Those two architectures are not TEE capable, thus we can just skip running those tests there. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-19 18:06:01 +01:00
Fabiano Fidêncio	d14e9802b6	gha: k8s: Set {https,no}_proxy correctly for TDX This is needed as the TDX machine is hosted inside Intel and relies on proxies in order to connect to the external world. Not having those set causes issues when pulling the image inside the guest. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-19 18:06:00 +01:00
Fabiano Fidêncio	291b14bfb5	kata-deploy: Add the ability to set {https,no}_proxy if needed Let's make sure those two proxy settings are respected, as those will be widely used when pulling the image inside the guest on the Confidential Containers case. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	5bad18f9c9	agent: set https_proxy/no_proxy before initializing agent policy When the https_proxy/no_proxy settings are configured alongside agent-policy enabled, the process of pulling image in the guest will hang. This issue could stem from the instantiation of `reqwest`’s HTTP client at the time of agent-policy initialization, potentially impacting the effectiveness of the proxy settings during image guest pulling. Given that both functionalities use `reqwest`, it is advisable to set https_proxy/no_proxy prior to the initialization of agent-policy. Fixes: #9212 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	db9f18029c	README: Add https_proxy and no_proxy to agent README Add agent.https_proxy and agent.no_proxy to the table in the agent README. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	e23737a103	gha: refactor code with yq for better clarity refactor code with yq for better clarity: Before: ```bash yq write -i "${tools_dir}/packaging/kata-deploy/kata-deploy/base/kata-deploy.yaml" 'spec.template.spec.containers[0].env[7].value' "${KATA_HYPERVISOR}:${SNAPSHOTTER}" ``` After: ```bash yq write -i \ "${tools_dir}/packaging/kata-deploy/kata-deploy/base/kata-deploy.yaml" \ 'spec.template.spec.containers[0].env[7].value' \ "${KATA_HYPERVISOR}:${SNAPSHOTTER}" ``` Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	2c0bc8855b	tests: Make sure to install yq before using it Make sure to install yq before using it to modify YAML files. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	c52b356482	tests: add guest pull image test Add a test case of pulling image inside the guest for confidential containers. Signed-off-by: Da Li Liu <liudali@cn.ibm.com> Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com> Co-authored-by: Megan Wright <Megan.Wright@ibm.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	e8c4effc07	tests: refactor the check for hypervisor to a function Extract two reusable functions for confidential tests in confidential_common.sh - check_hypervisor_for_confidential_tests: verifies if the input hypervisor supports confidential tests. - confidential_setup: performs the common setup for confidential tests. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Co-authored-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	6e5e4e55d0	rootfs: add ca file to guest rootfs To access the URL, the component to pull image in the guest needs to send a request to the remote. Therefore, we need to add CA to the rootfs. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	8724d7deeb	packaging: Enable to build agent with PULL_TYPE feature Enable to build kata-agent with PULL_TYPE feature. We build kata-agent with guest-pull feature by default, with PULL_TYPE set to default. This doesn't affect how kata shares images by virtio-fs. The snapshotter controls the image pulling in the guest. Only the nydus snapshotter with proxy mode can activate this feature. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	cd6a84cfc5	kata-deploy: Setting up snapshotters per runtime handler Setting up snapshotters per runtime handler as the commit (`6cc6ca5a7f`) described. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:05:59 +01:00
ChengyuZhu6	ba242b0198	runtime: support different cri container type check Support handling the image-guest-pull block volume from different CRIs, including cri-o and containerd. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:05:59 +01:00
ChengyuZhu6	874d83b510	agent/image: Use guest provided pause image By default the pause image and runtime config are provided by the host side. This is a potential security risk if the host configures a malicious pause image, so we use the pause image packaged in the rootfs instead. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Arron Wang <arron.wang@intel.com> Co-authored-by: Julien Ropé <jrope@redhat.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com>	2024-03-19 18:05:59 +01:00
ChengyuZhu6	c269b9e8c6	agent: Add guest-pull feature for kata-agent Add "guest-pull" feature option to determine that the related dependencies would be compiled if the feature is enabled. By default, agent would be built with default-pull feature, which would support all pull types, including sharing images by virtio-fs and pulling images in the guest. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:05:59 +01:00
Aurélien	192250c52e	Merge pull request #9299 from sprt/sprt/mariner-normal-tests ci: aks: also run tests in normal instance for Mariner	2024-03-19 11:34:20 -05:00
ChengyuZhu6	965da9bc9b	runtime: support to pass image information to guest by KataVirtualVolume support to pass image information to guest by KataVirtualVolumeImageGuestPullType in KataVirtualVolume, which will be used to pull image on the guest. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 17:22:36 +01:00
ChengyuZhu6	cfd14784a0	agent: Introduce ImagePullHandler to support IMAGE_GUEST_PULL volume As we do not employ a forked containerd in confidential-containers, we utilize the KataVirtualVolume, which stores the image information as an integral part of `CreateContainer`. Within this process, we store the image information in rootfs.storage and pass this image url through `CreateContainerRequest`. This approach distinguishes itself from the use of `PullImageRequest`, as rootfs.storage is already set and initialized at this stage. To maintain clarity and avoid any need for modification to the `OverlayfsHandler`, we introduce the `ImagePullHandler`. This dedicated handler is responsible for orchestrating the image-pulling logic within the guest environment. This logic encompasses tasks such as calling the image-rs to download and unpack the image into `/run/kata-containers/{container_id}/images`, followed by a bind mount to `/run/kata-containers/{container_id}`. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 17:22:36 +01:00
ChengyuZhu6	462051b067	agent/image: merge container spec for images pulled inside guest When being passed an image name through a container annotation, merge its corresponding bundle OCI specification and process into the passed container creation one. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Arron Wang <arron.wang@intel.com> Co-authored-by: Jiang Liu <gerry@linux.alibaba.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: wllenyj <wllenyj@linux.alibaba.com> Co-authored-by: jordan9500 <jordan.jackson@ibm.com>	2024-03-19 17:22:36 +01:00
ChengyuZhu6	cec1916196	agent: Support https_proxy/no_proxy config for image download in guest Containerd supports setting a proxy for image downloads through an environment variable. For the CC stack, image download is offloaded to the kata agent, so we need to support a similar feature. Currently we add https_proxy and no_proxy; http_proxy is not added since it is insecure. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Arron Wang <arron.wang@intel.com>	2024-03-19 17:22:36 +01:00
ChengyuZhu6	9cddd5813c	agent/image: Enable image-rs crate to pull image inside guest With image-rs pull_image API, the downloaded container image layers will be stored at IMAGE_RS_WORK_DIR, and the generated bundle dir with rootfs and config.json will be saved under CONTAINER_BASE/cid directory. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Arron Wang <arron.wang@intel.com> Co-authored-by: Jiang Liu <gerry@linux.alibaba.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>	2024-03-19 17:22:36 +01:00
ChengyuZhu6	2b3a00f848	agent: export the image service singleton instance Export the image service singleton instance. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Jiang Liu <gerry@linux.alibaba.com> Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>	2024-03-19 17:22:36 +01:00
ChengyuZhu6	1f1ca6187d	agent: Introduce ImageService Introduce structure ImageService, which will be used to pull images inside the guest. Fixes: #8103 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> co-authored-by: wllenyj <wllenyj@linux.alibaba.com> co-authored-by: stevenhorsman <steven@uk.ibm.com>	2024-03-19 17:22:33 +01:00
Hyounggyu Choi	b381743dd5	CI|k8s: Handle skipped tests with a comment for filter_out_per_arch This commit updates `filter_k8s_test.sh` to handle skipped tests that include comments. In addition to the existing parameter expansion, the following expansions have been added: - Removal of a comment - Stripping of trailing spaces Fixes: #9304 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-19 17:21:25 +01:00
Chelsea Mafrica	42dfe0e8d1	Merge pull request #9286 from jodh-intel/agent-show-enabled-features agent: Show features enabled at build time	2024-03-19 08:54:49 -07:00
Wainer Moschetta	e6501aa4ad	Merge pull request #9229 from ldoktor/ocp-ci ocp.ci: Various fixes and improvements to the OCP pipeline	2024-03-19 11:13:01 -03:00
James O. D. Hunt	46aec0f15a	Merge pull request #9293 from jodh-intel/kata-manager-fix-containerd-for-docker kata-manager: Fix Docker install	2024-03-19 10:06:44 +00:00
Fabiano Fidêncio	e0a6b6449f	Merge pull request #9302 from BbolroC/fix-permission-issue-on-s390x-runners gha: Place pre-action on s390x runner for kata-deploy during release	2024-03-19 10:42:23 +01:00
Hyounggyu Choi	f2bc819644	gha: Place pre-action on s390x runner for kata-deploy during release This is to place a pre-action step for the kata-deploy job in order to clean up the github workspace directory before checking out the repo. Fixes: #9301 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-19 10:18:38 +01:00
Alex Lyn	7af2df408e	Merge pull request #9295 from likebreath/0318/fix_clh_default_netconfig runtime-rs: ch: Provide valid default value for NetConfig	2024-03-19 15:17:18 +08:00
Xuewei Niu	99d0e5fff8	Merge pull request #9270 from zvonkok/kata-agent-bind-mount kata-agent: optional bind flag	2024-03-19 10:39:23 +08:00
Aurélien Bombo	71a1be9c57	ci: aks: also run tests in normal instance for Mariner Currently we're only running the small instance tests. This adds the normal instance tests as well. Fixes: #9298 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-03-18 23:33:17 +00:00
Bo Chen	ad4262e86b	runtime-rs: ch: Provide valid default value for NetConfig The current default value of IP `0.0.0.0` with mask `0.0.0.0` will cause ioctl error when being used to create and configure TAP device, with newer version of Cloud Hypervisor [1]. This patch replaces them with valid value that are the same as the Go-lang runtime [2]. [1] https://github.com/cloud-hypervisor/cloud-hypervisor/pull/5924 [2] `e3f7852738/src/runtime/virtcontainers/pkg/cloud-hypervisor/client/model_net_config.go (L40-L57)` Fixes: #9254 Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-03-18 15:47:58 -07:00
Fabiano Fidêncio	e3f7852738	Merge pull request #9289 from fidencio/topic/releases-follow-up-IV releases: Simply the release in order to avoid pushing a commit updating the VERSION file	2024-03-18 17:38:58 +01:00
James O. D. Hunt	a6c3f75872	kata-manager: Fix Docker install Fix the Docker install by removing the second (erroneous) call to `containerd_installed()` in `handle_docker()`. Without this fix, installing using Docker (`-D`) will work iff you already have containerd installed. However, if you do not have containerd installed, the `containerd_installed()` function returns 1, which exits the script as we're running with `set -e`, leaving a broken Docker installation. > Note: containerd is installed via Docker's `get-docker.sh` script. Fixes: #9292. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-18 14:08:35 +00:00
stevenhorsman	0ab8e61a64	release: Remove release type from arch release Now we don't have minor and major releases and we are now generating a new version in the release workflow, we can tidy up the arch specific releases workflows to remove the extra required inputs Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-03-18 12:27:57 +00:00
Greg Kurz	3cfc1b6ba7	releases: Adjust documentation to the new workflow This drops the documentation of the legacy release scripts and adds a quick description of the scripts of the new workflow. It also highlights the bump of the `VERSION` file. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-18 12:57:02 +01:00
Greg Kurz	76c640767e	releases: Drop Makefile It isn't used anymore. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-18 12:54:00 +01:00
Greg Kurz	bfe19e68e8	kata-deploy: Adapt `test-kata.sh` to the new release workflow All releases are now created in the `main` branch following the very same workflow. No need to special case pre-releases. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-18 12:54:00 +01:00
Fabiano Fidêncio	12578f11bc	releases: Assume VERSION has the correct version to be released This is done in order to avoid having to push a commit to the main branch, which is against the defined rules on GitHub. By doing this, we need to educate ourselves to always bump the VERSION file as soon as a release is cut out. As a side effect of this change, we can drop the release-major and release-minor workflows, as those are not needed anymore. Fixes: #9064 - part IV Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-16 13:30:58 +01:00
Fabiano Fidêncio	8ce50269fe	release: Bump the VERSION file to the next release number 3.3.0 it will be. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-16 13:21:27 +01:00
Xuewei Niu	9f512c016e	Merge pull request #9282 from gkurz/runtime-rs-fds-for-qemu runtime-rs: Consolidate the handling of fds passed to QEMU	2024-03-16 10:26:11 +08:00
Greg Kurz	1e526a4769	runtime-rs: Consolidate the handling of fds passed to QEMU File descriptors that are passed to QEMU need some special care. We want them to be closed when the QEMU process is started. But at the same time, it is required that the associated Rust File structures, either coming from the `std::fs` or the `tokio::fs` crates, are still in scope when the QEMU process is forked. This is currently achieved by keeping File structures in variables at the outer scope of `start_vm()`. This scheme is currently duplicated, with similar justifications in the corresponding comments. Consolidate all this handling in one place with a more generic explanation. Fixes #9281 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-15 16:14:59 +01:00
James O. D. Hunt	9ef59488d9	agent: Show features enabled at build time The agent now has a number of optional build-time features that can be enabled. Add details of these features to the following areas: - Version output (`kata-agent --version`) - Announce message (so that the details are always added to the journal at agent startup). - The response message returned by the ttRPC `GetGuestDetails()` API. Fixes: #9285. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-15 13:29:21 +00:00
Chelsea Mafrica	2c50d3c393	Merge pull request #9278 from wainersm/github_env_fix tests: fix nounset error with $GITHUB_ENV	2024-03-14 16:39:13 -07:00
Greg Kurz	6a112cc7a5	runtime-rs: Fix missing dependency A previous contribution was merged without running cargo clippy. Fix the dependency now so that it doesn't cause noise in future contributions. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-14 23:19:38 +01:00
Dan Mihai	b3b00e00a6	Merge pull request #9246 from microsoft/danmihai/default-env genpolicy: default env if image doesn't have env	2024-03-14 11:01:43 -07:00
Dan Mihai	6094f1e31d	Merge pull request #9250 from microsoft/danmihai1/k8s-pid-ns2 tests: k8s: k8s-pid-ns.bats auto-generated policy	2024-03-14 10:10:24 -07:00
Zvonko Kaiser	c15e19c806	kata-agent: optional bind flag Fixes: #9269 From https://github.com/opencontainers/runtime-spec/blob/main/config.md#mounts type (string, OPTIONAL) The type of the filesystem to be mounted. bind may be only specified in the oci spec options -> flags update r#type The agent will ignore bind mounts if they are only specified in the OCI spec options and not in the flags. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-03-14 14:42:01 +00:00
Hyounggyu Choi	1dac6b1357	runtime-rs: Configure s390x specific flags for Makefile s390x supports a different machine type `s390-ccw-virtio` and it is not required to configure cpu features by default for the platform. A hypervisor `dragonball` is not supported on s390x so that `DBCMD` is not necessary. `vm-rootfs_driver` should be set to `virtio-blk-ccw`. This commit is to set the architecture-specific flags for Makefile. Fixes: #9158 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-14 13:05:35 +01:00
Wainer dos Santos Moschetta	981f95df55	tests: fix nounset error with $GITHUB_ENV Initialize $GITHUB_ENV to avoid nounset error when running the scripts locally out of Github Actions. Fixed commit `9ba5e3d2a8` Fixes #9217 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-13 14:57:38 -03:00
Dan Mihai	ac27caf1b4	Merge pull request #9248 from microsoft/danmihai1/k8s-exec.bats2 tests: k8s: k8s-exec.bats auto-generated policy	2024-03-13 09:21:12 -07:00
Alex Lyn	2aa3519520	kata-agent: Change order of guest hook and bind mount processing The guest_hook_path item in configuration.toml allows OCI hook scripts to be executed within Kata's guest environment. Traditionally, these guest hook programs are pre-built and included in Kata's guest rootfs image at a fixed location. While setting guest_hook_path = "/usr/share/oci/hooks" in configuration.toml works, it lacks flexibility. Not all guest hooks reside in the path /usr/share/oci/hooks, and users might have custom locations. To address this, a more flexible and configurable approach is proposed that allows users to specify their desired path. This could include using a sandbox bind mount path for hooks specific to that particular container. However, the current implementation of guest hooks and bind mounts in kata-agent has a reversed order of execution compared to the desired behavior. To achieve the intended functionality, we simply need to swap the order of their implementation. Fixes: #9274 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-13 20:30:32 +08:00
Steve Horsman	8f4cbd49d7	Merge pull request #9263 from Amulyam24/gha-fixes gha: ensure that the self hosted runner is in desired state before running the workflow	2024-03-13 10:49:29 +00:00
Zvonko Kaiser	63dff9a9f2	kata-agent: CreateContainer Hook Fixes: #9267 The doc states we have support for all lifecycle hooks. There are still some missing. This is the first issue regarding the CreateContainer hook which is run before pivot_root but after prestart and createruntime Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-03-13 09:24:25 +00:00
Amulyam24	3f4b24be8b	gha: ensure that self hosted runner is prepared before running the workflow This PR ensures that the self hosted runner is prepared by taking necessary actions before running the workflow. The script prepare_runner.sh checks the following: 1. Ensure that containerd/docker is up and running 2. Make sure that the repository workspace is cleaned up and has no conflicts 3. Remove/cleanup any leftover files from the previous runs Fixes: #9262 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-03-13 14:20:10 +05:30
Alex Lyn	410afcc913	Merge pull request #8866 from Apokleos/netdev-qemu-rs runtime-rs: add netdev params to cmdline for qemu-rs.	2024-03-13 13:07:43 +08:00
Dan Mihai	e8c2a45ce0	tests: k8s: k8s-pid-ns.bats auto-generated policy Auto-generate policy for k8s-pid-ns.bats. Fixes: #9249 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-12 22:34:46 +00:00
Lukáš Doktor	46e62eecb1	ci.ocp: Log the full grepped line rather than the expected msg we are grepping for an expected message but it might contain extra bits of information fruitful for later debugging. Let's include it in the output and the full log in case of an error. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 17:03:46 +01:00
Lukáš Doktor	7ff2eb508e	ci.ocp: Increase the mcp update timeout we're hitting this timeout quite often, looks like newer OCP takes longer to reconfigure. Increase the timeout to 1200. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:38:04 +01:00
Lukáš Doktor	cc02329fd1	ci.ocp: Add a cleanup script This script doesn't serve as a complete cleanup, but it can be used as a best-effort cleaner between deploying different versions of kata-containers on the same OCP cluster. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:38:04 +01:00
Lukáš Doktor	b811ee0650	ci.ocp: Allow to override the kata-deploy image sometimes we want to test a different than the latest image (eg. when verifying a PR via ghcr images or when bisecting a failure over older builds). Let's add a KATA_DEPLOY_IMAGE variable for that while keeping the latest image by default. Fixes: #9228 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:38:04 +01:00
Lukáš Doktor	2936503b24	ci.ocp: Always replace the kata-deploy image in OCP pipeline previously we only replaced the image when the previously defined one matched the "old_img". This is good to avoid modifying developers custom changes, but it might lead to hard-to-debug issues when the image stays different. Let's ensure we always replace the image with the one we asked for. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:38:04 +01:00
Lukáš Doktor	6525c94065	ci.ocp: Add a workaround to optionally enable skip_mount_home the latest upstream kata-containers requires the skip_mount_home to be enabled, which is default on OCP 4.14+ but disabled on OCP 4.13-. Let's use a "WORKAROUND_9206_CRIO" (called by kata-containers GH issue) variable to allow users to enable this treatment when needed. Related to: #9206 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:38:04 +01:00
Lukáš Doktor	739d627b4e	ci.ocp: Turn selinux relabel failures into warnings Instead of failing the pipeline let's proceed with an error message that selinux setup failed so, in case of a later failure, we know what might have caused it while keeping the coverage in case of a false setup issue. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:38:04 +01:00
Lukáš Doktor	76c452d4e0	ci.ocp: Wait for all pods to finish the work previously we only waited for a random pod to finish the selinux relabel, which could be error-prone. Let's wait for all of the pods to contain the expected message. Increase the timeout to 120s as some pods might take a little bit longer to finish. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:34:56 +01:00
Lukáš Doktor	f7febd07a0	ci.ocp: Allow to re-apply the selinux workaround in case we re-apply the selinux workaround or if user had already existing similar rule the relabel_selinux was failing. Let's allow it to modify the existing rules as well to avoid such issues. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:02:21 +01:00
Lukáš Doktor	fbbea68f1f	ci.ocp: Ignore selinux setup on non-selinux cluster improve our selinux workaround to work well on non-selinux clusters. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:02:20 +01:00
Alex Lyn	e2ae8ba79b	runtime-rs: add network device into Qemu's cmdline It will open tuntap device and vhost-net device and store device files. Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-12 22:28:54 +08:00
Alex Lyn	d3bca4597e	runtime-rs: add open_named_tuntap to open a named tuntap device. The open_named_tuntap function is designed as a public function to open a tuntap device with the specified name. However, in order to reference existing methods in dbs_utils, we still need to keep the reference "path = "../../../dragonball/src/dbs_utils" in dependencies and cannot hide it. Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-12 22:26:32 +08:00
Alex Lyn	005b333976	runtime-rs: add network helpers and impl ToQemuParams Add network helpers and impl ToQemuParams trait to build netdev params which are put into cmdline for Qemu VM running. Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-12 22:25:39 +08:00
Alex Lyn	63786934f4	runtime-rs: set network namespace for qemu process and netdev. We need to ensure that add_network_device happens in the netns, and to move the qemu process into the netns, which keeps the qemu process running in this net namespace. Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-12 22:21:43 +08:00
Alex Lyn	69a5e5b955	runtime-rs: add network device handler in start_vm. Add network device handler in start_vm, which is specifically for a Qemu VM running with added net params on the command line. Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-12 22:18:01 +08:00
Alex Lyn	a116b252c8	Merge pull request #9236 from jodh-intel/docs-improve-install-details docs: install: Simplify instructions	2024-03-12 14:29:38 +08:00
Alex Lyn	a31fb35e5d	Merge pull request #9231 from UiPath/fix/clh-pid-init clh: initialize clh pid before using it	2024-03-12 13:43:24 +08:00
Alex Lyn	9f6003adde	runtime-rs: add a new netns field in struct QemuInner. We need to add a new netns field in struct QemuInner, and initialize it with the argument passed down in prepare_vm(). Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-11 16:02:39 +08:00
Alex Lyn	f571ec84d2	runtime-rs: add a public method to support process entering netns. The enter_netns function is designed as a public method to help VMMs running as an independent process enter a network namespace, reducing duplicate code. Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-11 15:55:52 +08:00
Alex Lyn	4176fcc3c6	runtime-rs: make the code for cleanup fd flags as public method. It just moves the related code to a public file (utils.rs) and makes it a common method for both vsock and network, or some others. Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com> Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-11 15:52:20 +08:00
Alex Lyn	b1038704e0	runtime-rs: make NetnsGuard common for hypervisor and resource. In order to better support non-builtin vmm usage of NetnsGuard and reduce code duplication, we need to move it to a common path that can be referenced by both hypervisor and resource manager. This patch just moves the code from network/utils/netns.rs to kata-sys-utils/src/netns.rs Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-11 15:38:42 +08:00
Alexandru Matei	617b0114b3	clh: initialize clh pid before using it The PID needs to be initialized before calling isClhRunning. waitVMM() uses isClhRunning and is called by launchClh() just before returning from function. Fixes: #9230 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2024-03-09 13:53:51 +02:00
Dan Mihai	88b7a44271	tests: k8s: k8s-exec.bats auto-generated policy Auto-generate policy for k8s-exec.bats. Fixes: #9247 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-08 17:48:20 +00:00
Steve Horsman	54e5ce2464	Merge pull request #9154 from chungeun-choi/change-deprecated-package fixed - Change the deprecated module from 'io/util' to util. 'io/util…	2024-03-08 15:05:43 +00:00
Steve Horsman	e9bbf2f67b	Merge pull request #9203 from fidencio/topic/releases-follow-up-III release: Ensure the release-type is passed to workflows	2024-03-08 14:09:36 +00:00
Alex Lyn	c73597c39d	Merge pull request #9208 from studychao/chao/fix_virt_ci Dragonball: fix unit test problems when switching to new virt github machine	2024-03-08 09:41:05 +08:00
Chengyu Zhu	d49391a555	Merge pull request #8798 from LindaYu17/setpolicy add setpolicy function to kata-runtime tool	2024-03-08 06:31:57 +08:00
Dan Mihai	5398b6466c	Merge pull request #9224 from 3u13r/sidecar-container genpolicy: add restartPolicy to container struct	2024-03-07 12:59:55 -08:00
GabyCT	35d8f82232	Merge pull request #9242 from GabyCT/topic/enabldebugnerd gha: Add collect artifacts step to nerdctl workflow	2024-03-07 13:34:40 -06:00
Wainer Moschetta	91998af173	Merge pull request #9114 from wainersm/ci_kbs_cli CI: add KBS utilities for attestation tests	2024-03-07 16:34:03 -03:00
Dan Mihai	4c3d6fadc8	genpolicy: default env if image doesn't have env Use containerd's default environment for container images that don't specify the Env field. Also, re-enable policy env variable verification, now that these uncommon images are supported too. Fixes: #9239 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-07 16:56:06 +00:00
Dan Mihai	b3a02d5e06	Merge pull request #9128 from microsoft/danmihai1/test-genpolicy tests: k8s: auto-generated policy	2024-03-07 08:50:47 -08:00
Fabiano Fidêncio	8faab965a7	gh: Fix payload-after-push tags We now expect the arch specific images to be tagged as kata-containers-latest-${arch}. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-07 12:02:51 +00:00
Fabiano Fidêncio	eab78cf1ba	release: Reword the extra notes added as part of the release We're trying to keep just the bare minimum info, as we really would like to not have the list of commits, and mainly the list of new contributors, truncated from the release notes. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-07 12:02:51 +00:00
Fabiano Fidêncio	658fb6972b	release: Ensure the release-type is passed to workflows We need to ensure the release type is passed down to workflows, otherwise we'll fail to get the correct release version for tagging the daemonset images. Fixes: #9064 - part III Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-07 12:02:51 +00:00
Alex Lyn	a0a50f5e52	Merge pull request #9191 from Apokleos/fix-kata-ctl-exec0 kata-ctl: Support using container short ID to enter guest.	2024-03-07 19:26:40 +08:00
Wainer dos Santos Moschetta	8ea9ac515e	tests/k8s: update kbs repository Recently confidential-containers/kbs repository was renamed to confidential-containers/trustee. Github will automatically resolve the old URL but we better adjust it in code. The trustee repository will be cloned to $COCO_TRUSTEE_DIR. Adjusted file paths and pushd/popd's to use $COCO_KBS_DIR ($COCO_TRUSTEE_DIR/kbs). On versions.yaml changed from `coco-kbs` to `coco-trustee` as in the future we might need other trustee components, so keeping it generic. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta	c669567cd3	tests/k8s: add utils to set KBS policies Added the kbs_set_resources_policy() function to set the KBS policy. Also the kbs_set_allow_all_resources() and kbs_set_deny_all_resources to set the "allow all" and "deny all" policy, respectively. Fixes #9056 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta	6f0d38094d	tests/k8s: add utils to set KBS resources Added utility functions to manage resources in KBS: - kbs_set_resource(), where the resource data is passed via argument - kbs_set_resource_from_file(), where the resource data is found in a file Fixes #9056 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta	2a374422c5	tests/k8s: add function to install kbs-client Added kbs_install_cli function to build and install the kbs-client executable if not present into the system. Removed the stub from gha-run.sh; now the install kbs-client in the .github/workflows/run-kata-deploy-tests-on-aks.yaml will effectively install the executable. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta	4141875ffd	ci/lib.sh: set GOPATH default value Scripts sourcing ci/lib.sh need to set $GOPATH, otherwise they will fail. This ensures that GOPATH is set to ${HOME}/go unless it is already exported. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta	e410aef4fa	tests/k8s: add utils to get kbs service address Added functions to return the service host, port or full-qualified HTTP address, respectively, kbs_k8s_svc_host(), kbs_k8s_svc_port(), and kbs_k8s_svc_http_addr(). Fixes #9056 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-07 11:20:36 +00:00
Leonard Cohnen	e30e8ab7dc	genpolicy: add restartPolicy to container struct This adds support for sidecar container introduced in Kubernetes 1.28 Fixes: #9220 Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2024-03-07 12:00:14 +01:00
Chungeun Choi	bad263f399	runtime: Replace deprecated module "io/ioutil" with "io" This change updates the module import to use 'io' instead of the deprecated 'io/ioutil' Fixes: #9166 Signed-off-by: Chungeun Choi <ce.choi@okestro.com>	2024-03-07 10:56:06 +00:00
Alex Lyn	ef9a38e551	shim-interface: add Copyright of AntGroup in file shim-interface.rs Fixes: #9189 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-07 15:46:32 +08:00
Alex Lyn	2972a3a675	shim-interface: add UT for get_uds_with_sid Fixes: #9189 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-07 15:45:44 +08:00
Alex Lyn	7145243bd3	kata-ctl: Support using container short ID to enter guest. Fixes: #9189 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-07 15:44:47 +08:00
Linda Yu	bb77d2d7e6	docs: add docs on how to set policy by kata-runtime Fixes: #8797 Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-03-07 15:00:23 +08:00
Linda Yu	1c5693be86	stream: repeat copybuffer if it is blocked by policy copyBuffer returns and the streams will be closed when error occurs. If the error contains "blocked by policy" it means the log output is disabled by policy with "ReadStreamRequest" and "WriteStreamRequest" set to false. But at this moment, we want the real stream still working (not be seen) because we might want to enable logging for debugging purpose, so we repeat copybuffer in this case to avoid streams being closed. Fixes: #8797 Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-03-07 15:00:23 +08:00
Linda Yu	eda419cb03	kata-runtime: add set policy function to kata-runtime Logging/debugging information will probably be disabled in production for security reasons, but we should still provide a way for customers to get logging information at runtime. This PR implements a setpolicy function in the kata-runtime tools, although it can set the whole policy, not only logging. setpolicy invokes remote attestation, which means that before setting a policy at runtime, the user has to reconfigure the new policy hash in KBS/AS. usage: kata-runtime policy set policy.rego --sandbox-id XXXXXXXX Fixes: #8797 Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-03-07 15:00:23 +08:00
Dan Mihai	c08b696d9e	tests: k8s: k8s-shared-volume generated policy Auto-generate policy for k8s-shared-volume.bats. Fixes: #9096 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-07 05:57:30 +00:00
Dan Mihai	b24758fad8	tests: k8s: k8s-scale-nginx auto-generated policy Auto-generate policy for k8s-scale-nginx.bats. Fixes: #9096 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-07 05:57:30 +00:00
Dan Mihai	af9ac8d194	tests: k8s: k8s-replication auto-generated policy Auto-generate policy for k8s-replication.bats. Fixes: #9096 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-07 05:57:30 +00:00
Dan Mihai	56689c6800	tests: k8s: k8s-qos-pods auto-generated policy Auto-generate policy for k8s-qos-pods.bats. Fixes: #9096 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-07 05:57:30 +00:00
Dan Mihai	0179f53469	tests: k8s: k8s-parallel auto-generated policy Auto-generate policy for k8s-parallel.bats. Fixes: #9096 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-07 05:57:30 +00:00
Dan Mihai	73a8b61c2e	Merge pull request #9243 from microsoft/danmihai1/genpolicy-unblock-ci genpolicy: disable env variable verification	2024-03-06 21:44:18 -08:00
Dan Mihai	e61ef30a76	genpolicy: disable env variable verification Disable env variable verification to unblock CI, until container images that don't specify the Env variables will be handled correctly (see #9239). Also, mark the image config Env field as optional, thus allowing policy generation for these container images. Fixes: #9240 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-07 01:59:18 +00:00
Gabriela Cervantes	94fdcda7f7	scripts: Add collect artifacts function in nerdctl gha run script This PR adds the collect artifacts function in nerdctl gha run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-06 19:48:12 +00:00
Gabriela Cervantes	f902ee78d0	gha: Add collect artifacts step to nerdctl workflow This PR adds the collect artifacts step to nerdctl workflow. Fixes #9241 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-06 19:41:16 +00:00
GabyCT	640ed591bd	Merge pull request #9219 from GabyCT/topic/fixkerneldoc docs: Remove stale kernel information at README documentation	2024-03-06 10:24:31 -06:00
James O. D. Hunt	b1d4cbd9d1	utils: spell-checker: Fix grep warning Fix the `grep(1)` warning caused by the unnecessary escaping of the hash/sharp symbol. Fixes: #9235. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-06 13:21:15 +00:00
James O. D. Hunt	5257bfa9a9	docs: install: Simplify instructions Move the "build from source" and "manual installation" details to the developer guide. This makes the installation landing page clearer for users. Fixes: #9234. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-06 13:14:03 +00:00
Ryan Savino	fdfc825bc4	Merge pull request #9174 from ryansavino/snp-qemu-stable-coco-tag versions: SNP qemu updated to stable coco tagged version	2024-03-06 01:03:10 -06:00
GabyCT	83e39a206c	Merge pull request #9223 from jodh-intel/tests-add-k3s-artifacts tests: Add k3s artifacts	2024-03-05 13:37:21 -06:00
James O. D. Hunt	a67ed2f1c2	tests: Add k3s artifacts The k3s distribution of k8s uses an embedded version of containerd and configures it to log to a file, not the journal. Hence, although we collect the journal as a test artifact, we also need to collect the actual log files for containerd. Also collect the k3s containerd config files to help with debugging. Fixes: #9104. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-05 17:54:20 +00:00
GabyCT	9fab57acc8	Merge pull request #9217 from wainersm/revert_collect_artifacts gha: export start_time to collect artifacts properly	2024-03-05 11:11:49 -06:00
Gabriela Cervantes	12be4cf828	docs: Remove stale kernel information at README documentation This PR removes stale kernel information at README documentation. Fixes #9218 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-05 16:46:45 +00:00
Wainer dos Santos Moschetta	9ba5e3d2a8	gha: export start_time to collect artifacts properly The jobs running on garm will collect journal information. The data gathered is based on the time the tests started running. The $start_time is exported on run_tests() and used in collect_artifacts(). It happens that run_tests() and collect_artifacts() are called on different steps of the workflow and the environment variables aren't preserved between them, i.e, $start_time exported on the first step is not available on the subsequents. To solve that issue, let's save $start_time in the file pointed out by $GITHUB_ENV that Github actions uses to export variables. In case $GITHUB_ENV is empty then probably it is running locally outside of Github, so it won't save the start time value. Fixes #9217 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-05 12:15:20 -03:00
James O. D. Hunt	b761a80bd1	Merge pull request #9059 from jodh-intel/kata-manager-add-hypervisor-option kata-manager: Allow hypervisor to be changed	2024-03-05 09:30:04 +00:00
Alex Lyn	bf5edc8e73	Merge pull request #9155 from Jimmy-Xu/fix-build-gpu-kernel gpu: fix build guest kernel with gpu	2024-03-05 16:53:44 +08:00
Greg Kurz	0320198889	Merge pull request #9206 from lifupan/main CI: fix the issue of ci failure on crio	2024-03-05 09:52:13 +01:00
Fupan Li	628f57aca0	Merge pull request #9193 from UiPath/fix/clh-dax clh: Enable DAX for rootfs	2024-03-05 09:39:22 +08:00
Wainer Moschetta	38088a934b	Merge pull request #9184 from wainersm/fix_kata_deploy_bats tests/kata-deploy: fix checker for kata-deploy running	2024-03-04 20:50:37 -03:00
GabyCT	77d048da4d	Merge pull request #9065 from wainersm/ci_install_kbs CI: Install KBS on k8s for attestation tests	2024-03-04 16:59:01 -06:00
GabyCT	a4153f3b71	Merge pull request #9210 from GabyCT/topic/addtestreadme docs: Add general README for tests section	2024-03-04 16:54:28 -06:00
Gabriela Cervantes	5d50262422	docs: Add general tests documentation in main README This PR adds the general tests documentation in main README of the kata containers repository. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-04 21:53:01 +00:00
Gabriela Cervantes	d5fa2bebd5	docs: Add general README for tests section This PR adds general README documentation for the tests section in the kata containers repository. Fixes #9209 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-04 21:50:37 +00:00
GabyCT	4dea9019ab	Merge pull request #9126 from GabyCT/topic/addartifactsk gha: Storing artifacts for logs of k8s tests garm	2024-03-04 15:41:54 -06:00
Gabriela Cervantes	fc5e040d96	scripts: Apply general fixes to variables in gha-run script This PR applies general fixes to variables in gha-run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-04 18:54:15 +00:00
James O. D. Hunt	7af892f8d8	docs: Update kata-manager docs for switching hypervisor Add details to the README for `kata-manager` showing how to list available hypervisor configs (packaged and local), and switch between the configurations. Also, update the hypervisors page to show a lot more detail about the hypervisor configurations, including the "short name" used by `kata-manager` for switching hypervisor config. > Note: > > These changes only apply to the current default golang runtime. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-04 12:24:31 +00:00
James O. D. Hunt	4f6fef1f61	docs: Whitespace fix Remove extraneous whitespace from hypervisors doc. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-04 12:18:05 +00:00
James O. D. Hunt	1ac3caf656	kata-manager: Allow hypervisor to be changed Add new options to allow the configured hypervisor to be changed: - `-L`: List available _packaged_ hypervisor config short names. - `-e`: List available _local_ hypervisor config names. - `-H <hypervisor>`: Install Kata then switch to the specified hypervisor. - `-S <hypervisor>`: Switch to the specified hypervisor (by config short name [Errors if Kata not installed]). For example, to install Kata and configure it to use Cloud Hypervisor with the golang Kata runtime: ```bash $ kata-manager.sh -H clh ``` To switch back to the default hypervisor: ```bash $ kata-manager.sh -S default ``` To show details of the available packaged configs: ```bash $ kata-manager.sh -L ``` To show details of the local configs: ```bash $ kata-manager.sh -e ``` > Notes: > > - This change only applies to the current default (golang) Kata runtime. > > - Although this is mainly for users wishing to switch hypervisor (by > changing the Kata config file to another of the packaged config files > provided for specific hypervisors), strictly it allows users to change > to _any_ config file. For example, if the user has a config file called > `/etc/kata-containers/configuration-my-custom-config.toml`, they could > switch to this by running: > > ```bash > $ kata-manager.sh -S my-custom-config > ``` > > - The "config short names" are the hypervisor specific part of the configuration file name. > For example, the config short name for file `configuration-qemu.toml` is > `qemu` and the config short name for `configuration-clh.toml` is `clh`. Fixes: #8305. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-04 12:18:00 +00:00
James O. D. Hunt	0bb558c0b9	kata-manager: Fix symlink handling The `configure_kata()` function modifies the configuration file to enable debug. But it was doing this by calling `sed -i` which, by default, creates a new _file_ from the `configuration.toml` symbolic link. This defeated the point of the symbolic link which is supposed to resolve to the local copy of the pristine config file, so we now use the GNU sed(1) specific `--follow-symlinks` option to retain the sym-link. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-04 11:15:39 +00:00
James O. D. Hunt	455637b30a	kata-manager: Show message when checking file Add an info message just before the archive file is checked. This keeps the user informed about what is happening as it can take a few seconds to perform the checks on slower systems. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-04 11:15:39 +00:00
James O. D. Hunt	ce350450e8	kata-manager: Sort options in usage Ensure the usage statement lists all options in alphabetical order. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-04 11:15:39 +00:00
James O. D. Hunt	159d29665a	kata-manager: Whitespace fixes Remove extraneous whitespace. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-04 11:15:39 +00:00
Chao Wu	9f0eab904b	Dragonball: fix test_signal_handler a) Some unknown syscalls are triggered on the new GitHub virtual machines, which breaks the make test process with SIGSYS after applying SeccompFilter. To fix this, we change the allowlist in this unit test for the seccomp filter into a blocklist to avoid hitting the unknown syscalls. b) The lazy static METRICS is not fully initialized in the unit test, which may lead to unstable results for this UT. fixes: #9207 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2024-03-04 16:27:27 +08:00
Chao Wu	253fe72435	Dragonball: fix test_handler_insert_region The mmap region's start guest address was hard-coded, and a later check verifies that this address is larger than or equal to mem_end (which defaults to host_phy_mem >> 1) in order to satisfy the requirement for DaxMemory. Since the GitHub virtual machines have more physical memory than the CI machines we used previously, the hard-coded value no longer works. To fix this, we change the address to mem_end in the unit test so that it no longer depends on the host machine. fixes: #9207 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2024-03-04 16:27:19 +08:00
Jimmy-Xu	5ada7329b8	gpu: fix build guest kernel with nvidia gpu - enable CONFIG_MTRR,CONFIG_X86_PAT on x86_64 for nvidia gpu - optimize -f of build-kernel.sh, clean old kernel path and config before setup - add kernel 5.16.x Fixes: #9143 Signed-off-by: Jimmy-Xu <xjimmyshcn@gmail.com>	2024-03-04 09:40:42 +08:00
Fupan Li	07e0cf1855	CI: fix the issue of ci failure on crio PR #8760 tentatively tried to have the shim to run in its own mount namespace for the sake of improving isolation between the sandbox and the host. Thus crio storage drivers shouldn't create a PRIVATE bind mount on their home directory. Otherwise, the container's rootfs mount wouldn't be propagated to kata runtime's mount namespace, and kata runtime couldn't access the container's rootfs files. So, when kata cooperated with crio, crio should set skip_mount_home=true for its storage overlay. Fixes: #9028 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-03-03 20:53:36 +08:00
Wainer dos Santos Moschetta	2c24977cb1	tests/k8s: allow to overwrite the cluster name _print_cluster_name() creates a string based on information like the pull request number and commit SHA. However, when you are developing the scripts you might want to use an arbitrary name, so the $AKS_NAME variable was introduced; once exported, it overwrites the generated name. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-02 12:42:35 -03:00
Wainer dos Santos Moschetta	5e4b7bbd04	tests/k8s: expose KBS service externally Until this point the deployed KBS service is only reachable from within the cluster. This introduces a generic mechanism to apply an Ingress configuration to expose the service externally. The first implemented ingress is for AKS. In case the HTTP application routing isn't enabled in the cluster (this is required for ingress), an add-on is applied. The get_cluster_specific_dns_zone() and enable_cluster_http_application_routing() helper functions were added to gha-run-k8s-common.sh. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-02 12:42:35 -03:00
Wainer dos Santos Moschetta	e1e0b94975	tests/k8s: introduce the CoCo kbs library Introduce the tests/integration/kubernetes/confidential_kbs.sh library that contains functions to manage the KBS on CI. Initially implemented the kbs_k8s_deploy() and kbs_k8s_delete() functions to, respectively, deploy and delete KBS on Kubernetes. Also hooked those functions in the tests/integration/kubernetes/gha-run.sh script to follow the convention of running commands from Github Workflows: $ .tests/integration/kubernetes/gha-run.sh deploy-coco-kbs $ .tests/integration/kubernetes/gha-run.sh delete-coco-kbs Fixes #9058 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-02 12:39:26 -03:00
Wainer dos Santos Moschetta	6a28c94d99	tests/k8s: add a kustomize installer Kustomize has been used on some of our internal components (e.g. kata-deploy) to manage k8s deployments. On CI it has been used the `sed` tool to edit kustomization.yaml files, but `kustomize` is more suitable for that purpose. So in order to use that tool on CI scripts in the future, this commit introduces the `install_kustomize()` function that is going to download and install the binary in /usr/local/bin in case it's found on $PATH. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-02 12:39:26 -03:00
Xuewei Niu	daab76de36	Merge pull request #9201 from liubogithub/liubo/dev/panic_fix_3 katautils: fix panic on tracing.	2024-03-02 10:27:02 +08:00
GabyCT	4a0cfc4e3f	Merge pull request #9199 from GabyCT/topic/enablecri gha: Enable cri-containerd tests for cloud hypervisor runtime-rs	2024-03-01 12:23:16 -06:00
Steve Horsman	1ec33b8879	Merge pull request #9200 from wainersm/ci_install_kbs-timeout gha: increase timeout of KBS steps	2024-03-01 16:00:21 +00:00
Gabriela Cervantes	7299dbdb43	gha: Store journalctl logs This PR stores the journalctl logs. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-01 15:17:20 +00:00
Gabriela Cervantes	342d3a320d	gha: Add collect artifacts function in gha-run script This PR adds the collect artifacts function in gha-run script for the kubernetes tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-01 15:17:20 +00:00
Gabriela Cervantes	2070e3481e	gha: Storing artifacts for logs of k8s tests garm This PR helps to store the artifacts for different logs for k8s tests on garm. Fixes #9103 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-01 15:17:20 +00:00
Greg Kurz	df17bf95d5	Merge pull request #9169 from ldoktor/backport-ocp ci.ocp: Backport service-up detection fixes	2024-03-01 16:09:55 +01:00
Greg Kurz	dc6bda19bf	Merge pull request #9179 from gkurz/fix-k8s-sandbox-vcpus-allocation-check tests: k8s: Adapt k8s-sandbox-vcpus-allocation.bats to kubernetes v1.29	2024-03-01 15:55:07 +01:00
Lukáš Doktor	6fffbaa190	ci.ocp: Backport service-up detection fixes This backports the: 9060e930caf2d20f413df07778d3ab497493161c ci.ocp: Add debug output on HTTP service failure these logs are vital to analyze a setup failure. a10a1e2c9cbc21afc1e80f22b0fb8634d27cbd8d ci.ocp: Improve the service-up detection waiting for the first response is not sufficient as OCP returns html page without error even when the route is not yet established describing the issue (why it doesn't reply with 500?). Waiting for the correct output should do better. commits from the kata-containers/tests repo. Fixes: #8653 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-01 12:04:20 +01:00
Alex Lyn	13a20957cb	Merge pull request #9164 from Apokleos/directvol-csi-dockerfile csi-kata-directvolume: add Dockerfile for building csi image	2024-03-01 18:12:19 +08:00
Alex Lyn	f69428a1e7	csi-kata-directvolume: add Dockerfile for building csi image Fixes: #9163 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-01 10:41:51 +08:00
Liu Bo	b6f8355ea3	katautils: fix panic on tracing. This fixes a panic on tracing on container exit. The root cause is that global var needs to be set by "=" instead of ":=". Fixes: #9102 Signed-off-by: Liu Bo <liub.liubo@gmail.com>	2024-02-29 18:40:23 -08:00
Wainer dos Santos Moschetta	24c163e6e1	tests/kata-deploy: fix checker for kata-deploy running Currently, the check for whether kata-deploy is running assumes that the daemonset has scheduled at least one pod; however, it might not have, in which case the kubectl wait command fails due to "error: no matching resources found". On CI I've observed this fail intermittently. I suspect the service account kata-deploy-sa takes a while to show up, so no kata-deploy pod is scheduled in the meantime. Changed the checker logic to use waitForProcess() to keep testing whether it is already running, or hit the timeout (still 10m). Fixes #9183 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-29 22:26:27 -03:00
Wainer dos Santos Moschetta	4410df7233	gha: increase timeout of KBS steps The timeout of the step that deploys KBS in the run-k8s-tests-on-aks workflow should be increased so that there is enough time to check that the service is healthy and exposed. The same applies to the step that builds the kbs-client, which requires enough time to build the executable. Fixes #9058 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-29 22:05:58 -03:00
Dan Mihai	11b603e5f1	Merge pull request #9139 from microsoft/saulparedes/genolicy_panic_subpath genpolicy: panic when we see a volume mount subpath	2024-02-29 12:18:56 -08:00
Gabriela Cervantes	beb592b309	gha: Enable cri-containerd tests for cloud hypervisor runtime-rs This PR enables the cri-containerd tests for cloud hypervisor runtime-rs. Fixes #9198 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-29 20:18:16 +00:00
GabyCT	a4f5815f6b	Merge pull request #9182 from GabyCT/topic/addclhcri gha: Add cloud-hypervisor (runtime-rs) support to cri-containerd tests	2024-02-29 14:12:01 -06:00
Gabriela Cervantes	0f595cf15b	gha: General variable fixes to gha-run script This PR adds general variable fixes to gha-run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-29 18:15:27 +00:00
Alexandru Matei	6856e8f678	clh: Enable DAX for rootfs Fixes: #9192 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2024-02-29 18:01:47 +02:00
Greg Kurz	f3442cdef9	tests: k8s: Adapt k8s-sandbox-vcpus-allocation.bats to kubernetes v1.29 Kubernetes v1.29 introduced a new `PodReadyToStartContainers` condition that gets inserted at index 0 in the conditions array. This means that the expected `PodCompleted` reason can now be either at index 0 with kubernetes v1.28 and older or at index 1 starting with kubernetes v1.29. This is fragile at best since the `kubectl wait` doesn't allow to combine multiple checks. Also, checking the reason is dubious as it doesn't really tell if the pods have actually completed or not. Check the pod phase to be `Succeeded` instead, this guarantees that : > All containers in the Pod have terminated in success, and will not > be restarted. Fixes #9178 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-02-29 17:00:31 +01:00
Greg Kurz	f89120662d	tests: k8s: Wait for all pods concurrently A single invocation of `kubectl wait` can handle all pods. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-02-29 17:00:31 +01:00
Greg Kurz	58bc026656	Merge pull request #9180 from fidencio/topic/actually-add-the-pause-image-into-the-rootfs rootfs: Fix PAUSE_IMAGE_TARBALL addition to the rootfs	2024-02-29 13:56:32 +01:00
Chengyu Zhu	c01ba58b3d	Merge pull request #9176 from ChengyuZhu6/stale_doc docs: renew stale link	2024-02-29 18:35:26 +08:00
Fabiano Fidêncio	1d2f7afd1f	Merge pull request #9188 from fidencio/topic/releases-follow-up-II releases: Second round of follow-up fixes	2024-02-29 10:59:36 +01:00
Fabiano Fidêncio	c9dfe49152	gha: payload: Fix env var declarations This was introduced by `a45988766c`, but didn't follow the correct format for the env declaration. Fixes: #9064 - part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-29 10:52:49 +01:00
Fabiano Fidêncio	1c3a769822	gha: payload: Don't use concurrency for this job We want all payloads to be built and published, regardless if there's a new PR merged. This will help people to easily trace / debug issues. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-29 10:52:45 +01:00
Fabiano Fidêncio	02af62b66c	gha: payload: Stop generating payloads for the stable branches We've decided to not maintain stable branches anymore, thus we can only trigger this workflow for the `main` branch. For more details, please, see: https://github.com/kata-containers/kata-containers/issues/9064 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-29 10:42:25 +01:00
Fabiano Fidêncio	b4061a1c23	Merge pull request #9170 from fidencio/topic/releases-follow-up-I release: Add the needed fixes for the release process	2024-02-29 10:36:20 +01:00
ChengyuZhu6	e5d3627794	docs: renew stale link Renew the stale link "https://github.com/containerd/containerd/tree/main/runtime/v2" to the latest "https://github.com/containerd/containerd/tree/main/core/runtime/v2". Fixes: #9177 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-29 15:03:02 +08:00
Fabiano Fidêncio	0022474164	rootfs: Fix PAUSE_IMAGE_TARBALL addition to the rootfs We were never passing the arguments to add the PAUSE_IMAGE to the rootfs, leading to it never being present in the confidential image / initrd. Fixes: #9032 -- part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 22:42:27 +01:00
GabyCT	aacbbde35d	Merge pull request #9172 from GabyCT/topic/docpradvice docs: Update Code PR advice document	2024-02-28 13:37:28 -06:00
Gabriela Cervantes	3cd319fcc2	scripts: General fixes to the gha-run script This PR implements general fixes to the gha-run script for the cri-containerd tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-28 19:32:51 +00:00
Gabriela Cervantes	5a498948c8	scripts: Skip cri-containerd in gha-run script This PR skips the cri-containerd in gha-run script for cloud hypervisor runtime-rs. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-28 19:30:38 +00:00
Gabriela Cervantes	4bfb9c30e7	gha: Add cloud-hypervisor (runtime-rs) support to cri-containerd tests This PR adds the Cloud Hypervisor driver, integrated with the runtime-rs, as part of the cri-containerd tests. Fixes #9181 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-28 19:24:18 +00:00
Wainer Moschetta	c4b8270073	Merge pull request #9009 from wainersm/runk_bats tests/runk: fix the "run ps command" flaky test	2024-02-28 15:58:36 -03:00
Wainer Moschetta	129ce84705	Merge pull request #9116 from wainersm/ci_install_kbs-workflow gha: k8s: prepare AKS workflow to install the CoCo KBS	2024-02-28 14:43:41 -03:00
Gabriela Cervantes	ec1dde1d01	docs: Update Code PR advice document This PR updates the code pr advice document to make the proper references now that we have moved the test repository to the kata containers repository. Fixes #9171 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-28 16:14:22 +00:00
Ryan Savino	9e9dae8efb	versions: SNP qemu updated to stable coco tagged version New qemu fork of AMDESE created in confidential-containers project. SNP qemu version now pointed to stable tag at: https://github.com/confidential-containers/qemu/tree/amd-snp-202402240000 Fixes: #9173 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-02-28 09:28:14 -06:00
Fabiano Fidêncio	068d80a9cb	docs: releases: Update link for the release actions This allows users to go directly to the action page whenever a release needs to be cut. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:56 +01:00
Fabiano Fidêncio	520cd90c43	release: Remove the "test-" from the release version This is not needed anymore as we can run the tests from any branch, and we can patch this locally before doing a test. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:56 +01:00
Fabiano Fidêncio	22b19d0637	release: Add a step to get the release tags GitHub actions is fun and always willing to play tricks with us. This nice little kid decided that `echo "FOO=\"bar zaz\"" >> $GITHUB_ENV` is not valid, and it simply breaks things in a way that is a pain to debug. But hey, we take this path, and after doing so I realised that the correct way to export that is `echo "FOO=bar zaz" >> $GITHUB_ENV`. I know, this looks incorrect, but this fellow never stops surprising us. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:56 +01:00
Fabiano Fidêncio	cdf1e4afde	release: Fix typo in the arm arch For some reason I'd changed arm64 to arm4 in a previous (already merged) commit. :-/ Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:56 +01:00
Fabiano Fidêncio	3db0630bc1	release: Add our own bits to the release notes I'm getting here the most relevant parts of what we had as part of the release-notes.sh script. As the script will not be used anymore, it's been removed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:56 +01:00
Fabiano Fidêncio	aaf38aca98	release: Fix typo in the _upload_libseccomp_tarball() RELEASE_VERSIOB -> RELEASE_VERSION Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:56 +01:00
Fabiano Fidêncio	397167836b	release: Fix yq installation For some reason we need to force its installation in the GOPATH, otherwise yq is not found. Ideally we should switch to a packaged version of yq, but that's a topic for another series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:55 +01:00
Fabiano Fidêncio	6915131adc	release: Fix KATA_DEPLOY_{IMAGE_TAGS,REGISTRIES} declaration Otherwise we may end up with an unbound variable. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:55 +01:00
Fabiano Fidêncio	757f958943	release: Adjust tags used to publish our daemonset We need to adjust the tags as when this workflow ends up being called from the release side, we'll receive "refs/tags/main" as the GITHUB_REF, and in that case we must use the release version. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:51 +01:00
Fabiano Fidêncio	d339366a16	release: Get the release version from our internal function This is utterly counter intuitive, but if we change a file during the GitHub Action, the checkout done for the next workflow won't have that file updated, but rather the branch on its original state when the workflow was created. This makes us safe to always "calculate" the next release version from the VERSION file at the time the workflow was triggered. This requires us to have the release type exported for the whole workflow. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:30:06 +01:00
Fabiano Fidêncio	8023d64b1a	release: Adjust "needs" in the release workflow Without those we'll end up running steps in parallel that should actually wait for a previous step to be completed. While here, let's also correct some of the "needs" that were waiting for the wrong workflow to be finished. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:30:06 +01:00
Fabiano Fidêncio	d10b818de5	release: Add missing return to _check_required_env_var() Otherwise none of the calls to this function will actually continue after it's called. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:30:06 +01:00
Fabiano Fidêncio	0aa82e7050	release: Add missing env vars to _check_required_env_var() We missed doing this as part of `50011e89a0`, but we also need to check for: * RELEASE_VERSION * GH_TOKEN * ARCHITECTURE * KATA_STATIC_TARBALL While here, let's fix a ARCHITECURE -> ARCHITECTURE typo. Fixes: #9064 -- part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:30:05 +01:00
Chengyu Zhu	bb4c608b32	Merge pull request #9110 from ChengyuZhu6/agent_option agent: Add all agent configuration options to README	2024-02-28 18:50:44 +08:00
Dan Mihai	352e2af5f0	Merge pull request #9153 from microsoft/danmihai1/clh-bootVM-timeout runtime: clh: minimum 10s timeout for CreateVM + BootVM	2024-02-27 09:58:01 -08:00
Wainer dos Santos Moschetta	b44e0c4e7c	gha: k8s: prepare AKS workflow to install the CoCo KBS Changed the "run k8s tests on AKS" workflows to get the CoCo KBS installed so that we can run attestation tests. The plan is to run attestation tests only on a subset of non-TEE jobs initially, so this commit restricts the KBS installation to the kata-qemu configuration. At this point only stub commands are added to tests/integration/kubernetes/gha-run.sh; they should be implemented in a future commit. Fixes #9058 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-27 13:51:15 -03:00
Wainer Moschetta	6186410e35	Merge pull request #8949 from wainersm/tests_nydus tests/nydus: refactor the teardown()	2024-02-27 09:52:44 -03:00
ChengyuZhu6	731c490ded	agent: Add all agent configuration options to README Add all agent configuration options to README so that users can more easily understand what these options do and how to configure them at runtime. Fixes: #9109 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-27 17:35:19 +08:00
Fabiano Fidêncio	4aa40f1bbb	Merge pull request #9146 from fidencio/topic/releases release: Update everything in this repo related to the release and its process	2024-02-27 10:30:49 +01:00
Fabiano Fidêncio	111bb3ec66	release: Add "test-" into the release name This commit should be merged as it's now, then we trigger a test release, fix whatever has to be fixed, and drop it as soon as we know our workflows are working as expected. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:03 +01:00
Fabiano Fidêncio	d69766c0b2	docs: Update the release process Now that we've simplified it by quite a lot, let's update the documentation accordingly. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:03 +01:00
Fabiano Fidêncio	a85481110a	releases: Remove scripts that won't be used anymore Those are not needed anymore as we're automating our release process around GitHub actions. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:03 +01:00
Fabiano Fidêncio	e714c37521	gha: Remove workflows related to backporting stuff We're not doing backports anymore, as we're getting rid of the stable branches in favour of having a better release cadence from the main branch. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	3229c777e7	kata-deploy: Remove "stable" yamls As we're not maintaining a stable branch anymore, let's get rid of the kata-deploy stable pieces. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	008293f015	gha: Add release-{major,minor} workflows Those will allow us to cut a release just by a single click, instead of the current process we have. Fixes: #9064 -- part I Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	f9f04dca2b	gha: release: Update the workflow The release workflow is now updated to be a `workflow_call`, and it includes the steps that had to be manually done in the past, such as updating the needed files and creating the release itself. While on this, the kata-deploy multiarch manifest tags have been updated to match the new release scheme. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	f0675a163a	release: Add _next_release_version() This function returns the version of the next release (the one about to be cut), and it'll be used as part of our new workflow that will take care of the release. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	4675364d8d	release: Add _update_version_file() function Let's add a function that will be responsible for bumping the project's version in the VERSION file, and push it to the branch as part of the release process that will be introduced. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	a99f9026e1	release: Add _create_new_release() This is a helper function that will be used to create a new release as part of our release process workflow (which will still be modified). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	fd699625fe	release: Add _upload_libseccomp_tarball() As the name of the function says, it's responsible for uploading the libseccomp source tarballs as part of our release process. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	d517fa54ac	release: Add _upload_vendored_code_tarball() As hinted by the name of the function, this is used to generate and upload the vendored code we have as its own tarball. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	94b30fcb14	release: Add _upload_versions_yaml_file() As the name says, this function will be used to upload the versions.yaml file during a given release process of the project. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	50011e89a0	release: Add _upload_kata_static_tarball This function, as its name says, will be used to upload the kata-static.tar.xz tarballs generated during the release process. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	a45988766c	release: Add _publish_multiarch_manifest() This function, as its name says, will be used to publish multiarch manifests for the Kata Containers CI and Kata Containers releases. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:01 +01:00
Fabiano Fidêncio	fb2ef32c04	release: Introduce the release.sh helper For now this script does nothing, but we're introducing it in order to reduce the diffs for the next commits in this series. My intention is to have as much as possible related to the release as part of this helper script, and it'll be populated function by function while replacing content that's "hard coded" (and duplicated) on different GitHub actions. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:01 +01:00
GabyCT	1a6c378d26	Merge pull request #9161 from GabyCT/topic/testsreadme docs: Update link for tests in README	2024-02-26 14:50:46 -06:00
Gabriela Cervantes	94615a4fd4	docs: Update link for tests in README This PR updates the link for the tests in README for Kata Containers. Fixes #9160 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-26 15:43:33 +00:00
Wainer dos Santos Moschetta	0f8c36d990	tests/nydus: refactor the teardown() This refactors the teardown() of tests/integration/nydus/nydus_tests.sh: * Moved boilerplate code that kills processes to a loop; * Doesn't leave teardown() if a process failed to get killed, so that other clean up routines are run; * Checks that the pid exists before attempting to kill the process, which avoids this misleading message: ``` Usage: kill [options] <pid> [...] Options: <pid> [...] send signal to every <pid> listed -<signal>, -s, --signal <signal> specify the <signal> to be sent -q, --queue <value> integer value to be sent with the signal -l, --list=[<signal>] list all signal names, or convert one to a name -L, --table list all signal names in a nice table -h, --help display this help and exit -V, --version output version information and exit For more details see kill(1). ``` Fixes #8948 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-26 11:21:43 -03:00
Wainer dos Santos Moschetta	0f0ce9a81b	tests/runk: replace the busybox image It's recommended to avoid images from docker.io to avoid errors related with hitting the pull limits that happens mostly on bare-metal machines. So this replaced the docker.io's busybox with quay.io/prometheus/busybox. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-26 11:11:05 -03:00
Wainer dos Santos Moschetta	bba8b5b2b4	tests/runk: fix flaky test The "run ps command" test has failed once in a while because it doesn't wait for the sh command to start within the container; consequently `ps` won't report the expected number of lines. Fixes #8975 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-26 11:09:29 -03:00
Wainer dos Santos Moschetta	28a63070f7	gha: fix step name in run-runk-tests Likely copied from the tracing workflow by mistake. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-26 11:09:29 -03:00
Wainer dos Santos Moschetta	8a606eb94d	tests/runk: convert to bats Migrated runk tests from pure shell script to bats to be consistent with other test suites. The install_dependencies() will install the bats tool locally. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-26 11:09:23 -03:00
Xuewei Niu	bb5e33b33a	Merge pull request #9100 from littlejawa/fix_5738_metrics_memory runtime: remove kata_shim_netdev metric	2024-02-26 19:01:21 +08:00
James O. D. Hunt	0ea30f44cf	Merge pull request #9076 from jodh-intel/add-survey-link-to-release-notes packaging: release notes: Don't show shortlist by default, and add survey link	2024-02-26 10:25:19 +00:00
Steve Horsman	483ecbadf0	Merge pull request #9142 from ChengyuZhu6/protoc build-checks: Install protoc in the ci environments	2024-02-26 09:52:31 +00:00
Dan Mihai	f4509b806b	runtime: clh: minimum 10s timeout for CreateVM + BootVM Relax the timeout for calling CLH's CreateVM + BootVM APIs. When hitting the older 1s timeout, killing a half-booted Guest and retrying the same boot sequence could be wasteful and result in unstable CI testing on slower Hosts. Fixes: #9152 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-24 19:15:57 +00:00
GabyCT	4f3c83cd12	Merge pull request #9115 from GabyCT/topic/adddief scripts: Add an enhanced die function	2024-02-23 12:03:02 -06:00
Saul Paredes	9b7bd376eb	genpolicy: panic when we see a volume mount subpath Based on https://github.com/kata-containers/runtime/issues/2812 Fixes: #9145 Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-02-23 09:56:51 -08:00
James O. D. Hunt	8c72abe38d	packaging: Add link to survey in release notes Add a link in the release notes to the Kata Container survey, to advertise it, and hopefully encourage users to take the survey. Fixes: #9074. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-02-23 09:57:52 +00:00
James O. D. Hunt	0391c0de82	packaging: Add twistie to release notes shortlog Add a "twistie" / arrow (`▶`) that the user can click on to see the full list of commits _if they want to_. This way, the release notes become easier to read and we can display information below the shortlog which would (probably) normally not be seen due to the huge long list of commits. Fixes: #9075. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-02-23 09:57:52 +00:00
ChengyuZhu6	3cc55ff8af	build-checks: Install protoc in the ci environments To test PR #8484 for pulling image in the guest with image-rs, the compilation process for the kata-agent relies on protoc: https://github.com/kata-containers/kata-containers/actions/runs/8016317290/job/21898040849?pr=8484 https://github.com/kata-containers/kata-containers/actions/runs/8016534530/job/21898654435?pr=8484 Fixes: #9141 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-23 17:38:13 +08:00
Xuewei Niu	89c76d7d8d	Merge pull request #9125 from gkurz/fix-agent-cgroup-ns agent: Run container workload in its own cgroup namespace (cgroup v2 guest only)	2024-02-23 10:40:17 +08:00
Steve Horsman	e342a9adc4	Merge pull request #9119 from ChengyuZhu6/pause-confidential kata-deploy: Add pause image to confidential rootfs	2024-02-22 17:10:55 +00:00
Steve Horsman	531dcd2f25	Merge pull request #9132 from ChengyuZhu6/nydus-snapshotter-version gha: bump nydus snapshotter version to v0.13.8	2024-02-22 17:10:42 +00:00
Steve Horsman	dfa6e932bb	Merge pull request #9122 from ChengyuZhu6/snapshotter-clean gha: try to cleanup nydus snapshotter before deploying it	2024-02-22 13:30:04 +00:00
Julien Ropé	1c306fe4a6	runtime-rs: stop reporting net dev metrics for the shim For consistency with the go runtime. As the shim itself is not using the network (all its communication with other processes is done with local unix sockets), there is no reason to keep gathering and reporting shim-specific network metrics. Actual network usage of the kata containers can be found from the existing agent network metrics (kata_guest_netdev_stat). Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-02-22 14:00:00 +01:00
Julien Ropé	9de65707ca	runtime: stop reporting net dev metrics for the shim As part of the shim network metrics, the shim is reporting network interfaces from the host with no namespace isolation - this gives insight in interfaces not tied to the kata containers, and causes an increase in resource usage for kata metrics. As the shim itself is not using the network (all its communication with other processes is done with local unix sockets), there is no reason to keep gathering and reporting shim-specific network metrics. Actual network usage of the kata containers can be found from the existing hypervisor network metrics (kata_hypervisor_netdev) and from the agent network metrics (kata_guest_netdev_stat). Fixes: #5738 Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-02-22 14:00:00 +01:00
ChengyuZhu6	8ab3894dc5	gha: try to cleanup nydus snapshotter before deploying it CI failed to deploy nydus snapshotter because it was not cleaned up last time. So we can try to cleanup nydus snapshotter before deploying it. Fixes: #9121 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-22 18:51:14 +08:00
Alex Lyn	5d3ae360ed	Merge pull request #9130 from Apokleos/bugfix-dragonball-invalidOperation runtime-rs: bugfix for GPU passthrough failed with InvalidOperation.	2024-02-22 17:47:09 +08:00
ChengyuZhu6	f16f709a5e	kata-deploy: Add pause image to confidential rootfs For confidential containers, the pause image needs to be installed in the rootfs. Fixes: #9118 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-22 15:41:16 +08:00
ChengyuZhu6	d8db3fb17f	gha: bump nydus snapshotter version to v0.13.8 Bump nydus snapshotter version to v0.13.8 to fix the bug in v0.13.7 : https://github.com/containerd/nydus-snapshotter/pull/582 Fixes: #9131 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-22 15:35:08 +08:00
Alex Lyn	014e0f4e46	runtime-rs: bugfix for GPU passthrough failed with InvalidOperation. We need to initialize pci_hotplug_enabled to true before we do GPU passthrough with runtime-rs/dragonball. Otherwise it fails with error `InvalidOperation`. Fixes: #9129 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-02-22 10:22:32 +08:00
Dan Mihai	58fbb9f6ec	Merge pull request #9073 from microsoft/danmihai1/test-genpolicy3 tests: k8s: generated policy for additional tests	2024-02-21 14:11:51 -08:00
Dan Mihai	b3c3f992ab	tests: k8s: common clean-up on teardown teardown() gets executed after each test case, so there is no need to clean-up before teardown. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:08 +00:00
Dan Mihai	9c164698d3	tests: k8s: k8s-optional-empty-configmap policy Auto-generate policy for k8s-optional-empty-configmap.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:08 +00:00
Dan Mihai	74a52c6d25	tests: k8s: k8s-oom.bats auto-generated policy Auto-generate policy for k8s-oom.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:08 +00:00
Dan Mihai	26a77d67f4	tests: k8s: k8s-number-cpus auto-generated policy Auto-generate policy for k8s-number-cpus. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:08 +00:00
Dan Mihai	9cbdce15fd	tests: k8s: k8s-memory.bats auto-generated policy Auto-generate policy for k8s-memory.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:08 +00:00
Dan Mihai	40209cc0b7	tests: k8s: k8s-limit-range auto-generated policy Auto-generate policy for k8s-limit-range.bats. Also, fix teardown() namespace. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:08 +00:00
Dan Mihai	df3c0318c6	tests: k8s: add set_namespace_to_policy_settings Add set_namespace_to_policy_settings() for changing the pod namespace in genpolicy settings. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:08 +00:00
Dan Mihai	6e14ce93c9	tests: k8s-kill-all-process-in-container policy Auto-generate policy for k8s-kill-all-process-in-container.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	fad7ba0aea	tests: k8s: k8s-job.bats auto-generated policy Auto-generate policy for k8s-job.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	41c2bcbdc5	tests: k8s: k8s-file-volume auto-generated policy Auto-generate policy for k8s-file-volume.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	d84f50db5b	genpolicy: fix typo in policy logging Improve logging, for easier debugging. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	81e641814f	tests: k8s: k8s-cpu-ns auto-generated policy Auto-generate policy for k8s-cpu-ns.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	bc6d3fc238	tests: k8s: k8s-env.bats auto-generated policy Auto-generate policy for k8s-env.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	0a4fc071ac	tests: k8s: k8s-custom-dns auto-generated policy Auto-generate policy for k8s-custom-dns.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	f693f49e92	tests: k8s: k8s-credentials-secrets policy Auto-generate policy for k8s-credentials-secrets.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	d3d27bbb5b	tests: k8s: k8s-configmap auto-generated policy Auto-generate policy for k8s-configmap.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	b318535536	tests: k8s: auto-generate k8s-caps.bats policy Auto-generated policy for k8s-caps.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Greg Kurz	600b951afd	agent: Run container workload in its own cgroup namespace When cgroup v2 is in use, a container should only see its part of the unified hierarchy in `/sys/fs/cgroup`, not the full hierarchy created at the OS level. Similarly, `/proc/self/cgroup` inside the container should display `0::/`, rather than a full path such as : 0::/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podde291f58_8f20_4d44_aa89_c9e538613d85.slice/crio-9e1823d09627f3c2d42f30d76f0d2933abdbc033a630aab732339c90334fbc5f.scope What is needed here is isolation from the OS. Do that by running the container in its own cgroup namespace. This matches what runc and other non VM based runtimes do. Fixes #9124 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-02-21 13:14:13 +01:00
Greg Kurz	14886c7b32	agent: lint code Run cargo-clippy to reduce noise in actual functional changes. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-02-21 13:14:13 +01:00
ChengyuZhu6	cddaf2ce97	kata-deploy: Remove specific kernel/initrd/image leftovers in Makefile Remove specific kernel/initrd/image leftovers in the local-build Makefile, which is part of #9026. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-21 18:24:10 +08:00
Chelsea Mafrica	241a56989a	Merge pull request #9090 from GabyCT/topic/pulldockerimage gha: docker: Pull docker image as part of the dependencies	2024-02-20 14:28:53 -08:00
GabyCT	ea78013c7e	Merge pull request #9079 from GabyCT/topic/removecilink docs: Update CI link into the README	2024-02-20 14:11:13 -06:00
GabyCT	64c09fe6c5	Merge pull request #9088 from GabyCT/topic/fixnydus gha: nydus: Fix indentation in gha run script	2024-02-20 14:09:54 -06:00
Gabriela Cervantes	ff8a6fa9ef	scripts: Add error script This PR adds the error script to display the error message with much more information to help debugging. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-20 18:30:03 +00:00
Gabriela Cervantes	43a46d5a6b	scripts: Add an enhanced die function This PR adds an enhanced die function in order to dump more information in a yaml format that will help with the debugging. Fixes #9105 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-20 18:27:44 +00:00
Archana Shinde	6d84fe3a37	Merge pull request #8647 from amshinde/cleanup-network Cleanup network to make sure physical interfaces are restored back to original host driver.	2024-02-20 08:59:53 -08:00
Archana Shinde	6d38fa1530	network: Try removing as many changes as possible during network cleanup In case an error is encountered while removing a network endpoint during network cleanup, we currently return immediately with the error. With this change, in case of error we simply log the error and proceed towards removing the next endpoint. With this, we can clean up the network changes made by the shim as much as possible. This is especially important when multiple interfaces are passed to the network namespace using a network plugin like multus. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-02-20 06:08:05 -08:00
Archana Shinde	b005cda689	network: Move up defer block to cleanup network Move the defer for cleaning up network before the call to add network. This way, any change made by add network is reverted in case of failure. This is particularly important for physical network interfaces, as with this step we make sure that the driver for the physical interface is reverted back to the original host driver. Without this, the physical network interface will remain bound to vfio. Fixes: #8646 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-02-20 06:06:42 -08:00
Ryan Savino	61ce7455c5	Merge pull request #9086 from niteeshkd/nd_snp_upm packaging: qemu-snp-experimental: support host kernel with gmem	2024-02-19 10:50:13 -06:00
Fabiano Fidêncio	79dc6e95d1	Merge pull request #9108 from fidencio/topic/ci-k8s-fix-wrong-logic-on-confidential-tests ci: k8s: Fix checks used to skip confidential tests	2024-02-19 12:49:57 +01:00
Xuewei Niu	f9307f6852	Merge pull request #9112 from ChengyuZhu6/vendor runtime: fix checksum mismatch error in `make vendor`	2024-02-19 10:54:38 +08:00
ChengyuZhu6	96c297cb37	runtime: fix checksum mismatch error in `make vendor` Fix checksum mismatch error in `make vendor`. Fixes: #9111 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-18 22:22:38 +08:00
Fabiano Fidêncio	3468ac3b6e	ci: k8s: Fix checks used to skip confidential tests This has been introduced by `53bc4a432b`, where the condition was changed. The correct condition is: * If the list of supported tees does not contain the kata hypervisor and the list of supported non tees does not contain the kata hypervisor. The error is that we were checking whether kata-hypervisor would contain the list of supported tees, and that would almost always be false (except in the case where the list had one and only one element). Fixes: #9055 -- part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-18 10:10:45 +01:00
Niteesh Dubey	0538bbfc49	packaging: qemu-snp-experimental: support host kernel with gmem This is required to allow creation of SNP coco on host kernel (e.g. https://github.com/AMDESE/linux ,branch:snp-host-latest) supporting guest private memory for SNP using gmem. Note: This qemu does not work if the host kernel does not support gmem/UPM. Fixes: #9092 Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-02-15 16:33:46 +00:00
Wainer Moschetta	db744aa8d2	Merge pull request #9023 from ldoktor/webhook-path tools.kata-webhook: Fix lib path	2024-02-15 12:34:01 -03:00
Fabiano Fidêncio	28b4e5ce51	Merge pull request #9099 from BbolroC/skip-k8s-sandbox-vcpus-allocation-s390x CI\|k8s: Skip vcpu allocation test for s390x	2024-02-15 16:05:18 +01:00
James O. D. Hunt	d1513b2030	Merge pull request #9091 from jodh-intel/packaging-add-kata-manager-script packaging: Add the kata manager script	2024-02-15 13:08:36 +00:00
Hyounggyu Choi	8b3f7f353d	CI\|k8s: Skip vcpu allocation test for s390x The test `vcpu allocation k8s test` exhibits different behavior on s390x. For more details, please refer to issue #9093. This commit skips the test until the issue is resolved on the platform. Fixes: #9093 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-15 12:26:35 +01:00
Fabiano Fidêncio	9178541dfb	Merge pull request #9098 from fidencio/topic/runtime-update-runc-to-v1.1.12 runtime: Update runc to v1.1.12	2024-02-15 09:29:10 +01:00
Fabiano Fidêncio	eea4277fbf	runtime: Update runc to v1.1.12 Although we don't seem to be affected by https://nvd.nist.gov/vuln/detail/CVE-2024-21626, we vendor and use the runc package in a few different places of our code, and we better update the package to its latest release. Fixes: #9097 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-14 23:13:39 +01:00
James O. D. Hunt	8c51e02f55	packaging: Add the kata manager script Add `kata-manager.sh` to the release packages. Fixes: #9066. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-02-14 17:44:42 +00:00
James O. D. Hunt	e49aeec97f	packaging: Use variable for default binary permissions Create a variable for the default binary permissions. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-02-14 17:44:35 +00:00
James O. D. Hunt	cc2d96671f	packaging: Remove extraneous whitespace Remove some unnecessary whitespace from a couple of `kata-deploy` files. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-02-14 17:44:08 +00:00
Fabiano Fidêncio	c95c37d2ab	Merge pull request #9026 from fidencio/topic/packaging-remove-tee-specific-leftovers packaging: Remove leftovers from the transition from TEE specific kernel / initrd / image to the "confidential" ones	2024-02-13 22:14:26 +01:00
GabyCT	9cf343779f	Merge pull request #9062 from GabyCT/topic/nonteet tests: Add ability to run non-TEE environments	2024-02-13 14:28:07 -06:00
Fabiano Fidêncio	74c8d243ea	versions: Remove TEE specific kernels We've switched to using the confidential one, instead. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 19:07:33 +01:00
Fabiano Fidêncio	adbe24c283	versions: Remove non-used tdx / sev image and initrd entries We've switched to using the confidential ones, instead. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 19:07:33 +01:00
Fabiano Fidêncio	6c3338271b	packaging: kernel: Remove sev/snp/tdx specific stuff Now we're using a "confidential" image that has support for all of those. Fixes: #9010 -- part II #8982 -- part II #8978 -- part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 19:07:33 +01:00
Gabriela Cervantes	598c77409a	gha: docker: Pull docker image as part of the dependencies This PR pulls the docker image needed for the test as part of the dependencies, to avoid timeout failures that occur mainly because the image was not properly downloaded and therefore cannot be found. Fixes #9089 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-13 17:48:31 +00:00
Gabriela Cervantes	53bc4a432b	tests: Add ability to run non-TEE environments This PR adds the ability to run k8s confidential tests in a non-TEE environment. Fixes #9055 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-13 17:27:55 +00:00
Fabiano Fidêncio	14f4480f12	packaging: Remove specific TEEs image / initrd leftovers Let's remove the targets as those are not built anymore as part of our CI. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 18:03:12 +01:00
Fabiano Fidêncio	0c761f14b3	packaging: Remove specific TEEs kernel leftovers Let's remove the targets as those are not built anymore as part of our CI. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 18:03:11 +01:00
Fabiano Fidêncio	28488f0790	Merge pull request #9082 from fidencio/topic/cleanup-kata-deploy-leftovers-before-start-a-test tests: Remove kata-deploy-tdx test and ensure kata-deploy is always cleaned up before starting the tests	2024-02-13 18:01:16 +01:00
Gabriela Cervantes	54d1f34650	gha: nydus: Fix indentation in gha run script This PR fixes the indentation in gha run script for nydus. Fixes #9087 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-13 16:53:28 +00:00
Fabiano Fidêncio	a867e19da1	gha: tdx: Stop running kata-deploy tests on TDX We only have one TDX machine, let's not make it busier than needed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 14:14:57 +01:00
Fabiano Fidêncio	3877a9f49a	ci: Clean up kata-deploy ds before starting the tests This will ensure no leftovers are in the node, which has been causing the TDX CI to fail every now and then. Fixes: #9081 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 14:10:44 +01:00
Fabiano Fidêncio	8fe7349d3e	Merge pull request #9080 from fidencio/topic/dont-add-the-pause-image-to-the-released-tarball release: Don't ship the pause-image / coco-guest-components as part of the release artefacts	2024-02-13 12:34:29 +01:00
Fabiano Fidêncio	443a5b8327	release: Don't ship the coco-guest-components In the same way that it doesn't make sense to ship the pause-image, it also doesn't make sense to ship the coco-guest-components itself as part of a release artefact. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 09:47:26 +01:00
Fabiano Fidêncio	0462b33a5b	release: Don't ship the pause-image It doesn't make sense to ship the pause-image itself as a release artefact. The reason we build it and cache it is in order to use it inside the rootfs, and that's it; there's no need to ship it as part of the release at all. Fixes: #9032 -- part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 09:45:50 +01:00
GabyCT	00be9ae872	Merge pull request #9070 from microsoft/danmihai1/debug-containers tests: k8s: avoid deleting unrelated pods	2024-02-12 15:24:15 -06:00
Gabriela Cervantes	69b325a31c	docs: Update CI link into the README This PR updates the CI link in the README, as we are currently using GHA workflows and they are now part of the kata containers repository. Fixes #9078 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-12 20:53:25 +00:00
Greg Kurz	532567bfe9	Merge pull request #8936 from fidencio/topic/fix-cri-o-ci tests: cri-o: Use packages from pkgs.k8s.io	2024-02-12 10:04:53 +01:00
Dan Mihai	42d13a0f33	Merge pull request #9068 from microsoft/danmihai1/dockerfile-linux-musl-gcc tools: avoid rootfs-image build "ln -s" error	2024-02-11 18:02:53 -08:00
Greg Kurz	d7afd31fd4	Merge pull request #8455 from BbolroC/runtime-rs-qemu-config runtime-rs: Add a new config option for QEMU	2024-02-10 08:48:23 +01:00
Dan Mihai	a21ca9b7c9	tests: k8s: avoid deleting unrelated pods Delete the debugger pod created during the test, rather than already existing debugger pods. Also, send the output of "kubectl delete" to stderr, just in case it's useful for debugging. Fixes: #9069 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-09 22:48:41 +00:00
Dan Mihai	a054462eb7	Merge pull request #9051 from microsoft/danmihai1/k8s-copy-file tests: k8s: k8s-copy-file auto-generated policy	2024-02-09 12:30:49 -08:00
Hyounggyu Choi	05c4c8055c	runtime-rs: Configure argument replacement for QEMU in Makefile Last but not least, all placeholders for argument replacement should be configured to generate a configuration file when `QEMUCMD` is defined. This enriches those variables. Additionally, this involves creating a symbolic link to `configuration-qemu.toml` if QEMU is defined as the default hypervisor. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-09 19:31:20 +01:00
Dan Mihai	fcd005774d	tools: avoid rootfs-image build "ln -s" error Avoid error when building for amd64 using: USE_CACHE=no AGENT_POLICY=yes DEBUG=1 \ tools/packaging/kata-deploy/local-build/kata-deploy-binaries.sh \ --build=rootfs-image Fixes: #9067 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-09 17:10:35 +00:00
GabyCT	b8f277676f	Merge pull request #9047 from GabyCT/topic/ukd docs: Remove jenkins reference in kernel documentation	2024-02-09 10:58:06 -06:00
Fabiano Fidêncio	e78a951e03	Merge pull request #8585 from ChengyuZhu6/dependencies-for-guest-pull gha: Setup nydus snapshotter for CoCo tests	2024-02-09 16:45:42 +01:00
Hyounggyu Choi	27cb30d8ce	runtime-rs: Adjust configuration template for runtime-rs There are some variables newly introduced to runtime-rs, such as: - runtime.name - runtime.hypervisor_name - runtime.agent_name - vm_rootfs_driver Additionally some of the placeholders for argument replacement are made hypervisor-specific based on the changes made for dragonball. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-09 16:26:59 +01:00
ChengyuZhu6	97fbf360cc	gha: Cleanup nydus snapshotter by the daemonset Cleanup nydus snapshotter by the daemonset. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-09 14:47:13 +01:00
ChengyuZhu6	43b04fd0c0	gha: Deploy nydus snapshotter by the daemonset We can use daemonset to deploy nydus snapshotter, which will decrease one manual step both for Kata Containers and Confidential Containers CI. Fixes: #8584 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-09 14:47:09 +01:00
Julien Ropé	236c2c7650	tests: cri-o: Update critools version to 1.29 This will also update the version of crio used in kata-monitor tests. Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-02-09 12:15:55 +01:00
Fabiano Fidêncio	344e0580ca	tests: cri-o: Use packages from pkgs.k8s.io CRI-O has moved, for a long time, towards pkgs.k8s.io, see: https://kubernetes.io/blog/2023/10/10/cri-o-community-package-infrastructure/ With this the OBS repo won't be used anymore. Fixes: #8935 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-09 12:15:55 +01:00
Fabiano Fidêncio	03f7cfd429	Merge pull request #9061 from GabyCT/topic/csk tests:k8s: make add_kernel_initrd_anotations function generic	2024-02-09 10:05:58 +01:00
Fabiano Fidêncio	555784268d	Merge pull request #9031 from ChengyuZhu6/guest-pull-rootfs packaging/osbuilder: allow to pull and unpack pause image	2024-02-08 22:21:44 +01:00
Gabriela Cervantes	0b508f301b	tests:k8s: make add_kernel_initrd_anotations function generic This PR makes the add_kernel_initrd_annotations_to_yaml function more generic so that it can later be used for other components. Fixes #9054 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-08 19:30:43 +00:00
Dan Mihai	f139c7dc60	tests: k8s: k8s-copy-file auto-generated policy Auto-generate policy for k8s-copy-file.bats. Fixes: #9050 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-08 13:26:05 +00:00
Dan Mihai	1179306afa	tests: k8s: additional policy testing utilities 1. add_requests_to_policy_settings allows one or more ttrpc requests from the Host to the Guest. Example: add_requests_to_policy_settings "${policy_settings_dir}" \ "ReadStreamRequest" "WriteStreamRequest" 2. add_copy_from_host_to_policy_settings allows executing on the Guest the commands initiated behind the scenes by "kubectl cp" from the Host to the Guest. Example: add_copy_from_host_to_policy_settings "${policy_settings_dir}" 3. add_copy_from_guest_to_policy_settings allows executing on the Guest the commands initiated behind the scenes by "kubectl cp" from the Guest to the Host. Example: add_copy_from_guest_to_policy_settings "${policy_settings_dir}" \ "/tmp/file.txt" Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-08 13:25:41 +00:00
Steve Horsman	b99f574522	Merge pull request #9037 from niteeshkd/nd_SevSnpGuest runtime: fix creation of SEV confidential container on SNP enabled host.	2024-02-08 09:29:20 +00:00
ChengyuZhu6	a43edd0c30	rootfs: Install pause image into rootfs Install the pause image into the confidential rootfs image and initrd. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-08 16:49:56 +08:00
Greg Kurz	6ead48ec06	Merge pull request #8986 from pmores/drop-shim-v2-address-value-validation runtime-rs: fix interoperability issues between runtime-rs and cri-o	2024-02-08 09:44:12 +01:00
ChengyuZhu6	42ef6bdcae	osbuilder:rootfs: support to unpack pause image to rootfs This env var lets us pass the pause image tarball to the rootfs builder, which will then just unpack the content into the rootfs. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>	2024-02-08 16:29:36 +08:00
ChengyuZhu6	53183cba31	workflow: Enable to build pause image in ci Enable building the pause image static tarball for confidential containers cases in the CI environment. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-08 11:23:23 +08:00
ChengyuZhu6	70a84eca9e	packaging: allow to pull and unpack pause image For the Confidential Containers stack, the pause image is managed by the host side, which may configure a malicious pause image, so we need to package a pause image inside the rootfs and not use the pause image from the host. But the installation of skopeo is not included in the 20.04 release, so we cannot directly install skopeo in the rootfs and pull the pause image. So I plan to make the task a static build, which would not be influenced by the system version in the rootfs. And the pause image will be part of the Kata Containers rootfs that's used by the Confidential Containers usecase. This commit enables the component to be built both locally and in our CI environment with the command: make pause-image-tarball. Fixes: #9032 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>	2024-02-08 11:23:23 +08:00
Dan Mihai	9a780aa98f	genpolicy: improve logging from ExecProcessRequest Additional logging from the ExecProcessRequest rules, for easier debugging. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-08 02:21:58 +00:00
Dan Mihai	dab567bdfa	genpolicy: add easy way to allow CloseStdinRequest For example, Kata CI's k8s-copy-file.bats transfers files between the Host and the Guest using "kubectl exec", and that results in CloseStdinRequest being called from the Host. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-08 02:21:58 +00:00
Dan Mihai	8401adb113	genpolicy: update default values 1. Remove PullImageRequest because that is not used in the main branch. It was used in the CCv0 branch. 2. Add default false values for the remaining Kata Agent ttrpc requests. These changes don't change the functionality of the auto generated Policy, but they make the Policy text and the logging from the Rego rules easier to understand. Fixes: #9049 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-08 02:21:58 +00:00
Dan Mihai	535db6b29c	Merge pull request #9043 from ChengyuZhu6/assert runtime-rs: fix assert error in `make check`	2024-02-07 18:19:18 -08:00
Dan Mihai	2bb91c9d8f	Merge pull request #8922 from microsoft/danmihai1/k8s-attach-handlers tests: k8s-attach-handlers auto-generated policy	2024-02-07 13:29:50 -08:00
Dan Mihai	01745689e1	Merge pull request #9029 from microsoft/danmihai1/k8s-empty-dirs genpolicy: mount source for non-confidential guest	2024-02-07 11:26:16 -08:00
Dan Mihai	6b5e57f7c7	tests: k8s: address PR review feedback 1. Rename install_kata_common to install_kata_core. 2. Add TODO for better way to install the Kata tools. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 18:51:56 +00:00
Steve Horsman	934d8dca0f	Merge pull request #9045 from ChengyuZhu6/nydus-version nydus: Bump nydus snapshotter version to v0.13.7	2024-02-07 17:20:21 +00:00
Pavel Mores	6346e04cf7	runtime-rs: fix handling of TTRPC_ADDRESS Since cri-o doesn't seem to use address for event publishing, as mentioned in the previous commit, it will not send it. However, the exact way of not sending it is unfortunately different from what is assumed by runtime-rs. Due to an implementation detail of cri-o, which uses containerd libraries for some low-level tasks, TTRPC_ADDRESS will not be missing from the environment as assumed; instead it will be present with an empty value. This commit contains a small adjustment to account for that and use LogForwarder even if TTRPC_ADDRESS is present, but with an empty value. Fixes #8985 Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-02-07 17:01:04 +01:00
Gabriela Cervantes	ff1ace1c74	docs: Remove jenkins reference in kernel documentation This PR removes the jenkins reference, which is no longer being used, from the kernel documentation. Fixes #9046 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-07 15:44:07 +00:00
ChengyuZhu6	d0b8e6d8f3	nydus: Bump nydus snapshotter version to v0.13.7 Bump nydus snapshotter version to v0.13.7. The new release name of nydus snapshotter is `nydus-snapshotter-v0.13.7-linux-amd64.tar.gz`, which differs from the version used by kata (`nydus-snapshotter-v0.12.0-x86_64.tgz`). Therefore, we need to update the script to obtain the correct nydus snapshotter name. Fixes: #9044 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-07 22:17:05 +08:00
ChengyuZhu6	34c47e08b2	runtime-rs: fix assert error in test in `make check` Fix assert error: error: used `assert_eq!` with a literal bool --> crates/hypervisor/src/ch/inner.rs:218:9 \| 218 \| assert_eq!(state.jailed, false); \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#bool_assert_comparison = note: `-D clippy::bool-assert-comparison` implied by `-D warnings` Fixes: #9042 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-07 19:31:10 +08:00
Archana Shinde	d9ce88ada3	Merge pull request #8704 from amshinde/runtime-rs-clh-implement-persist runtime-rs: implement persist api for cloud-hypervisor	2024-02-07 02:29:33 -08:00
Dan Mihai	dd16bc393f	tests: k8s: k8s-attach-handlers generated policy Automatically generate the test policy for k8s-attach-handlers.bats, if AUTO_GENERATE_POLICY is enabled. Steps: - Create a temporary directory for the current test and copy the common genpolicy settings into this new directory. - Change genpolicy settings in the temp directory to allow the "kubectl exec" command that this test needs. (For CoCo, exec is blocked by the default policy settings) - Auto-generate the policy for the test YAML file. - Test as usual, using the YAML file. - Clean-up the temporary settings described above. Fixes: #8921 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:26:03 +00:00
Dan Mihai	0de407f8b7	tests: k8s: enable AUTO_GENERATE_POLICY Enable AUTO_GENERATE_POLICY for one of the Kata CI K8s test platforms. Additional platforms will be enabled after testing them. When AUTO_GENERATE_POLICY is enabled, create genpolicy settings that are common for all tests. Some of the tests will make temporary copies of these common settings and customize them as needed. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:25:54 +00:00
Dan Mihai	05b2e4f606	tests: k8s: install genpolicy Install the genpolicy app before starting test execution. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:25:42 +00:00
Dan Mihai	8aa8b70573	tests: k8s: add policy test utilities Add script functions useful for auto-generating and testing policy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:24:06 +00:00
Dan Mihai	24a17a2e1b	tests: k8s: output the names of test files Output the names of test files, for easier search through logs. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:23:54 +00:00
Dan Mihai	bf533de31a	tests: k8s: add DEBUG support for test scripts Make these scripts easier to debug. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:23:46 +00:00
Dan Mihai	1b4ef672ef	tests: k8s: reduce namespace name duplication 1. Avoid repeating "kata-containers-k8s-tests". 2. Allow users to specify a different test namespace. 3. Introduce the TEST_CLUSTER_NAMESPACE variable, that will also be useful when auto-generating the Agent Policy for these tests. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:23:38 +00:00
Dan Mihai	8a5ba5fb34	tests: k8s: allow run_kubernetes_tests.sh exec Allow everyone to directly execute run_kubernetes_tests.sh, for easier local testing. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:23:30 +00:00
Fabiano Fidêncio	11ba90ebf2	Merge pull request #8958 from fidencio/topic/kata-manager-nerdctl-support kata-manager: Add support for nerdctl installation	2024-02-06 21:33:48 +01:00
GabyCT	d74b6e143f	Merge pull request #8951 from GabyCT/topic/udf metrics: Update packages for TensorFlow ResNet Int8 Dockerfile	2024-02-06 14:29:41 -06:00
GabyCT	6337f300a8	Merge pull request #8628 from GabyCT/topic/enablek8stclh tests: k8s: Enable tests for cloud hypervisor runtime-rs without devicemapper	2024-02-06 14:28:35 -06:00
Niteesh Dubey	3e383674f8	runtime: fix creation of SEV confidential container on SNP enabled host. This is needed to fix the bug which is not allowing to create SEV container on SNP enabled host anymore. This is a regression that was introduced as part of the following commit: `de39fb7d38` Fixes: #9036 Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-02-06 19:01:30 +00:00
Hyounggyu Choi	462afcf829	runtime-rs: Copy configuration for QEMU from runtime It makes sense to reuse a configuration template for runtime-golang as a base. This is simply to copy it into the config directory. Fixes: #8441 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-06 19:35:44 +01:00
Fabiano Fidêncio	058f068d67	Merge pull request #9020 from BbolroC/ok-to-test-static-checks-but-x86 gha: Run static-checks on self-hosted runners conditionally	2024-02-06 19:30:21 +01:00
Gabriela Cervantes	cf049fc718	k8s: Skip k8s tests that are not working This PR skips the k8s tests that are not working with cloud hypervisor runtime-rs with its proper issue. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-06 16:52:02 +00:00
Pavel Mores	f0256fded5	runtime-rs: remove validation of shim v2 -address value It appears that under the shim v2 protocol, a shim has no use of its own for the -address value, it just passes it back to container runtime's (mostly containerd or cri-o) event-publishing binary. Since the -address value only flows through the shim, being passed to the shim by a container runtime and then essentially passed back by shim to the container runtime, it seems inappropriate for a shim to validate the value that is fully owned and only used by the container runtime. This commit removes such validation from runtime-rs. Doing so, it solves (part of) an interoperability problem between runtime-rs and cri-o. cri-o seems to intentionally choose not to implement the event-publishing part of the shim v2 protocol and thus it has no value it could pass to runtime-rs for -address. As a result, it sends an empty string which has been failing the excessive validation performed by runtime-rs so far. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-02-06 13:43:09 +01:00
Wainer Moschetta	f1ca5d1563	Merge pull request #8953 from ChengyuZhu6/ci-guest-pull gha: Enable nydus snapshotter in CoCo ci tests	2024-02-06 09:36:59 -03:00
Fabiano Fidêncio	1ccb850ee7	Merge pull request #9027 from fidencio/topic/add-libattest-tdx-into-the-confidential-rootfs rootfs: Add libattest-tdx into the confidential rootfs	2024-02-06 12:52:13 +01:00
Fabiano Fidêncio	ce82b5e3f5	rootfs: Add libtdx-attest into the confidential rootfs This is required as the tdx-attest-rs crate, which is used as part of the guest components, has a runtime dependency on libattest-tdx. Fixes: #9021 -- part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-06 09:13:49 +01:00
Xuewei Niu	67d9847fac	Merge pull request #9025 from wainersm/cri-containerd_fix_loop cri-containerd: fix loop in TestContainerMemoryUpdate()	2024-02-06 14:49:57 +08:00
Amulya Meka	354a3093fa	Merge pull request #9019 from Amulyam24/k8s-fix gha: add GOPATH env var to the ppc64le k8s workflow	2024-02-06 11:01:49 +05:30
Alex Lyn	1ab9a21492	Merge pull request #8552 from deagon/fix/missing-port-type runtime: missing port type in the DeviceInfo	2024-02-06 10:56:46 +08:00
Dan Mihai	473efc2149	genpolicy: mount source for non-confidential guest The emergent Kata CI tests for Policy use confidential_guest = false in genpolicy-settings.json. That value is inconsistent with the following mount settings: "emptyDir": { "mount_type": "local", "mount_source": "^$(cpath)/$(sandbox-id)/local/", "mount_point": "^$(cpath)/$(sandbox-id)/local/", "driver": "local", "source": "local", "fstype": "local", "options": [ "mode=0777" ] }, We need to keep those settings for confidential_guest = true, and change confidential_guest = false to use: "emptyDir": { "mount_type": "local", "mount_source": "^$(cpath)/$(sandbox-id)/rootfs/local/", "mount_point": "^$(cpath)/$(sandbox-id)/local/", "driver": "local", "source": "local", "fstype": "local", "options": [ "mode=0777" ] }, The value of the mount_source field is different. This change unblocks testing using Kata CI's pod-empty-dir.yaml: genpolicy -u -y pod-empty-dir.yaml kubectl apply -f pod-empty-dir.yaml k get pod sharevol-kata NAME READY STATUS RESTARTS AGE sharevol-kata 1/1 Running 0 53s Fixes: #8887 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-06 01:19:48 +00:00
Fabiano Fidêncio	ffa190831d	Merge pull request #9022 from fidencio/topic/add-guest-components-to-the-confidential-image-and-initrd rootfs: confidential: Install coco-guest-components	2024-02-05 18:56:48 +01:00
Hyounggyu Choi	40b2b2a43a	gha: Run static-checks on self-hosted runners conditionally Due to the restrictions on instance provisioning for self-hosted runners, performing static checks (36 jobs at the time of writing) on them each time a PR is updated could significantly burden them, consequently slowing down the entire CI system. To address this, the decision is to trigger these checks only when an 'ok-to-test' label is added. Meanwhile, the checks for x86_64, which are supported by GitHub-hosted runners, will remain unchanged. Fixes: #8998 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-05 15:24:21 +01:00
Wainer dos Santos Moschetta	106e1af497	cri-containerd: fix loop in TestContainerMemoryUpdate() The loop that generates test cases for virtio-mem enabled/disabled doesn't return the integers '1' and '0' as expected. Instead it returns the strings '{1,' and '0}'. Fixes #9024 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-05 10:59:39 -03:00
Fabiano Fidêncio	27e7974048	rootfs: confidential: Install coco-guest-components Let's install the coco-guest-components into the confidential rootfs image and initrd. Fixes: #9021 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-05 14:41:29 +01:00
Fabiano Fidêncio	f80dbcee0e	rootfs: Add logging about the coco guest components This will make our lives easier to figure out whether the components are being installed or not. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-05 14:41:29 +01:00
Fabiano Fidêncio	68b8186ec4	osbuilder: Expose COCOGUEST_COMPONENTS_TARBALL We need to pass this to the container where the rootfs is built, so it can actually be unpacked inside the rootfs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-05 14:41:28 +01:00
Lukáš Doktor	3b0049b2a4	tools.kata-webhook: Fix lib path When moving the webhook we skipped the common.bash as (close-enough) version is already in `/tests` but we forgot to update the source path, fixing it here. Fixes: #8653 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-02-05 14:17:24 +01:00
Fabiano Fidêncio	64d09874c3	packaging: coco-guest-components: Pass DESTDIR to the build script As DESTDIR was not being passed, we've been installing the final binaries in a container path that was not exposed to the host, leading to creating an empty tarball with the guest components. Now, theoretically, guest-components should respect a PREFIX passed, but that's not the case and we're manually adding "/usr/local/bin" to the passed DESTDIR. Here's the result of the tarball: ```bash ⋊> kata-containers ≡ tar tf build/kata-static-coco-guest-components.tar.xz ./ ./usr/ ./usr/local/ ./usr/local/bin/ ./usr/local/bin/confidential-data-hub ./usr/local/bin/attestation-agent ./usr/local/bin/api-server-rest ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-05 14:07:10 +01:00
ChengyuZhu6	a214bd8d13	gha: Enable nydus snapshotter in CoCo ci tests This PR is a split of #8585. make the changes on the Github workflows, and the skeleton to deploy_snapshotter() and cleanup_snapshotter() in tests/integration/kubernetes/gha-run.sh in this commit. After initially merging this patch to trigger CI jobs for CoCo, which will begin executing the dummy functions deploy_snapshotter() and cleanup_snapshotter(), the implementation details for these functions remain in #8585. Our subsequent step involves transferring this logic to the PR #8484, enabling the PR to undergo CI testing prior to its merge. Fixes: #8997 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-05 18:51:59 +08:00
Fabiano Fidêncio	1362918ff0	Merge pull request #9011 from fidencio/topic/switch-to-using-the-confidential-rootfs runtime: Replace TEE specific initrd / image for the confidential one	2024-02-05 10:43:12 +01:00
Guoqiang Ding	6068faf40b	runtime: failed to run in the case of ColdPlugVFIO Add the missing port type in the DeviceInfo. Fixes: #9014 Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>	2024-02-05 17:30:11 +08:00
Fabiano Fidêncio	65013205ed	Merge pull request #9005 from ChengyuZhu6/clang static-checks: Install clang in the ci environments	2024-02-05 09:24:51 +01:00
Archana Shinde	b3c74411f6	runtime-rs: Add tests for persist api for clh Add tests to check clh struct is saved/restored correctly. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-02-04 22:03:57 -08:00
Archana Shinde	0b78296dca	runtime-rs: Store additional field for hypervisor state Implementing Persist API for cloud-hypervisor was done partially with initial support for cloud-hypervisor. Store and retrieve additional fields to/from the hypervisor state. Fixes: #6202 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-02-04 22:03:57 -08:00
Archana Shinde	a5f0b92bca	runtime-rs: Add guest protection to hypervisor state Store guest-protection used while storing the state of the hypervisor. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-02-04 22:03:54 -08:00
Alex Lyn	cf74166d75	Merge pull request #9015 from Apokleos/bugfix-exec-uds runtime: display accurate error msg to avoid misleading users.	2024-02-05 13:50:43 +08:00
Amulyam24	e59d005568	gha: add GOPATH env var to the ppc64le k8s workflow The filtering of testing cases installs/uses yq and expects GOPATH to be present. Hence, add it to the workflow. Fixes: #9018 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-02-05 10:30:10 +05:30
Alex Lyn	51a82bec3c	Merge pull request #9012 from deagon/fix/monitor-agent-url kata-monitor: fix agentUrl from containerd shim	2024-02-05 10:41:56 +08:00
ChengyuZhu6	f354beb253	static-checks: Install clang in the ci environments To test PR #8484, the compilation process for the kata-agent relies on clang. Failures have been encountered on the ARM, s390x, and ppc64le architectures: ppc64le: https://github.com/kata-containers/kata-containers/actions/runs/7754082828/job/21146689026?pr=8484 s390x: https://github.com/kata-containers/kata-containers/actions/runs/7754082828/job/21146689401?pr=8484 arm: https://github.com/kata-containers/kata-containers/actions/runs/7754082828/job/21146689026?pr=8484 Fixes: #9004 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-04 17:00:19 +08:00
Alex Lyn	c6830ceb89	runtime: display accurate error msg to avoid misleading users. The original handling method does not reach user expectations. When the ClientSocketAddress method stats the corresponding path of runtime-rs and has not found it yet, we should return an error message here that includes the reason for the failure (which should be an error display indicating that both runtime-go and runtime-rs were not found). Instead of simply displaying the corresponding path of runtime-rs as the final error message to users. It is also necessary to return the error promptly to the caller for further error handling. Fixes: #8999 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-02-04 16:45:59 +08:00
Xuewei Niu	fa01a86334	Merge pull request #9007 from wainersm/aks_delete_rg gha: delete azure RG only if it exists	2024-02-04 16:34:17 +08:00
Guoqiang Ding	7bf1ebe16d	kata-monitor: fix agentUrl from containerd shim Fix the missing leading slash. Fixes: #9013 Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>	2024-02-04 16:24:13 +08:00
Fabiano Fidêncio	d4a9856a84	gha: Remove SEV / SNP / TDX images / initrds We can remove this now that we're relying on the confidential one. Fixes: #9010 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-03 13:22:07 +01:00
Fabiano Fidêncio	e4258d8694	runtime: Use confidential image / initrd instead of TEE specific ones Now that we have a confidential image / initrd being built, instead of a specific one for each TEE, let's use it everywhere possible. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-03 13:20:14 +01:00
Fabiano Fidêncio	e0bb632053	Merge pull request #8983 from fidencio/topic/add-confidential-image packaging: Add confidential image / initrd	2024-02-03 12:30:16 +01:00
Fabiano Fidêncio	a9f8888c15	packaging: Add confidential image / initrd Let's use a single rootfs image / initrd for confidential workloads, instead of having those split for different TEEs. We can easily do this now as the soon-to-be-added guest-components can be built in a generic way. Fixes: #8982 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-03 00:58:52 +01:00
Fabiano Fidêncio	7ddb2e5999	Merge pull request #8978 from fidencio/topic/use-the-kernel-confidential-when-possible runtime: packaging: Use confidential kernel instead of the TDX one	2024-02-03 00:29:43 +01:00
Fabiano Fidêncio	e9de0ef6b3	packaging: rootfs: Depend on kernel-confidential tarball Now that we're using the kernel-confidential, let the rootfs depend on it, instead of depending on the TEE specific ones. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:13:41 +01:00
Fabiano Fidêncio	b58cfc765c	packaging: Ensure rootfs is rebuilt in case kernel changes We need to do this in order to ensure that measured boot takes the latest kernel bits, as needed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:13:06 +01:00
Fabiano Fidêncio	4394dacb88	packaging: Build the confidential kernel with MEASURED_ROOTFS support This is already done for the TDX kernel, and should have been done also for the confidential one. This action requires us to bump the kernel version as the resulting kernel will be different from the cached one. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:13:06 +01:00
Fabiano Fidêncio	c7680839f9	packaging: Fix modules tarball for nvidia-gpu-confidential The modules dir has an extra "-nvidia-gpu-confidential" string in its name. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:13:06 +01:00
Fabiano Fidêncio	dc027e39d6	gha: Remove TEE specific kernel build targets We're using the confidential kernel instead from now on. Fixes: #8981 -- part I Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:12:41 +01:00
Fabiano Fidêncio	3755c69165	runtime: makefile: remove SNP specific kernel references As this is not used anymore, we can go ahead and just remove it Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:12:21 +01:00
Fabiano Fidêncio	57b132f94c	runtime: makefile: remove SEV specific kernel references As this is not used anymore, we can go ahead and just remove it Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:12:21 +01:00
Fabiano Fidêncio	2562d23242	runtime: makefile: remove TDX specific kernel references As this is not used anymore, we can go ahead and just remove it. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:11:43 +01:00
Fabiano Fidêncio	f4e3c936d8	runtime: snp: config: Use the confidential kernel As we're building a single confidential kernel, we should rely on it rather than keep using the specific ones for TDX / SEV / SNP. However, for debuggability's sake, let's do this change TEE by TEE. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:11:36 +01:00
Fabiano Fidêncio	8731366d7b	runtime: sev: config: Use the confidential kernel As we're building a single confidential kernel, we should rely on it rather than keep using the specific ones for TDX / SEV / SNP. However, for debuggability's sake, let's do this change TEE by TEE. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:11:36 +01:00
Wainer dos Santos Moschetta	a04b215bcc	gha: delete azure RG only if it exists delete_cluster() has tried to delete the az resource group regardless of whether it exists. In some cases the result of that operation is ignored, i.e., a failure because the resource group was not found, but the log messages get a little dirty. Let's delete the RG only if it exists. Fixes #8989 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-02 16:57:20 -03:00
Gabriela Cervantes	eb5b7d3bf8	tests: k8s: Enable tests for cloud hypervisor runtime-rs This PR enables the k8s tests for cloud hypervisor runtime-rs. Fixes #8627 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-02 17:58:58 +00:00
Fabiano Fidêncio	6cbdba7268	runtime: tdx: config: Use the confidential kernel As we're building a single confidential kernel, we should rely on it rather than keep using the specific ones for TDX / SEV / SNP. However, for debuggability's sake, let's do this change TEE by TEE. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 17:13:06 +01:00
Fabiano Fidêncio	a618461d3a	runtime: Add confidential kernel to the makefile With this we can properly generate and use the `-confidential` kernel, which supports SEV / SNP / TDX, as part of our configuration files. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 17:13:05 +01:00
GabyCT	40d9a65601	Merge pull request #8996 from GabyCT/topic/addclhr gha: k8s: Add cloud-hypervisor (runtime-rs) support	2024-02-02 09:48:35 -06:00
Fabiano Fidêncio	741ed1c8bd	Merge pull request #9001 from fidencio/topic/fix-cache-for-confidential-kernel-part-III packaging: Don't build the confidential / sev kernel twice -- part III	2024-02-02 15:19:41 +01:00
Wainer Moschetta	424fbfe58f	Merge pull request #8654 from ldoktor/openshift-tests ci/openshift-ci: Move openshift-ci from the tests repo here	2024-02-02 10:40:30 -03:00
Fabiano Fidêncio	2ff3f0afc6	packaging: Remove trailing whitespace from extra_tarballs arg This was overlooked during the reviews. Fixes: #6415 -- part III Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 12:42:02 +01:00
Fabiano Fidêncio	228bc48c73	packaging: Fix kernel confidential name It should be "kernel-confidential" instead of "kernel". Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 12:42:02 +01:00
Fabiano Fidêncio	31b21093b0	packaging: Pass the kernel flavour to get_kernel_modules_dir I made this a required argument during the series and ended up forgetting to add that while calling the function. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 12:42:02 +01:00
Fabiano Fidêncio	51b1df2333	packaging: Fix typo to get the extra_tarballs path It should've been "${m#*:}" instead of "${m#&:}". Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 12:41:54 +01:00
Fabiano Fidêncio	53e8461db2	Merge pull request #9000 from fidencio/topic/fix-pushing-artefacts-to-registry packaging: Fix pushing artefacts to the registry	2024-02-02 10:21:40 +01:00
Fabiano Fidêncio	0b221b5618	packaging: Fix pushing artefacts to the registry This issue was introduced due to a typo not caught during reviews on `e5bca90274`. Fixes: #6415 -- part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 10:13:11 +01:00
Wenyuan Liu	cb888516c1	Merge pull request #8760 from fadecoder/reduce_go_runtime_mounts runtime: Reduce the mount points with namespace isolation	2024-02-02 16:54:44 +08:00
Greg Kurz	d1a26ead94	Merge pull request #8454 from BbolroC/compile-with-qemu-s390x runtime-rs: make compilation for QEMU on s390x	2024-02-02 09:29:32 +01:00
Fabiano Fidêncio	0520b272a3	Merge pull request #8987 from fidencio/topic/fix-cache-for-confidential-kernel packaging: cache: Fix caching kernels which rely on extra modules	2024-02-02 09:10:52 +01:00
Amulya Meka	e4252a3fe2	Merge pull request #8957 from Amulyam24/add-k8s-test-ppc64le gha: add kubernetes tests workflow for ppc64le	2024-02-02 10:22:00 +05:30
Fabiano Fidêncio	b2f1235e3c	Merge pull request #8994 from sprt/sprt/switch-aks-eastus ci: aks: switch from eastus2 to eastus region	2024-02-02 00:09:40 +01:00
Hyounggyu Choi	bb6f5073aa	runtime-rs: Allow compilation for s390x Until now, runtime-rs couldn't be compiled on s390x. We need to lift those restrictions in Makefile first. Fixes: #8446 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-01 23:48:15 +01:00
Dan Mihai	6f1062b5d6	Merge pull request #8966 from microsoft/danmihai1/k8s-sandbox-vcpus-allocation genpolicy: ignore empty YAML as input	2024-02-01 13:51:02 -08:00
Dan Mihai	8f9c92c0ee	Merge pull request #8977 from microsoft/danmihai1/default-namespace genpolicy: support non-default namespace name	2024-02-01 13:50:33 -08:00
Gabriela Cervantes	6771ca463b	gha: k8s: Add cloud-hypervisor (runtime-rs) support This PR adds the Cloud Hypervisor driver, integrated with the runtime-rs, as part of the kubernetes tests different with devmapper. Fixes #8995 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-01 21:22:56 +00:00
Aurélien Bombo	0ace31f041	ci: aks: switch from eastus2 to eastus region This addresses an internal AKS issue that intermittently prevents clusters from getting created. The fix has been rolled out to eastus but not yet eastus2, so we unblock the CI by switching. No downsides in general. This supersedes #8990. Fixes: #8989 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-02-01 19:22:42 +00:00
Hyounggyu Choi	8fcee6e6ec	runtime-rs: Use Persist::restore() of QEMU for VirtSandbox It fails to compile virt_container because Dragonball is only used in the implementation of the trait method Persist::restore(). As the hypervisor is not compiled on s390x and QEMU implements the trait method, this commit lets the method use QEMU's. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-01 18:02:10 +01:00
Hyounggyu Choi	56aef3741d	runtime-rs: Exclude hypervisors plugins except QEMU for s390x Dragonball and cloud-hypervisor are not supported on s390x. We need to exclude the plugins for these hypervisors from compilation. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-01 18:02:10 +01:00
Fabiano Fidêncio	5d2906c36a	packaging: Bump the kata config kernel version Just to make sure we won't use cached components. Fixes: #6415 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-01 16:57:15 +01:00
Fabiano Fidêncio	d2ea11dbff	packaging: Use the cached kernel modules Till now we didn't have a logic to consume the kernel modules cached tarball. Let's make sure those are consumed as it'll save us a reasonable amount of build time. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-01 16:57:15 +01:00
Fabiano Fidêncio	e5bca90274	packaging: Cache the kernel modules This will save us a lot of time, as right now the CI is rebuilding the kernel for absolutely no reason. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-01 16:55:21 +01:00
Fabiano Fidêncio	f481f58659	packaging: Create the tarball for the kernel modules Let's start doing this for the confidential kernels (and also for SEV, till it gets removed). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-01 16:55:20 +01:00
Fabiano Fidêncio	a58caca723	packaging: Take extra tarballs in install_cached_tarball_component() This allows us to add a map, in the format of: `"tarball1_name:tarball1_path tarball2_name:tarball2_path ..."` With this we have a base to start doing a better job when caching extra artefacts, like kernel modules. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-01 16:55:20 +01:00
Fabiano Fidêncio	33ac5468fe	packaging: Add function to get the kernel modules directory Right now this is just being added but not used yet. The idea is to use this to both cache and later on untar the kernel modules needed for some of the kernel targets we have (specifically looking at the confidential one). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-01 16:55:20 +01:00
Zhigang Wang	9317e23df1	mount: Reduce the mount points with namespace isolation This patch can reduce load on systemd process, and increase the k8s deployment density when using go runtime. Fixes: #8758 Signed-off-by: Zhigang Wang <wangzhigang17@huawei.com> Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2024-02-01 18:34:24 +08:00
Fabiano Fidêncio	ed6816e29f	kata-manager: Add support for nerdctl installation As already done for docker, let's also add support for installing nerdctl + kata containers. At least for now, we are explicitly not allowing the combination of installing both docker and nerdctl in the same installation in order to reduce the script complexity. Also, nerdctl installation, for now, is limited to x86_64 and aarch64 as those are the only architectures that nerdctl releases a "full" package for. Fixes: #8358 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-01 09:19:35 +01:00
Xuewei Niu	2332552c8f	Merge pull request #7483 from frezcirno/passfd_io_feature runtime-rs: improving io performance using dragonball's vsock fd passthrough	2024-02-01 14:53:53 +08:00
Amulyam24	f8585db8d9	gha: add kubernetes tests workflow for ppc64le This PR adds workflow for running kubernetes test suite on ppc64le. It uses scripts to create and delete the cluster using kubeadm as none of the current cluster creation tools are supported on Power. Fixes: #7950 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-02-01 12:23:11 +05:30
Alex Lyn	cf26c16017	Merge pull request #8931 from yaoyinnan/8930/feat/merge-ValidCgroupPath runtime: merged ValidCgroupPath method	2024-02-01 12:53:55 +08:00
Alex Lyn	a157fc3b74	Merge pull request #8974 from yaoyinnan/5240/fix/cgroup-parallel runtime: add SingleContainer when obtaining OCI Spec	2024-02-01 11:43:02 +08:00
Alex Lyn	1b8f3ce28a	Merge pull request #8929 from yaoyinnan/8838/fix/error-message runtime-rs: report error on missing or empty fields in configuration	2024-02-01 11:02:30 +08:00
Dan Mihai	09ea0eed9d	genpolicy: ignore empty YAML as input Kata CI's pod-sandbox-vcpus-allocation.yaml ends with "---", so the empty YAML document following that line should be ignored. To test this fix: genpolicy -u -y pod-sandbox-vcpus-allocation.yaml Fixes: #8895 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-01 02:22:21 +00:00
Dan Mihai	befef119ff	Merge pull request #8941 from malt3/genpolicy-flags genpolicy: allow separate paths for rules and settings files	2024-01-31 18:14:12 -08:00
GabyCT	6db1cd5f65	Merge pull request #8964 from GabyCT/topic/fixnerdcltt tests: Re-arranged nerdctl tests	2024-01-31 15:02:54 -06:00
Dan Mihai	21125baec3	Merge pull request #8962 from microsoft/danmihai1/config-map-optional2 genpolicy: ignore volume configMap optional field	2024-01-31 12:29:30 -08:00
Fabiano Fidêncio	39a64d1447	Merge pull request #8269 from wainersm/kata-deploy_deprecated kata-deploy: fix deprecations on kustomization files	2024-01-31 20:02:01 +01:00
Hyounggyu Choi	9c0312d466	Merge pull request #8956 from BbolroC/agent-build-fix-s390x-ppc64le packaging: Use Ubuntu 20.04 for building an agent	2024-01-31 18:23:16 +01:00
Greg Kurz	8b1dc06971	Merge pull request #8938 from pmores/log-qemus-stderr-in-shim-log runtime-rs: Log qemu's stderr in shim log	2024-01-31 18:04:28 +01:00
Dan Mihai	f0339a79a6	genpolicy: support non-default namespace name Allow users to specify in genpolicy-settings.json a default cluster namespace other than "default". For example, Kata CI uses as default namespace: "kata-containers-k8s-tests". Fixes: #8976 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-31 15:47:01 +00:00
Zixuan Tan	222de4f684	agent: Fix a race condition in passfd_io.rs There is a race condition in agent HVSOCK_STREAMS hashmap, where a stream may be taken before it is inserted into the hashmap. This patch adds simple retry logic to the stream consumer to alleviate this issue. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	6e4d4c329a	agent,runtime-rs: Add license header to passfd_io.rs Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	1206de2c23	agent: Use pipes as stdout/stderr of container process Linux forbids opening an existing socket through /proc/<pid>/fd/<fd>, making some images relying on the special file /dev/stdout(stderr), /proc/self/fd/1(2) fail to boot in passfd io mode, where the stdout/stderr of a container process is a vsock socket. For back compatibility, a pipe is introduced between the process and the socket, and its read end is set as stdout/stderr of the container process instead of the socket. The agent will do the forwarding between the pipe and the socket. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	f6710610d1	agent,runtime-rs,runk: fix fmt and clippy warnings Fix rustfmt and clippy warnings detected by CI. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	89be42a177	runtime-rs: open stdout and stderr fifos NONBLOCK This patch adds the O_NONBLOCK flag when opening the stdout and stderr FIFOs to avoid blocking. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	3eb4bed957	agent: use biased select to avoid data loss This patch uses a biased select to avoid stdin data loss in case of CloseStdinRequest. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	7874ef5fd2	agent: set stdout/err vsock stream as blocking before passing to child In passfd io mode, when not using a terminal, the stdout/stderr vsock streams are directly used as the stdout/stderr of the child process. These streams are non-blocking by default. The stdout/stderr of the process should be blocking, otherwise the process may encounter EAGAIN error when writing to stdout/stderr. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Fupan Li	cfb262d02f	container: keep the io connection when pass fd to hybrid vsock We want the io connection to stay connected when containerd closes the io pipe, so that the io stream can be attached to again. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-01-31 21:07:48 +08:00
Fupan Li	4a762fcfdd	dbs: hybrid stream support keep the connection when local closed Support the hybrid fd passthrough mode with a passed pipe fd, which can specify that the connection is kept even when the pipe peer is closed, so that the connection can be re-acquired by re-opening the pipe. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	5536743361	agent,runtime-rs: fix container io detach and attach Partially fix some issues related to container io detach and attach. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	657b17a86f	runtime-rs: open stdin fifo with RDWR\|NONBLOCK when pass vsock streams In linux, when a FIFO is opened and there are no writers, the reader will continuously receive the HUP event. This can be problematic when creating containers in detached mode, as the stdin FIFO writer is closed after the container is created, resulting in this situation. In passfd io mode, open stdin fifo with O_RDWR\|O_NONBLOCK to avoid the HUP event. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	f1b33fd2e0	agent: clean up term master fd when container exits When container exits, the agent should clean up the term master fd, otherwise the fd will be leaked. Fixes: kata-containers#6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	b8632b4034	dragonball: vsock: properly handle EPOLLHUP/EPOLLERR events When one end of the connection closes, the epoll event will be triggered forever. We should close and kill the connection. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	442df71fe5	agent,runtime-rs: refactor process io using vsock fd passthrough feature Currently in Kata Containers, every io read/write operation requires an RPC request from the runtime to the agent. This process involves data copying into/from an RPC request/response, which has high overhead. To solve this issue, this commit utilizes the vsock fd passthrough, a newly introduced feature in the Dragonball hypervisor. This feature allows other host programs to pass a file descriptor to the Dragonball process, directly as the backend of an ordinary hybrid vsock connection. The runtime-rs now utilizes this feature for container process io. It opens the stdin/stdout/stderr fifos from containerd and passes them to Dragonball, after which it no longer handles process io, eliminating the need for an RPC for each io read/write operation. In passfd io mode, the agent uses the vsock connections as the child process's stdin/stdout/stderr, eliminating the need for a pipe to pump data (in non-tty mode). Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	eb6bb6fe0d	config: add two options to control vsock passthrough io feature Two toml options, `use_passfd_io` and `passfd_listener_port` are introduced to enable and configure dragonball's vsock fd passthrough io feature. This commit is a preparation for vsock fd passthrough io feature. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	973b5ad1f4	runtime-rs: make Container::new async Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Xuewei Niu	5449173102	Merge pull request #8932 from kalil-pelissier/feature/issue-8586/fix-noop-method-call-warning dragonball: fix noop-method-call warning	2024-01-31 19:24:27 +08:00
Malte Poll	531a11159f	genpolicy: allow separate paths for rules and settings files Using custom input paths with -i is counter-intuitive. Simplify path handling with explicit flags for rules.rego and genpolicy-settings.json. Fixes: #8568 Signed-Off-By: Malte Poll <1780588+malt3@users.noreply.github.com>	2024-01-31 11:00:19 +01:00
Hyounggyu Choi	2e1d770fcf	packaging: Track files correctly when naming builder image for agent The necessary files for the agent builder image can be found in `tools/packaging/static-build/agent`, `ci/install_libseccomp.sh` and `tools/packaging/kata-deploy/local-build/kata-deploy-copy-libseccomp-installer.sh`. Identifying the correct files addresses the previously misreferenced path used to name the builder image. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-31 10:49:20 +01:00
yaoyinnan	9aa1ed805a	runtime: add SingleContainer when obtaining OCI Spec When creating a cgroup, add a SingleContainer when obtaining the OCI Spec to apply to ctr, podman, etc. Fixes: #5240 Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>	2024-01-31 15:24:07 +08:00
yaoyinnan	b0b8523cea	runtime: modify ValidCgroupPath unit test Modify ValidCgroupPath unit test. Fixes: #8930 Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>	2024-01-31 14:37:17 +08:00
yaoyinnan	feed5c8ff9	runtime: merged ValidCgroupPath method Merged ValidCgroupPath method to handle cgroupv1 and cgroupv2. Fixes: #8930 Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>	2024-01-31 14:37:13 +08:00
yaoyinnan	864389c524	runtime-rs: report error on missing or empty fields in configuration Removed the setting of default values for runtime fields. Added explicit checks for missing or empty fields, reporting errors with clear messages. Fixes: #8838 Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>	2024-01-31 12:46:17 +08:00
Wainer dos Santos Moschetta	abc2fcd88f	kata-deploy: fix deprecations on kustomization files Running `kustomize edit fix` on those files replaced the deprecated instructions ('bases' and 'patchesStrategicMerge') and added 'apiVersion' and 'kind'. Fixes #8268 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-01-30 18:41:03 -03:00
Lukáš Doktor	4876eadd2f	tools: Add reference to the kata webhook's README The newly added webhook is a new component and ought to be linked from the main README file. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-01-30 19:05:56 +01:00
Lukáš Doktor	b0b7748f30	ci/openshift-ci: Correct the lib location correct the lib file locations after the move from tests->kata-containers repo and add a minimized version of the ".ci/lib.sh" library into the "ci/openshift-ci" as we don't really utilize all of the features. Fixes: #8653 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-01-30 19:05:56 +01:00
Lukáš Doktor	4c58478536	ci/openshift-ci: Move openshift-ci from the tests repo Move the f15be37d9bef58a0128bcba006f8abb3ea13e8da version of scripts required for openshift-ci from "kata-containers/tests/.ci/openshift-ci" into "kata-containers/kata-containers/ci/openshift-ci" and required webhook+libs into "kata-containers/kata-containers/tools/testing" as is to simplify verification, the different location handling will be added in following commit. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-01-30 19:05:55 +01:00
Kvlil	3fd5628771	dragonball: fix noop-method-call warning The `noop-method-call` is a rustc lint that has existed since v1.52.0. This lint has been moved to the warn by default lint level since v1.73.0. Therefore build is failing with this version and above. This commit removes the unnecessary call to `<&T as Deref>::deref` on `T: !Deref`. Fixes: #8586 Signed-off-by: Kvlil <kalil.pelissier@gmail.com>	2024-01-30 17:16:49 +00:00
Wainer Moschetta	bf54a02e16	Merge pull request #8924 from microsoft/danmihai1/pod-nested-configmap-secret genpolicy: fix ConfigMap volume mount paths	2024-01-30 14:09:41 -03:00
Gabriela Cervantes	78b517ccc8	tests: Re-arranged nerdctl tests This PR re-arranges the nerdctl tests to avoid random failures. The tests now run first with RunC and then with the kata hypervisor. This is meant to avoid the random failures that happen with cloud-hypervisor and clh. Fixes #8963 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-30 16:07:12 +00:00
Dan Mihai	d12875ee66	genpolicy: ignore volume configMap optional field The auto-generated Policy already allows these volumes to be mounted, regardless if they are: - Present, or - Missing and optional Fixes: #8893 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-30 15:32:37 +00:00
Fabiano Fidêncio	7a83e6dc14	Merge pull request #8959 from fidencio/topic/crio-bump-runners-to-2204 gha: cri-o: Bump runners to 22.04	2024-01-30 14:27:40 +01:00
Fabiano Fidêncio	34d51b05f8	gha: cri-o: Bump runners to 22.04 This will not solve the CRI-O CI breakage but will give us an environment where we could get it to run locally. Fixes: #8935 -- part I Thanks to Julien Ropé for trying to reproduce the issues I faced on https://github.com/kata-containers/kata-containers/issues/8935 in an Ubuntu 22.04 system. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-30 14:17:06 +01:00
Xuewei Niu	7e10000b6f	Merge pull request #8928 from yaoyinnan/8927/fix/unused-DriverInfo runtime-rs: fix unused driverInfo error	2024-01-30 20:39:10 +08:00
Hyounggyu Choi	f3bc6e4155	packaging: Use Ubuntu 20.04 for building an agent This involves using Ubuntu 20.04 as a build environment for an agent to match with a runtime environment. Fixes: #8955 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-30 10:22:14 +01:00
Pavel Mores	d53edbd0a5	runtime-rs: collect qemu stderr and log it in shim log Qemu stderr monitoring runs in its own asynchronous green thread. For that, `stderr` is taken out of the Child representing the qemu child process to avoid partial move and make it possible for the main thread still to call functions on QemuInner::qemu_process (e.g. kill(), id()). Fixes #8937 Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-30 09:09:05 +01:00
Pavel Mores	684d740122	runtime-rs: switch qemu child process management from std to tokio We'll want to capture qemu's stderr in parallel with normal runtime-rs execution. Tokio's primitives make this much easier than std's. This also makes child process management more consistent across runtime-rs (i.e. virtiofsd child process is already launched and managed using tokio). Some changes were necessary due to tokio functions being slightly different from their std counterparts. Child::kill() is now async and Child::id() now returns an Option. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-30 09:07:14 +01:00
Dan Mihai	6a8f46f3b8	Merge pull request #8918 from microsoft/danmihai1/metadata genpolicy: optional PodTemplateSpec metadata field	2024-01-29 12:36:30 -08:00
Dan Mihai	60ac3048e9	genpolicy: fix ConfigMap volume mount paths Allow Kata CI's pod-nested-configmap-secret.yaml to work with genpolicy and current cbl-mariner images: 1. Ignore the optional type field of Secret input YAML files. It's possible that CoCo will need a more sophisticated Policy for Secrets, but this change at least unblocks CI testing for already-existing genpolicy features. 2. Adapt the value of the settings field below to fit current CI images for testing on cbl-mariner Hosts: "kata_config": { "confidential_guest": false }, Switching this value from true to false instructs genpolicy to expect ConfigMap volume mounts similar to: "configMap": { "mount_type": "bind", "mount_source": "$(sfprefix)", "mount_point": "^$(cpath)/watchable/$(bundle-id)-[a-z0-9]{16}-", "driver": "watchable-bind", "fstype": "bind", "options": [ "rbind", "rprivate", "ro" ] }, instead of: "confidential_configMap": { "mount_type": "bind", "mount_source": "$(sfprefix)", "mount_point": "$(sfprefix)", "driver": "local", "fstype": "bind", "options": [ "rbind", "rprivate", "ro" ] } }, This settings change unblocks CI testing for ConfigMaps. Simple sanity testing for these changes: genpolicy -u -y pod-nested-configmap-secret.yaml kubectl apply -f pod-nested-configmap-secret.yaml kubectl get pods \| grep config nested-configmap-secret-pod 1/1 Running 0 26s Fixes: #8892 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-29 16:13:47 +00:00
Gabriela Cervantes	31813cf8d8	metrics: Update packages for TensorFlow ResNet Int8 Dockerfile This PR updates the required packages for the TensorFlow ResNet50 Int8 Dockerfile. Fixes #8950 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-29 16:11:09 +00:00
Fabiano Fidêncio	087856f26c	Merge pull request #8934 from microsoft/danmihai1/nodeName genpolicy: ignore the nodeName field	2024-01-29 16:57:59 +01:00
Greg Kurz	d687b601f1	Merge pull request #8933 from fidencio/topic/package-coco-guest-components packaging: Build coco-guest-components	2024-01-29 16:34:06 +01:00
Zvonko Kaiser	a9348fa35b	Merge pull request #8375 from zvonkok/opa-binary-fix arm64: agent_policy build always pulls amd64 opa binary	2024-01-29 15:10:10 +01:00
Fabiano Fidêncio	5ea6a29c37	Merge pull request #8947 from fidencio/topic/gha-pass-down-AZ_SUBSCRIPTION_ID gha: azure: Set the correct subscription to the account	2024-01-29 15:07:06 +01:00
Fabiano Fidêncio	448c0aaecb	gha: azure: Set the correct subscription to the account Due to the changes done in the CI, we need to set the correct subscription to be used with the account from now on, otherwise we'd end up using CoCo subscription. Fixes: #8946 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-29 15:00:38 +01:00
Pavel Mores	b52a398469	runtime-rs: move creation of VM path from start_vm() to prepare_vm() This fixes a flaw pointed out in review of PR #8185. Creation of the directory semantically fits better into VM preparation than VM launch. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-27 13:46:35 +01:00
Fabiano Fidêncio	98dc2d4c52	rootfs: agent: Initialise AGENT_SOURCE_BIN & AGENT_TARBALL Otherwise those would be unbound if not passed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-26 19:58:41 +01:00
Fabiano Fidêncio	5e57e0235e	rootfs: agent: Fix build with AGENT_SOURCE_BIN We need to actually check that the env var is not empty. :-) This was introduced by `8307718842`. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-26 19:58:20 +01:00
Fabiano Fidêncio	fbfc880eb6	rootfs: Add COCO_GUEST_COMPONENTS_TARBALL env var This env var will serve us to pass the Confidential Containers guest-components tarball to the rootfs builder, which will then just unpack the content into the rootfs. Fixes: #8848 -- part I Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com> Co-authored-by: Alex Carter <alex.carter@ibm.com> Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com> Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-01-26 19:58:19 +01:00
Fabiano Fidêncio	644abde35c	packaging: coco-guest-components: Allow building the project The Confidential Containers guest-components will, in the very short future, be part of the Kata Containers rootfs that's used by the Confidential Containers usecase. This commit introduces the ability to, standalone, build the component locally and as part of our CI, and this can be done by calling: `make coco-guest-components-tarball` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com> Co-authored-by: Alex Carter <alex.carter@ibm.com> Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com> Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-01-26 19:36:01 +01:00
Hyounggyu Choi	ee072e8a06	Merge pull request #8926 from fidencio/topic/cache-the-agent-for-non-x86_64 gha: Cache the agent for non-x86_64 arches	2024-01-26 18:04:33 +01:00
Dan Mihai	076869aa39	genpolicy: ignore the nodeName field Validating the node name is currently outside the scope of the CoCo policy. This change unblocks testing using Kata CI's test-pod-file-volume.yaml and pv-pod.yaml. Fixes: #8888 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-26 16:30:55 +00:00
Dan Mihai	ef1ee81f81	Merge pull request #8909 from microsoft/danmihai1/main-shareProcessNamespace genpolicy: add shareProcessNamespace support	2024-01-26 05:49:19 -08:00
yaoyinnan	9b7c5c69cf	runtime-rs: fix unused driverInfo error Remove the unused DriverInfo declaration or integrate it into the codebase where applicable. Fixes: #8927 Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>	2024-01-26 19:59:52 +08:00
Greg Kurz	f41fa7557a	Merge pull request #8914 from BbolroC/basic-e2e-ibm-se tests: Add IBM SE to the basic confidential test	2024-01-26 12:32:32 +01:00
Fabiano Fidêncio	08a082ca47	gha: Cache the agent for non-x86_64 arches Those are not yet being cached for no reason, and they better be as it'll allow us to save a considerable amount of time building the rootfs. Fixes: #8917 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-26 12:02:26 +01:00
Fabiano Fidêncio	a7c68225aa	Merge pull request #8916 from fidencio/topic/packaging-reuse-already-built-agent packaging: Don't always build the kata-agent	2024-01-26 12:00:55 +01:00
Fabiano Fidêncio	95c569b0a6	packaging: Add safe.directory to the git config Otherwise building as root will not work, as demonstrated by the arm64 CI. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-26 09:44:43 +01:00
Hyounggyu Choi	ab462a4b89	tests: Add IBM SE to the basic confidential test The existing confidential basic test titled `Test unencrypted confidential container launch success and verify that we are running in a secure enclave` has been updated to incorporate IBM Secure Execution (`qemu-se`). Previously, a secure image was absent from kata-deploy, hindering the inclusion of IBM SE in the test. Thanks to the #6755 update, it is now possible to test the TEE. This modification extends the existing test by introducing `qemu-se`. The specific changes are outlined below: - Add an additional test `cc-se-e2e-tests` to s390x nightly - Expansion of `REMOTE_COMMAND_PER_HYPERVISOR` for `qemu-se` - Temporary exclusion of two test cases currently incompatible with IBM SE (`cpu-ns` is a common issue across all TEEs, while `inotify` will be addressed in a subsequent pull request). Fixes: #8913 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-26 06:04:39 +01:00
GabyCT	c13a63c8ba	Merge pull request #8905 from zvonkok/enable-tpm qemu: enable TPM	2024-01-25 14:52:00 -06:00
GabyCT	aa958adf90	Merge pull request #8904 from GabyCT/topic/buildbq tools: Use defined variable in build base qemu script	2024-01-25 13:51:44 -06:00
GabyCT	36fc2fd83f	Merge pull request #8876 from GabyCT/topic/dockerrestfp metrics: Update packages needed for ResNet50 FP32 Dockerfile	2024-01-25 13:51:16 -06:00
Dan Mihai	8ad5459beb	genpolicy: optional PodTemplateSpec metadata field Add metadata containing the Policy annotation if the user didn't provide any metadata in the input yaml file. For a simple sanity test using a Kata CI YAML file: genpolicy -u -y job.yaml kubectl apply -f job.yaml kubectl get pods \| grep job job-pi-test-64dxs 0/1 Completed 0 14s Fixes: #8891 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-25 19:06:59 +00:00
Fabiano Fidêncio	dd49479829	packaging: Don't build the agent if not needed Let's start relying on the already cached agent to be deployed inside the rootfs. By doing this we save a lot of time in our CI, and we have a better way, for developers, to play with changes in the agent. Fixes: #8915 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 19:41:33 +01:00
Fabiano Fidêncio	21fd7e6dfd	packaging: Fail in case oras can't find an artefact It just means the component is not cached, and that it must be built in the usual way. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 19:41:32 +01:00
Fabiano Fidêncio	eb7a33ee71	rootfs: Always strip the agent binary Let's always do this, regardless of where the agent is coming from. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 19:41:32 +01:00
Fabiano Fidêncio	f23451de01	rootfs: Add xz as a dep As we'll be untarring the agent tarball (and any other component that may be part of the rootfs) into the rootfs, we have to have xz installed. For debian and ubuntu the package is called xz-utils; for centos, alpine and cbl-mariner the package is called xz. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 19:41:32 +01:00
Fabiano Fidêncio	8307718842	rootfs: Add AGENT_TARBALL env var This env var will serve us to pass the agent tarball to the rootfs builder, which will then just unpack the content into the rootfs instead of building the agent again. AGENT_TARBALL and AGENT_SOURCE_BIN should never be used together. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 19:41:32 +01:00
Fabiano Fidêncio	5b0d0687e5	packaging: agent: Allow building in all arches We're moving away from alpine and using ubuntu in order to be able to build the agent for all the architectures we need. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 19:41:32 +01:00
Dan Mihai	535cf04edb	genpolicy: add shareProcessNamespace support Validate the sandbox_pidns field value for CreateSandbox and CreateContainer. Fixes: #8868 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-25 16:48:57 +00:00
Dan Mihai	1e24581c07	Merge pull request #8908 from microsoft/danmihai1/genpolicy-permissions tools: allow all users to execute genpolicy	2024-01-25 08:42:24 -08:00
Dan Mihai	295494c7dc	Merge pull request #8898 from microsoft/danmihai1/show-output-of-passing-tests tests: k8s: bats --show-output-of-passing-tests	2024-01-25 06:22:50 -08:00
Fabiano Fidêncio	1039641ab8	packaging: agent: Add the arch to the builder container This has been missed during reviews and is already a problem as we're trying to build the agent outside of the rootfs for other architectures than x86_64. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 14:11:14 +01:00
Fabiano Fidêncio	58874f9c3e	packaging: tools: Add the arch to the builder container This has been missed during reviews and will become a problem when the tools start to be built in different architectures. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 14:10:22 +01:00
Zvonko Kaiser	76efe25aed	Merge pull request #8901 from zvonkok/remove-gha-action gpu: remove GHA target first then remove the obsoleted Makefile targets	2024-01-25 13:40:03 +01:00
Chelsea Mafrica	24b33ae35b	Merge pull request #8884 from GabyCT/topic/ulib versions: Update libseccomp to version v2.5.5	2024-01-24 23:55:32 -08:00
Dan Mihai	723c76d945	tools: allow all users to execute genpolicy This tool can be useful for any users. Fixes: #8907 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-25 00:40:53 +00:00
Zvonko Kaiser	19ecdbca3b	qemu: enable TPM Several use-cases need a vTPM, so let's enable it for QEMU; a follow-up patch will introduce the runtime config. Fixes: #8902 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-01-24 17:49:08 +00:00
Gabriela Cervantes	98b5a19b3a	tools: Use defined variable in build base qemu script This PR uses a variable that is already defined in the build base qemu script to have uniformity across the script as this variable is already used in the script. Fixes #8903 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-24 17:05:17 +00:00
Zvonko Kaiser	4b8d79c1f6	gpu: remove GHA target first then remove the obsoleted Makefile targets Let's remove the GHA target actions first so that the tests of the follow-up PR #8874 succeed. Fixes: #8900 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-01-24 11:43:39 +00:00
Dan Mihai	66c012d052	tests: k8s: bats --show-output-of-passing-tests Add --show-output-of-passing-tests to the k8s integration tests. The output of a passing test can be helpful when investigating a failure of the same test. Fixes: #8885 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-24 03:04:28 +00:00
Hyounggyu Choi	f4290688bb	Merge pull request #7146 from BbolroC/ibm-se-howto-doc docs: provide a guide for how to use IBM Secure Execution	2024-01-23 22:48:05 +01:00
Hyounggyu Choi	25ecca91c6	docs: provide a guide for how to use IBM Secure Execution This PR is to add a document for how to run kata containers under IBM Secure Execution environment. Fixes: #7025 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-23 18:58:27 +01:00
Greg Kurz	0f67a26751	Merge pull request #8812 from kalil-pelissier/feature/issue-7720/drop-dead-code runtime: remove SharedVersions field dead code	2024-01-23 17:46:41 +01:00
Gabriela Cervantes	1b0d12ab78	versions: Update libseccomp to version v2.5.5 This PR updates the libseccomp version to v2.5.5 which includes the following changes: - Update the syscall table for Linux - Fix minor issues with binary tree testing and with empty binary trees Fixes #8883 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-23 16:31:25 +00:00
Zvonko Kaiser	ab597a4d5b	opa: Improve the download logic The versions.yaml has a default for the amd64 binary, but there is no code to actually build the arm64 binary, which seems to be an oversight. Let's simplify the OPA logic by removing the direct link to the binary, and construct that link as part of the checks we do to decide whether we need to build OPA or not. Fixes: #8373 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-23 09:16:16 +00:00
Greg Kurz	4516f38165	Merge pull request #8872 from zvonkok/nvidia-gpu-confidential gpu: Add NVIDIA GPU Confidential kernel target	2024-01-23 09:22:27 +01:00
Dan Mihai	3d2ec5c919	Merge pull request #8857 from microsoft/danmihai1/k8s-gha gha: get ready to install genpolicy	2024-01-22 08:29:24 -08:00
Gabriela Cervantes	eb7e123de8	metrics: Update packages needed for ResNet50 FP32 Dockerfile This PR updates the packages necessary to build the ResNet50 fp32 Dockerfile to run properly the benchmark. Fixes #8875 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-22 16:15:36 +00:00
Zvonko Kaiser	4fc34323ae	gpu: Add NVIDIA GPU Confidential kernel target This is a follow up to the work of minimizing targets, unifying TDX,SNP builds for NVIDIA GPUs Fixes: #8828 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-01-22 14:58:57 +00:00
Kvlil	a4b208a712	runtime: remove SharedVersions field dead code The SharedVersions field adds a versiontable property that isn't supported by upstream QEMU. This is dead code since virtcontainers isn't setting SharedVersions to true. Fixes: #7720 Signed-off-by: Kvlil <kalil.pelissier@gmail.com>	2024-01-22 12:18:42 +00:00
Dan Mihai	ea9c659d36	gha: get ready to install genpolicy The changes to install and test genpolicy must come later, after CI picks up these gha changes. Fixes: #8856 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-19 23:37:49 +00:00
GabyCT	bb1ada1a8b	Merge pull request #8855 from GabyCT/topic/updatefc versions: Update firecracker version	2024-01-19 16:25:50 -06:00
Fabiano Fidêncio	1e30fde8fa	Merge pull request #8862 from microsoft/danmihai1/genpolicy-dns genpolicy: ignore pod DNS settings	2024-01-19 23:08:26 +01:00
Dan Mihai	ca03d47634	genpolicy: ignore pod DNS settings Ignore pod DNS settings because policing the network traffic is currently outside the scope of the Agent Policy. Example from Kata CI: pod-custom-dns.yaml Fixes: #8832 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-19 16:42:35 +00:00
Alex.Lyn	826c751bf3	Merge pull request #8185 from pmores/add-qemu-cmdline-generation-framework Add qemu cmdline generation framework	2024-01-19 21:42:49 +08:00
Greg Kurz	b7d6b18768	Merge pull request #8485 from BbolroC/add-unit-test-s390x GHA: Enable static check for s390x, aarch64 and ppc64le	2024-01-19 11:49:16 +01:00
Pavel Mores	25c8d5db5d	runtime-rs: use qemu cmdline generation framework to launch VM Deploy the framework added by the previous commit to generate qemu command line and launch the VM. We now properly store the child process object which allows us to implement remaining Hypervisor functions necessary for a simple but successful VM lifecycle, get_vmm_master_tid() and stop_vm(). Fixes #8184 Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-19 11:42:23 +01:00
Gabriela Cervantes	0696807384	versions: Update firecracker version This PR updates the firecracker version to v1.6.0 which includes the following features - Added support for per net device metrics. In addition to aggregate metrics net, each individual net device will emit metrics under the label "net_{iface_id}". E.g. the associated metrics for the endpoint "/network-interfaces/eth0" will be available under "net_eth0" in the metrics json object. - Added support for per block device metrics. In addition to aggregate metrics block, each individual block device will emit metrics under the label "block_{drive_id}". E.g. the associated metrics for the endpoint "/drives/{drive_id}" will be available under "block_drive_id" in the metrics json object. - Added a new vm-state subcommand to info-vmstate command in the snapshot-editor tool to print MicrovmState of vmstate snapshot file in a readable format. Also made the vcpu-states subcommand available on x86_64. - Added source-level instrumentation based tracing. See tracing for more details. - Added developer preview only (NOT for production use) support for vhost-user block devices. Firecracker implements a vhost-user frontend. Users are free to choose from existing open source backend solutions or their own implementation. Known limitation: snapshotting is not currently supported for microVMs containing vhost-user block devices. See the related doc page for details. The device emits metrics under the label "vhost_user_{device}_{drive_id}". Fixes #8854 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-18 15:50:30 +00:00
Amulyam24	f6fea5f2ca	agent: fix failing unit tests on ppc64le - test_volume_capacity_stats: verify the file block size against the fetched size via statfs() - test_reseed_rng: Correct the request codes for RNDADDTOENTCNT and RNDRESEEDCRNG when platform is ppc64le - test list_routes: Add the route only if destination is not empty - test_new_fs_manager: skip the test if cgroups v2 is used by default - skip test cases rpc::tests::test_do_write_stream, sandbox::tests::test_find_process, sandbox::tests::test_find_container_process and sandbox::tests::add_and_get_container on ppc64le as they are flaky Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:32:16 +01:00
Hyounggyu Choi	610f878894	dragonball: Fix compile error for aarch64 This is to fix a compile error raised for aarch64. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-18 16:32:15 +01:00
Amulyam24	376941cf69	kata-ctl: skip building kata-ctl on ppc64le kata-ctl currently fails to build on ppc64le. Skip it when running static checks; the failures will be tracked and fixed in a separate issue. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:31:13 +01:00
Amulyam24	4ecd82a5df	runk: skip the test_init_container_create_launcher if not root on ppc64le This is to skip the test_init_container_create_launcher if not root on ppc64le. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:31:13 +01:00
Amulyam24	a4b5447924	tools: fix makefile spacing This minor PR removes the extra space in the makefiles. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:31:13 +01:00
Amulyam24	394777291d	runtime: fix failing unit tests on ppc64le A few CPU related test cases were failing as the version was being verified against Power8 while the CI machine is Power9. Fixes: #5531 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:31:13 +01:00
Amulyam24	486b8a0538	dragonball: skip running static-checks for ppc64le Since dragonball is not currently supported on ppc64le, skip running the targets for static-checks. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:31:13 +01:00
Amulyam24	14934c7b0d	github: run static checks on ppc64le This PR adds ppc64le runner to the static-checks workflow. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:31:13 +01:00
Hyounggyu Choi	8061a49ca5	kata-ctl: Clean up a test leftover file explicitly It was observed that a temporary file `/tmp/kata_hybrid_vsock02.hvsock` for test_setup_hvsock_failed() is not removed from time to time. This leads to a test failure for the same test next time due to the file permission on a self-hosted runner. This commit is to explicitly delete the file before the check starts. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-18 16:31:13 +01:00
Hyounggyu Choi	290ecf4c46	Static-check: Exclude s390x from dragonball and runtime-rs At the moment, the projects `dragonball` and `runtime-rs` do not support s390x. During the enablement, some errors due to the misconfiguration of Makefile for `make check` and `make vendor` were identified. This is to skip the build for the affected target of the projects. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-18 16:31:13 +01:00
Hyounggyu Choi	c0f57c9e0a	Lint: Fix `cargo clippy` errors for s390x Some linting errors were identified during the enablement of `make check`. These have not been found by the Jenkins CI job because `make test` was only triggered. The errors for the `agent` occurs under the s390x specific tests while the other ones for the `kata-ctl` are the architecture-specific code. This commit is to fix those errors. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-18 16:31:13 +01:00
Hyounggyu Choi	a1f288e5d3	CI: Use sudo if yq_path is not writable by USER If `yq_path` is set to `/usr/local/bin/yq`, there could be a situation where the `yq` cannot be installed without `sudo`. This commit handles the situation by putting `sudo` in front of `curl` and `chmod`, respectively. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-18 16:31:13 +01:00
Hyounggyu Choi	354cbede9c	GHA: Enable static check for s390x As part of the CI migration from Jenkins to GitHub Action, a CI job named `kata-containers-2.0-ubuntu-s390x-unit-PR` is covered by the static check. This commit is to enable the check for s390x by incorporating a runner `s390x` with the corresponding workflow. Fixes: #8482 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-18 16:31:13 +01:00
Jianyong Wu	ba74a624a8	runtime-rs: use pathBuf only for x86 PathBuf here is only used for x86. Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2024-01-18 16:31:13 +01:00
Jianyong Wu	a10779bf0b	GHA: enable static check on arm64 This is to add a runner for arm64 to the workflow. Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2024-01-18 16:31:11 +01:00
Dan Mihai	eeba459a6b	Merge pull request #8845 from microsoft/danmihai1/genpolicy-defaults tools: install genpolicy settings files	2024-01-17 15:08:49 -08:00
Chelsea Mafrica	32ad465663	Merge pull request #8710 from jodh-intel/runtime-rs-ch-get-thread-ids runtime-rs: ch: Implement minimal implementation for missing thread/pid APIs	2024-01-17 14:51:44 -08:00
Fabiano Fidêncio	147d5fd752	Merge pull request #8836 from microsoft/danmihai1/test-with-cbl-mariner genpolicy: use root path from cbl-mariner Guest VM	2024-01-17 17:51:44 +01:00
Pavel Mores	f550d9a325	runtime-rs: add basic implementation of qemu command line generation This current framework is enough to launch a VM with a simple container in it (e.g. busybox). Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-17 12:55:00 +01:00
Pavel Mores	e8e13044da	runtime-rs: add simple impls to some of Qemu's Hypervisor functions The idea of most of these is just to prevent running into todo!()s where we can at the moment, while implementing the fundamental functionality of VM launch. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-17 12:55:00 +01:00
Dan Mihai	febabef08c	tools: install genpolicy settings files Install the default genpolicy OPA rules and settings JSON files, in addition to the genpolicy binary. Fixes: #8844 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-16 23:59:59 +00:00
David Esparza	e11c520ffa	Merge pull request #8808 from kata-containers/memory_usage_test_skip_virtiofs_when_req tests: Ignore virtiofs contribution to memory usage when it is disabled.	2024-01-16 16:50:06 -06:00
Dan Mihai	69557e5ad6	Merge pull request #8814 from microsoft/danmihai1/genpolicy-kata-deploy tools: genpolicy static checks	2024-01-16 07:33:42 -08:00
Dan Mihai	13f2398fe8	Merge pull request #8837 from microsoft/danmihai1/allow_storages genpolicy: temporarily disable allow_storages()	2024-01-16 07:10:49 -08:00
Alex.Lyn	17719f1ac5	Merge pull request #8708 from Apokleos/directvol-bugfix-blk-pci runtime-rs: bugfix for DirectVolume/rawblock when driver is blk	2024-01-16 14:25:16 +08:00
alex.lyn	99717371c1	runtime-rs: bugfix for DirectVolume/rawblock when driver is blk DirectVolume/Rawblock doesn't work well when device's block driver is virtio-blk-pci and the storage handler is DRIVER_BLK_PCI_TYPE. Fixes: #8707 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-01-16 10:35:08 +08:00
Dan Mihai	205dafd323	genpolicy: temporarily disable allow_storages() Temporarily disable the allow_storages() rules, because they are based on the tarfs snapshotter + container image integrity information that are not available yet in the main branch - see #8833. Fixes: #8834 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-15 23:55:27 +00:00
Dan Mihai	f4106a6107	genpolicy: use root path from cbl-mariner Guest VM Adjust genpolicy-settings.json to match the container root path from the main branch + cbl-mariner Guest VMs. This configuration might have to be adjusted again when other types of Guest VMs will be tested during CI using genpolicy, in the future. Also, improve logging from allow_root_path(), to easier debug these issues in the future. Fixes: #8835 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-15 23:33:28 +00:00
GabyCT	37a4049d0f	Merge pull request #8830 from GabyCT/topic/removeprotocol metrics: Remove iperf3 server protocol	2024-01-15 14:44:39 -06:00
Dan Mihai	201eec628a	tools: genpolicy static checks Package genpolicy and enable static checks for it. Fixes: #8813 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-15 16:49:58 +00:00
David Esparza	4b772d2480	tests: Ignore virtiofs contribution to memory usage when it is disabled. This PR removes the references to virtiofs from memory average calculation when the container uses a shared file system other than virtiofs. Fixes: #8807 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-01-15 08:07:06 -08:00
Gabriela Cervantes	dff800a8ff	metrics: Remove iperf3 server protocol This PR removes the iperf3 server protocol as this server definition is also used for the UDP iperf3 benchmarks to avoid duplication of the same yaml files. Fixes #8829 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-15 15:44:24 +00:00
Fabiano Fidêncio	0dc00ae373	Merge pull request #8822 from microsoft/danmihai1/cargo-clippy genpolicy: cargo clippy fixes	2024-01-15 14:59:04 +01:00
Fabiano Fidêncio	73cf31bd9e	Merge pull request #8827 from microsoft/danmihai1/disable-k8s-oom tests: cbl-mariner: disable k8s-oom.bats	2024-01-15 14:40:16 +01:00
Xuewei Niu	923bd65dff	Merge pull request #8819 from justxuewei/rm-protocol-backend dragonball: Remove unused definition	2024-01-15 10:09:46 +08:00
Dan Mihai	b7c31e3b98	tests: cbl-mariner: disable k8s-oom.bats Disable k8s-oom.bats on cbl-mariner until it passes more often. Fixes: #8824 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-14 17:39:25 +00:00
Dan Mihai	681cb1626a	genpolicy: cargo clippy fixes Clean up cargo clippy errors. Fixes: #8818 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-14 01:23:46 +00:00
Dan Mihai	3af713acd4	Merge pull request #8817 from microsoft/danmihai1/cargo-fmt genpolicy: "cargo fmt -- --check" clean-up	2024-01-13 16:22:27 -08:00
Xuewei Niu	f1fda3d6b0	dragonball: Remove unused definition `EndpointProtocolFlags::ProtocolBackend` is removed due to no reference. Fixes: #8745 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-13 13:25:11 +08:00
Dan Mihai	dcaae54cf6	genpolicy: "cargo fmt -- --check" clean-up Also, update Cargo.lock Fixes: #8816 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-13 01:57:00 +00:00
GabyCT	a7114a35a8	Merge pull request #8792 from GabyCT/topic/updatenhwc metrics: Use a specific python version to run tensorflow benchmark	2024-01-12 11:24:54 -06:00
Alex.Lyn	ffcd95b6b4	Merge pull request #8737 from Apokleos/test-ci-dgb-cri-containerd ci: enable test dragonball stability and cri-containerd	2024-01-12 11:56:22 +08:00
Fabiano Fidêncio	a606401722	Merge pull request #8803 from jodh-intel/issues-8784-runtime-rs-ch-rm-todo-to-unbreak runtime-rs: ch: Unbreak CH driver	2024-01-11 19:37:13 -03:00
Gabriela Cervantes	12a41f89b1	metrics: Use a specific python version to run tensorflow benchmark This PR uses a specific python version to run tensorflow benchmark as it needs python 3.8 to run correctly and avoid failures. Fixes #8791 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-11 22:15:31 +00:00
GabyCT	2ffb161958	Merge pull request #8763 from stevenhorsman/fix-backport-check-hub Fix backport check hub	2024-01-11 15:15:12 -06:00
Fabiano Fidêncio	86a6d133e4	Merge pull request #8248 from microsoft/danmihai1/genpolicy-main tools: add policy generation tool	2024-01-11 17:02:54 -03:00
GabyCT	69be050ff9	Merge pull request #8657 from WenyuanLau/8656/Fix_StratoVirt_on_gha_metrics gha: Fix the failure of gha metrics for StratoVirt	2024-01-11 11:41:25 -06:00
James O. D. Hunt	29e0de4e4a	runtime-rs: ch: Implement minimal memory hotplug APIs Replace the `todo!()` calls with a minimal NOP implementation to return the CH driver to working order since the `todo!()`'s forcibly crash the driver at runtime. Full implementations for these APIs will be added on issues #8800, #8801, and #8802. Fixes: #8784. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-01-11 14:11:31 +00:00
James O. D. Hunt	1c0df670af	runtime-rs: ch: Add minimal implementation of hypervisor metrics method Remove the `todo!()` macro which would cause a runtime crash and replace with a implementation that returns an error as a stop-gap until #8800 is implemented. Fixes: #8785. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-01-11 14:11:01 +00:00
alex.lyn	b97efc3139	CI: enable test container memory update for dragonball Fixes: #8746 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-01-11 19:07:33 +08:00
alex.lyn	6c85e95c34	CI: bugfix for dragonball when CI running with cri-containerd Wrongly set containerd runtime options caused it to fail. Correct them as below: ... [plugins.cri.containerd.runtimes.${runtime}.options] ConfigPath= "${KATA_CONFIG_PATH}" ... Fixes: #8746 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-01-11 17:35:33 +08:00
alex.lyn	cd59d31a15	CI: make CI work for dragonball to test stability and cri-containerd It needs to remove the skip setting, and make it work for dragonball. Fixes: #8746 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-01-11 17:35:13 +08:00
Hyounggyu Choi	f62ec0a7f5	Merge pull request #8693 from BbolroC/ibm-se-config-validation-fix runtime: Allow no initrd path for IBM Z Secure Execution	2024-01-11 09:53:51 +01:00
Xuewei Niu	70305fefc5	Merge pull request #8780 from justxuewei/containerd-events runtime-rs: Forward events to containerd via ttrpc	2024-01-11 14:58:14 +08:00
Xuewei Niu	6fd49f7604	runtime-rs: Forward events to containerd via ttrpc It is a little bit heavy for runtime-rs to forward events via the containerd CLI, in contrast to the ttrpc way. Plus, for runtimes that don't have this mechanism, e.g. CRI-O, we can't get those events anywhere. This patch introduces two types of forwarders: - `ContainerdForwarder`: Acquire ttrpc address from environment variables and forward events via ttrpc connection. - `LogForwarder`: Write event info into logs. Fixes: #7881 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-11 10:32:50 +08:00
GabyCT	a8be3d0450	Merge pull request #8796 from GabyCT/topic/uruncv versions: Update runc version	2024-01-10 14:16:20 -06:00
Gabriela Cervantes	e69f7c07a7	versions: Update runc version This PR updates the runc version to 1.1.11 which includes the following improvements - Fix several issues with userns path handling. - Support memory.peak and memory.swap.peak in cgroups v2. Add swapOnlyUsage in MemoryStats. This field reports swap-only usage. For cgroupv1, Usage and Failcnt are set by subtracting memory usage from memory+swap usage. For cgroupv2, Usage, Limit, and MaxUsage are set. - build(deps): bump github.com/cyphar/filepath-securejoin. Fixes #8795 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-10 16:46:08 +00:00
Greg Kurz	0c37aec7dc	Merge pull request #8753 from fidencio/topic/add-confidential-artefacts TEEs: Introduce kernel-confidential	2024-01-10 16:59:57 +01:00
Alex.Lyn	695440a431	Merge pull request #8749 from Apokleos/fixup-dragonball-vfio runtime-rs: fixup vfio device in runtime-rs/dragonball	2024-01-10 15:20:34 +08:00
Dan Mihai	de61b4d4e2	Merge pull request #8772 from microsoft/danmihai1/wait-for-delete tests: list the current k8s pods	2024-01-09 13:45:55 -08:00
Fabiano Fidêncio	c3f6eaa267	build-kernel: Fix typo 'terball' -> 'tarball' SSIA. :-) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-09 14:35:45 -03:00
Fabiano Fidêncio	8b2f43a2c2	build: Add "confidential" kernel We're using a Kernel based on v6.7, which should include all the patches needed for SEV / SNP / TDX. By doing this, later on, we'll be able to stop building the specific kernel for each one of the targets we have for the TEEs. Let's note that we've introduced the "confidential" target for the kernel builder script, while the TEE specific builds are being kept as they are -- at least for now. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-09 14:35:45 -03:00
Jianyong Wu	379e2f3da2	kernel: update some configs based on kernel 6.5 and 6.6 There are lots of configs removed from latest kernel. Update them here for convenience of next kernel upgrade. Remove CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE [1] Remove CONFIG_IP_NF_TARGET_CLUSTERIP [2] Remove CONFIG_NET_SCH_CBQ [3] Remove CONFIG_AUTOFS4_FS [4] Remove CONFIG_EMBEDDED [5] [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=a7e4676e8e2cb158a4d24123de778087955e1b36 [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=9db5d918e2c07fa09fab18bc7addf3408da0c76f [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=051d442098421c28c7951625652f61b1e15c4bd5 [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=1f2190d6b7112d22d3f8dfeca16a2f6a2f51444e [5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=ef815d2cba782e96b9aad9483523d474ed41c62a Fixes: #8408 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2024-01-09 14:35:45 -03:00
Fabiano Fidêncio	cf4835e3ae	packaging: qemu: Simplify "--disable-virtiofsd" logic As all the supported architectures are disabling the virtiofsd build, there's no need to keep the switch statement there. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-09 14:35:45 -03:00
Fabiano Fidêncio	bfc6fc7a85	build: Get rid of QEMU experimental We've not been building QEMU experimental for a very long time, and the entry there has only been serving the purpose to clutter the versions.yaml (in the best case scenario) or even confuse new contributors to the project. Mind that the machinery to build the QEMU experimental is not touched, and that's used to build the TEEs capable artefacts. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-09 14:35:45 -03:00
GabyCT	4ac5f13722	Merge pull request #8789 from GabyCT/topic/installimagestress tests: Add check images as part of install dependencies	2024-01-09 09:28:13 -06:00
GabyCT	393edf380a	Merge pull request #8778 from GabyCT/topic/fixin packaging: Fix indentation of build static stratovirt	2024-01-09 09:27:52 -06:00
Greg Kurz	e3611cf27d	Merge pull request #8326 from cheriL/8325/fix_method_param agent: use method params instead of const params in functions	2024-01-09 07:35:19 +01:00
Gabriela Cervantes	24fab19f6f	tests: Remove check images function from stressng test This PR removes the check images function from stressng test as it will now be part of the install dependencies function from gha-run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-08 17:40:39 +00:00
Gabriela Cervantes	aceba94d95	tests: Add check images as part of install dependencies To avoid random failures while trying to build and install the stressng image, this PR moves that step as part of the install dependencies in order to move the stability tests and avoid timeouts. Fixes #8787 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-08 17:38:14 +00:00
Pavel Mores	0cfb2d2570	runtime-rs: add simple Persist implementation for Qemu This is not necessarily meant to work, just to stub out unimplemented functionality while focusing on more fundamental things. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-08 13:12:39 +01:00
Pavel Mores	45862aeec0	runtime-rs: add default rootfs type for qemu Make sure that rootfs type is known early on even if it's not set in configuration.toml. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-08 13:12:39 +01:00
Gabriela Cervantes	7d41c97f60	packaging: Fix indentation of build static stratovirt This PR fixes the indentation of the build static stratovirt script for kata containers. Fixes #8777 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-05 18:06:08 +00:00
Dan Mihai	90c782f928	tests: list the current k8s pods Log the list of the current pods between tests because these pods might be related to cluster nodes occasionally running out of memory. Fixes: #8769 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-05 16:41:43 +00:00
Xuewei Niu	192c6ee9c3	Merge pull request #8773 from justxuewei/dbs-k8s-fragile	2024-01-05 12:54:32 +08:00
Xuewei Niu	0e9d73fe30	agent: Fix an issue reporting OOM events by mistake The agent registers an event fd in `memory.oom_control`. An OOM event is forwarded to containerd when the event is emitted, regardless of the content in that file. I observed content indicating that events should not be forwarded, as shown below. When `oom_kill` is set to 0, it means no OOM has occurred. Therefore, it is important to check the content to avoid mistakenly forwarding OOM events. ``` oom_kill_disable 0 under_oom 0 oom_kill 0 ``` Fixes: #8715 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-05 11:06:37 +08:00
Dan Mihai	b18f269ccf	Merge pull request #8735 from microsoft/danmihai1/set-policy agent: hold lock while setting new policy	2024-01-04 13:28:21 -08:00
GabyCT	5ea07c2b3e	Merge pull request #8776 from GabyCT/topic/addextraqemu tests: Add hypervisor component to kill kata components function	2024-01-04 14:29:52 -06:00
Gabriela Cervantes	4ad1971a0a	tests: Add hypervisor component to kill kata components function This PR adds the qemu-experimental hypervisor in the function to kill kata components. Fixes #8775 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-04 17:07:12 +00:00
stevenhorsman	6bac3323be	workflows: Update backport-label to use gh-utils.sh - hub is deprecated, so use the new gh-utils.sh script that wraps the github cli instead Fixes: #8125 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-01-04 16:48:34 +00:00
stevenhorsman	0d5d1c8c36	ci: Add gh-util.sh script - The hub tool is now deprecated, so introduce a new alternative to `hub-util.sh` https://github.com/kata-containers/.github/blob/main/scripts/hub-util.sh that works with it. Initially I've only started with the couple of commands that we use regularly, but we can extend it in future. - Expects jq to be installed and `gh` to be installed and set up (see [1]) - Now we don't have lots of repos, I've moved it into `kata-containers` rather than `.github`, so it is more visible. Fixes: #8125 [1] https://docs.github.com/en/github-cli/github-cli/quickstart#prerequisites Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-01-04 16:48:34 +00:00
Dan Mihai	7d5336aca3	agent: hold lock while setting new policy Don't release the lock between is_allowed and set_policy calls, because the policy might change in between these calls. Also, move more policy code into policy.rs. Fixes: #8734 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-04 16:45:30 +00:00
GabyCT	f056ffe5ef	Merge pull request #8759 from fadecoder/update_docs_for_stratoVirt_VMM docs: Update docs for new StratoVirt VMM introduction	2024-01-04 10:39:37 -06:00
GabyCT	4f9ee7b31c	Merge pull request #8766 from GabyCT/topic/improvedeleteion metrics: Improve iperf3 cleanup	2024-01-04 10:38:33 -06:00
Xuewei Niu	b5a6e74cdf	Merge pull request #8744 from justxuewei/vhu-net-compile dragonball: Fix compilation issue without all net features	2024-01-04 19:02:55 +08:00
Xuewei Niu	db948f685d	Merge pull request #8757 from justxuewei/upgrade-containerd-shim-protos runtime-rs\|agent\|protocols\|agent-ctl: Bump ttrpc and containerd-shim-protos versions	2024-01-04 19:02:42 +08:00
soup	7c176a62fe	agent: use method params instead of const params in functions Fixes: #8325 Signed-off-by: soup <lqh348659137@outlook.com>	2024-01-04 09:29:29 +01:00
Xuewei Niu	f97f16a44a	agent-ctl: Bump ttrpc version - `ttrpc` from `0.7.1` to `0.8`. Fixes: #8757 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-04 15:58:34 +08:00
Xuewei Niu	bf59c7b3d4	runtime-rs: Bump ttrpc and containerd-shim-protos versions - `ttrpc` from `0.7.1` to `0.8`. - `containerd-shim-protos` from `0.3.0` to `0.6.0`. Fixes: #8756 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-04 15:58:34 +08:00
Xuewei Niu	cf9a0e21a1	protocols: Bump ttrpc version - `ttrpc` from `0.7.1` to `0.8`. Fixes: #8756 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-04 15:58:34 +08:00
Xuewei Niu	91360e7ddb	agent: Bump ttrpc version - `ttrpc` from `0.7.1` to `0.8`. Fixes: #8756 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-04 15:58:34 +08:00
Chao Wu	0f532175fe	Merge pull request #8771 from openanolis/chao/fix_ut dbs-pci: introduce Cargo.lock to prevent the influence from upstream	2024-01-04 15:14:22 +08:00
Zhigang Wang	44b5b88f4c	docs: Update docs for new StratoVirt VMM introduction As the StratoVirt VMM has been added, we can update the docs and add an introduction to StratoVirt, so that users can know more about the hypervisor choices. Fixes: #8645 Signed-off-by: Zhigang Wang <wangzhigang17@huawei.com> Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2024-01-04 14:26:48 +08:00
Chao Wu	f1235ddba3	dbs_virtio_devices: add Cargo.lock In order to avoid rust-vmm upstream change breaks Dragonball compilation, we introduce Cargo.lock to dbs crates. fixes: #8770 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2024-01-04 11:23:30 +08:00
Chao Wu	02cd726bfc	dbs-utils: add Cargo.lock In order to avoid rust-vmm upstream change breaks Dragonball compilation, we introduce Cargo.lock to dbs crates. fixes: #8770 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2024-01-04 11:17:45 +08:00
Chao Wu	97bdc1529b	dbs-pci: introduce Cargo.lock As reported in #8767, the root cause is that rust-vmm's vmm-sys-utils introduced a new release, 0.12.1, and dbs-pci relies on rust-vmm's vfio-ioctls, which uses >= to declare vmm-sys-utils, so it automatically upgrades vmm-sys-utils to 0.12.1. That's how two different versions of vmm-sys-utils are introduced, and this breaks the compilation. In order to fix this and also avoid future problems, we introduce a Cargo.lock file to the dbs crates. fixes: #8770 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2024-01-04 11:11:56 +08:00
Gabriela Cervantes	4bc67dba08	metrics: Improve iperf3 cleanup This PR improves the iperf3 cleanup to ensure all the components are being deleted properly to avoid the random failures of leaving the iperf3 clients on the kata metrics CI. Fixes #8765 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-03 17:14:38 +00:00
alex.lyn	d2080fd221	runtime-rs: refactor getting the vfio device guest pci path Fixes: #8748 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-01-02 14:28:34 +08:00
alex.lyn	d795fcfc2f	runtime-rs: bridge the vfio device between runtime-rs and dragonball Previously, Dragonball did not support PCI device hot-plugging or VFIO device passthrough. Therefore, the runtime-rs support for Dragonball was incomplete. it is time to complete it so that users can use Dragonball's PCI hot-plugging and VFIO passthrough capabilities. Fixes: #8748 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-01-02 14:28:10 +08:00
Chao Wu	67b91c1eb3	Merge pull request #8740 from openanolis/upstream/pci-6-final Dragonball: add pci vfio passthrough, hot(un)plug support	2023-12-29 01:58:32 +08:00
Chao Wu	71c322c293	runtime-rs: fix ci complains vfio commits introduce quite a lot change in runtime-rs, this commit is for all the changes related to ci, including compilation errors and so on. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-28 23:34:41 +08:00
Chao Wu	f9e0a4bd7e	upcall: introduce pci device add & del kernel patch add pci add and del guest kernel patch as the extension in the upcall device manager server side. also, dump config version to 120 since we need to add config for dragonball pci in upcall fixes: #8741 Signed-off-by: Gerry Liu <gerry@linux.alibaba.com> Signed-off-by: Helin Guo <helinguo@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-28 16:21:30 +08:00
Chao Wu	a3f7601f5a	dragonball: add pci hotplug / hot-unplug support Introduce two new vmm action to implement pci hotplug and pci hot-unplug: PrepareRemoveHostDevice and RemoveHostDevice. PrepareRemoveHostDevice is to call upcall to unregister the pci device in the guest kernel. RemoveHostDevice should be called after PrepareRemoveHostDevice, it is used to clean the PCI resource in the Dragonball side. fixes: #8741 Signed-off-by: Gerry Liu <gerry@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Signed-off-by: Helin Guo <helinguo@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-28 16:08:31 +08:00
Chao Wu	0f402a14f9	dragonball: add InsertHostDevice vmm action Introduce a new vmm action InsertHostDevice to passthrough host pci devices like NIC or GPU devices into guest so that users could have high performance usage of those devices. fixes: #8741 Signed-off-by: Gerry Liu <gerry@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Signed-off-by: Helin Guo <helinguo@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-28 16:04:22 +08:00
Xuewei Niu	4c023e341c	dragonball: Fix compilation issue without all net features Combinations of network features were tested: - None - virtio-net - vhost-net - vhost-user-net - virtio-net,vhost-net - vhost-net,vhost-user-net - virtio-net,vhost-user-net - virtio-net,vhost-net,vhost-user-net Fixes: #8742 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-28 11:37:26 +08:00
Alex.Lyn	990a3adf39	Merge pull request #8618 from Apokleos/csi-for-directvol runtime-rs: Add dedicated CSI driver for DirectVolume support in Kata	2023-12-27 21:27:29 +08:00
Chao Wu	cbd4481bc1	Merge pull request #7489 from Apokleos/pci_path runtime-rs: add pci topology for pci devices	2023-12-27 18:52:06 +08:00
alex.lyn	ea69c17008	runtime-rs: initialize pcie topology in Device Manager Add a pcie_topology field to DeviceManager and initialize pcie_topology when ResourceManager calls DeviceManager's new() with TopologyConfigInfo. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:57:23 +08:00
alex.lyn	b42548b8e1	runtime-rs: do unregister device in Trait Device/detach Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:53:18 +08:00
alex.lyn	0f0b6d13c9	runtime-rs: do register/update device in Trait Device/attach Before calling the device driver to attach a device, register the device to PCIe topology and allocate a PciPath for it. However, for some hypervisor such as CLH, the allocation is invalid when plugging devices to VM, they have the ability to return DeviceInfo containing PciPath. It'll update the PciPath with the returned pci path in the PCIe topology for them to prevent the inferred pcipath from being different from the actual value returned. But the update will not be executed if the pcipath value doesn't change. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:49:18 +08:00
alex.lyn	ce7d363695	runtime-rs: Introduce helper macros to simplify PCIe device ops Introduce helper macros to simplify PCIe device register/unregister and update, which provides a convenient way to handle devices in topology. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:43:58 +08:00
alex.lyn	0d4992b24d	runtime-rs: add one more argument in Device attach/detach Add one more argument with type &mut Option<&mut PCIeTopology> in attach and detach to introduce methods within PCIe Topology. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:40:01 +08:00
alex.lyn	b425de6105	runtime-rs: implement Trait PCIeDevice for pcie/pci device Implement Trait PCIeDevice register/unregister for pcie/pci device, such as vfio device which needs set/get device's pci path for kata agent's device handler. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:33:08 +08:00
alex.lyn	87e39cd1f6	runtime-rs: introduce Trait PCIeDevice to do [un]register device Introduce Trait PCIeDevice with register/unregister, which are used to register or unregister pcie device within the PCIe topology. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:29:35 +08:00
alex.lyn	6ebc4884fa	runtime-rs: introduce PCIe Topology framework for pcie/pci devices Due to the different ways that different VMMs handle PCI devices, we expect to provide a general PCIe topology processing framework that is as compatible as possible with VMMs such as dragonball, qemu, clh (though it has its own management method, there is no conflict). Currently, it's mainly developed for kinds of PCIe/PCI devices in dragonball/clh which are attached on the pci/pcie root bus directly. More will be added when Qemu is ready in runtime-rs. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:29:25 +08:00
alex.lyn	88839026b9	runtime-rs: introduce TopologyConfigInfo to initialize pcie topology A TopologyConfigInfo added to store device config info for PCIe/PCI devices in the VM from Hypervisor DeviceInfo. And TopologyConfigInfo::new will be the entry to initialize PCIe Topology for each VM. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:21:53 +08:00
Fabiano Fidêncio	35f88dfc93	Merge pull request #8733 from fidencio/topic/fix-shim-check-for-snapshotter-configration kata-deploy: Fix shim check for snapshotter configuration	2023-12-27 03:30:53 -03:00
Chao Wu	8895cb82df	Merge pull request #8724 from openanolis/chao/add_vfio dragonball: introduce vfio support	2023-12-27 11:40:53 +08:00
Xuewei Niu	43a627c96f	Merge pull request #8632 from adamqqqplay/support-vhost-user-blk dragonball: introduce vhost-user-blk device	2023-12-27 09:54:21 +08:00
Chao Wu	2f797a6eb7	pci: rename 2 parameters to follow rust naming convention PciCapabilityID -> PciCapabilityId PciBarRegionType::IORegion -> PciBarRegionType::IoRegion Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-26 23:28:47 +08:00
Chao Wu	9c13b2c990	dragonball: introduce vfio support vfio mod collects lots of information related to the vfio operations, including VfioMsi and VfioMsix capability & state, vfio interrupt info, pci region info and vfio pci device info & state. fixes: #8722 Signed-off-by: Gerry Liu <gerry@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Shifang Feng <fengshifang@linux.alibaba.com> Signed-off-by: Yang Su <yang.su@linux.alibaba.com> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Signed-off-by: Xin Lin <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-26 23:28:43 +08:00
alex.lyn	8779fe7dd5	runtime-rs: create a reference that directs users to kata csi doc Fixes: #8602 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-26 20:36:34 +08:00
alex.lyn	ba5437382a	runtime-rs: add examples about Kata pod with directvol by CSI. Fixes: #8602 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-26 20:36:34 +08:00
alex.lyn	c6d2a32146	runtime-rs: add support for directvol csi deploy scripts. Fixes: #8602 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-26 20:36:34 +08:00
alex.lyn	25d8e83e43	runtime-rs: Add dedicated CSI driver for DirectVolume support in Kata Bridge the gap between user requirements for direct block device access and the DirectVolume capabilities provided by Kata runtimes (kata-runtime/runtime-rs), and facilitate seamless integration with CSI to improve user experience. It aims to integrate DirectVolume CSI support into Kata, enabling users to benefit from its performance and flexibility advantages. Fixes: #8602 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-26 20:36:22 +08:00
Fabiano Fidêncio	6ee7fb5402	kata-deploy: Double quote the snapshotter name Otherwise `jq` will complain about: ```sh jq: error: nydus/0 is not defined at <top-level>, line 1: .plugins."io.containerd.grpc.v1.cri".containerd.runtimes."kata-clh".snapshotter=nydus jq: 1 compile error ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-26 09:14:36 -03:00
Qinqi Qu	81ab174c16	dragonball: support vhost-user-blk in device manager This patch introduces a feature of supporting vhost-user-blk device. Fixes: #8631 Signed-off-by: Qinqi Qu <quqinqi@linux.alibaba.com>	2023-12-26 20:02:38 +08:00
Qinqi Qu	ef8dc3b0ce	dragonball: support vhost-user-blk This patch introduces a feature of supporting vhost-user-blk device. This device needs to be defined before the VM instance is started, which can be done through the dbs-cli tool with --virblks option: --virblks '{ "drive_id": "8623", "device_type": "Spdk", "path_on_host": "spdk:///var/tmp/vhost.sock", "is_root_device": false, "is_read_only": false, "is_direct": false, "no_drop": false, "num_queues": 1, "queue_size": 256 }' Fixes: #8631 Signed-off-by: Eric Ren <renzhen@linux.alibaba.com> Signed-off-by: fupan <fupan.lfp@antgroup.com> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Qinqi Qu <quqinqi@linux.alibaba.com>	2023-12-26 20:02:32 +08:00
Fabiano Fidêncio	8332f3c684	kata-deploy: Fix the snapshotter config placement In the way the script is without this patch, we're trying to set ```toml [`$shim`] snapshotter = $snapshotter ``` However, what we actually want to set is the full runtime table instead of shim. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-26 08:26:38 -03:00
Fabiano Fidêncio	907f1ddb9e	kata-deploy: Fix shim check for snapshotter configuration We want to check whether the shim is part of the "plain text" shims passed to the daemonset (meaning, checking against `$SHIMS`). Before this fix we were checking against `$shims`, which is an array of shims instead of a string, resulting in a broken check. Fixes: #8732 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-26 07:42:36 -03:00
Tim Zhang	a4ad12a3d1	Merge pull request #8729 from liubin/fix/package-kata-monitor kata-monitor: fix Dockerfile to build image	2023-12-26 18:30:15 +08:00
alex.lyn	3b317e69e2	runtime-rs: add README and user guide to deploy directvol CSI Driver Fixes: #8602 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-26 18:00:35 +08:00
Bin Liu	23eb3042c7	kata-monitor: fix Dockerfile to build image move `SKIP_GO_VERSION_CHECK` after `make` command to skip checking golang version. And also upgrade golang to 1.19. Fixes: #8728 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-12-26 15:11:13 +08:00
Xuewei Niu	1065ca6fa7	Merge pull request #8626 from justxuewei/vhost-user-endpoint	2023-12-26 12:52:21 +08:00
Xuewei Niu	36a4cbccf6	runtime-rs: Expand all DeviceType in match arms The compiler will give a warning if a developer forgets to add an arm for a newly defined variant. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-26 10:18:59 +08:00
Xuewei Niu	f2d08bc00f	runtime-rs: Remove unused index from Endpoints The affected `Endpoint`s are `VhostUserEndpoint` and `TapEndpoint`. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-26 10:18:59 +08:00
Xuewei Niu	60a42351e2	runtime-rs: DAN supports vhost-user-net device DAN reads vhost-user-net device from JSON config. It only supports VMM running as server right now. Fixes: #8625 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-26 10:18:59 +08:00
Xuewei Niu	693a0cfbfd	dragonball: Make vhost-user-net ready for VhostUserEndpoint The changes involve: - Expose VhostUserConfig struct to runtime-rs. - Set a default value when num_queues or queue_size is 0. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-26 10:18:59 +08:00
Xuewei Niu	54df832407	runtime-rs: Support VhostUserEndpoint This commit introduces VhostUserEndpoint and adds the related vhost-user-net device support to the device manager. For now, Dragonball is able to attach vhost-user-net devices. Fixes: #8625 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-26 10:18:50 +08:00
Xuewei Niu	374c2f01aa	runtime-rs: Simplify VhostUserType enum Remove unused string parameter from each item. Fixes: #8625 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-25 16:15:57 +08:00
Xuewei Niu	38eb4077a6	Merge pull request #8503 from justxuewei/vhost-user-net dragonball: Support vhost-user-net device	2023-12-25 13:47:51 +08:00
Xuewei Niu	4c5de72863	dragonball: Wrap config space into `set_config_space` Config space of network device is shared and accords with the virtio 1.1 spec. It is a good way to abstract the common part into one function. `set_config_space()` implements this. Plus, this patch removes `vq_pairs` from vhost-net devices, since there is a possibility of data inconsistency. For example, some places read that from `self.vq_pairs`, others read from `queue_sizes.len() / 2`. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-25 10:47:34 +08:00
Alex.Lyn	3a3f39aa2d	Merge pull request #8668 from Apokleos/pci-path-refactor runtime-rs: Refactor the code related to PCI paths and VFIO device driver initialize in DM.	2023-12-23 21:44:07 +08:00
Steve Horsman	1afce09858	Merge pull request #8721 from stevenhorsman/kata-deploy-typos kata-deploy: snapshotter typo fixes	2023-12-22 21:26:03 +00:00
stevenhorsman	4a95c0d07f	kata-deploy: snapshotter typo fixes - Add spaces so that the if statements are valid Fixes: #8720 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-12-22 16:32:02 +00:00
Dan Mihai	080541a0f2	genpolicy: add SPDX license header Add SPDX license header to rules.rego. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Saul Paredes	7f126be67e	genpolicy: Update oci_distribution to 0.10.0 Also support alternative media type and update samples Signed-off-by: Saul Paredes <saulparedes@microsoft.com> Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	9eb6fd4c24	docs: add agent policy and genpolicy docs Add docs for the Agent Policy and for the genpolicy tool. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	57f93195ef	genpolicy: add support for StatefulSet YAML input Generate policy for K8s StatefulSet YAML. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	35958ec9cc	genpolicy: add support for ReplicationController Generate policy for K8s ReplicationController YAML. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	7da17099f2	genpolicy: add support for ReplicaSet YAML input Generate policy for K8s ReplicaSet YAML. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	d84300f1ee	genpolicy: add support for List YAML input Generate policy for K8s List YAML. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	a03452637b	genpolicy: add support for Job YAML input Generate policy for K8s Job YAML. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	2dbd01c80b	genpolicy: add support for Deployment YAML input Generate policy for K8s Deployment YAML. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	a40a6003d0	genpolicy: add support for DaemonSet YAML input Generate policy for K8s DaemonSet YAML. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	48829120b6	policy: initial genpolicy commit Add application that infers K8s user's intentions based on user's K8s YAML file, and generates a Rego/OPA based policy for that YAML. Just Pod YAML files are supported as input using this initial source code. Support for other types of YAML files will come with upcoming commits. Fixes: #7673 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Chao Wu	555136c1a5	Merge pull request #8662 from openanolis/pci/4-upstream dragonball: introduce pci msi/msix interrupt	2023-12-22 18:08:31 +08:00
Steve Horsman	c5f939cdc1	Merge pull request #8655 from fidencio/topic/kata-deploy-add-snapshotter-support kata-deploy: Allow setting up snapshotters per runtime handler	2023-12-22 09:16:07 +00:00
Chao Wu	8cf3bcefd8	dragonball: introduce pci msi/msix interrupt introduce msi/msix mod to maintain information for PCI Message Signalled Interrupt Extended Capability. It will be initialized when parsing pci configuration space and used when getting interrupt capabilities. fixes: #8661 Signed-off-by: Gerry Liu <gerry@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Shifang Feng <fengshifang@linux.alibaba.com> Signed-off-by: Yang Su <yang.su@linux.alibaba.com> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Signed-off-by: Xin Lin <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-22 16:28:22 +08:00
Xuewei Niu	beadce54c5	dragonball: Support vhost-user-net devices This PR introduces vhost-user-net devices to Dragonball. The devices are allowed to run as server on the VMM side. Fixes: #8502 Signed-off-by: Eric Ren <renzhen@linux.alibaba.com> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-22 14:53:18 +08:00
Xuewei Niu	1f21d3cb2c	dragonball: Introduce address space for MmioV2DeviceState Vhost-user-net has a dependency on address space from `MmioV2DeviceState`. The addition of the address space is introduced in this patch. Plus, it makes sure all unit tests have the according parameter as well. Fixes: #8502 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-22 14:53:18 +08:00
Fupan Li	dc9a0ac8ce	Merge pull request #8718 from justxuewei/enable-vhost tests: Load vhost modules explicitly while Kata installing	2023-12-22 14:52:49 +08:00
Xuewei Niu	206ed6d77d	tests: Load vhost modules explicitly while Kata installing The default network backend of runtime-rs with Dragonball is vhost-net after #8609 was merged. The tests might fail if vhost modules are not loaded. Fixes: #8717 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-22 11:07:37 +08:00
alex.lyn	94c83cea84	runtime-rs: Refactor vfio driver implementation It's important to ensure that the tasks which set up vfio devices are completed before add_device. So move the vfio device setup code to a dedicated method at device building time, which does not affect the behavior of other code. This change makes it easier to understand the difference between create and attach, and also makes the boundaries clearer. Fixes: #8665 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-22 10:37:40 +08:00
alex.lyn	82d3cfdeda	runtime-rs: Make VhostUserConfig's field pci_path type more specific Make VhostUserConfig pci_path's type more specific, change it from Option<String> to Option<PciPath>. Fixes: #8665 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-22 10:35:38 +08:00
alex.lyn	5cc2890a10	runtime-rs: refactor and re-implement pci path. Do refactor and re-implement to make the pci path more "rusty". Fixes: #8665 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-22 10:34:41 +08:00
Fabiano Fidêncio	32e1ba2525	Merge pull request #8714 from cmaf/libsh-update-loc tests: Use function from Kata repo	2023-12-21 12:30:31 -03:00
Fabiano Fidêncio	6cc6ca5a7f	kata-deploy: Allow setting up snapshotters per runtime handler Since containerd 1.7.0 we can easily set a specific snapshotter to be used with a runtime handler, and we should take advantage of this, mostly as it'll help setting up any runtime using devmapper or nydus snapshotters. This implementation here has a few caveats: * The format expected for the SNAPSHOTTER_HANDLER_MAPPING is: `shim:snapshotter,shim:snapshotter,...` * It only works with containerd 1.7 or newer * We never change the default containerd snapshotter * We don't do any check on our side to verify whether the snapshotter required is properly deployed * Users will have to add an annotation to their pods, in order to use the snapshotter set up per runtime handler * Example: ``` metadata: ... annotations: io.containerd.cri.runtime-handler: kata-fc ``` Fixes: #8615 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-21 07:20:10 -03:00
alex.lyn	1b5758c1f2	runtime-rs: Move the PciPath-related code to a dedicated file Move the pciPath code to a new file pci_path.rs and update the references. Fixes: #8665 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-21 11:35:18 +08:00
alex.lyn	275de453d5	runtime-rs: remove useless get_host_guest_map and its test case Fixes: #8665 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-21 11:07:56 +08:00
Chelsea Mafrica	9f394f6e18	tests: Use function from Kata repo Switch to use function from Kata repo in common.bash to reduce dependency on the tests repo. Fixes #8713 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-12-20 16:45:06 -08:00
Dan Mihai	d916da15dd	Merge pull request #8688 from microsoft/danmihai1/k8s-confidential tests: retry connection to pod SSH server	2023-12-20 15:01:26 -08:00
Fabiano Fidêncio	3482256340	Merge pull request #8709 from fidencio/topic/update-jq-for-kata-deploy kata-deploy: Update `jq` as part of the kata-deploy daemonset	2023-12-20 16:48:07 -03:00
James O. D. Hunt	7da6d0a845	runtime-rs: ch: Implement missing thread/pid APIs Add implementations for the following `Hypervisor` trait methods which simply return the same details as the `get_vmm_master_tid()` method: - `get_thread_ids()` - `get_pids()` Fixes: #6438. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-20 17:58:40 +00:00
Fabiano Fidêncio	c9e631dc0c	kata-deploy: Reapply "kata-deploy: Use tomlq to configure containerd" This reverts commit `ee5fa08a27`. This is perfectly fine to do as we narrowed down the issue to be on the version of `jq` provided by alpine, and we've already updated it in the previous commit (in this very same series). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-20 12:52:41 -03:00
Fabiano Fidêncio	41320c586e	kata-deploy: Install jq from GitHub `jq` coming from alpine is in its 1.6 version, and that has a bug that hits us quite hard, as it changes a float to an int whenever the number is in the `x.0` format. One example is: ```bash / # jq --version jq-1.6 / # echo '{"foo": 1.0}' \| jq .foo 1 ``` With this in mind, let's switch, at least for now, to using the `jq` released directly on github, as it does address the issue we've been hitting. ```bash ⋊> Downloads ./jq-linux-amd64 --version jq-1.7 ⋊> Downloads echo '{"foo": 1.0}' \| jq .foo 1.0 ``` Fixes: #8678 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-20 12:52:41 -03:00
Greg Kurz	ce094ecdc2	Merge pull request #8679 from stevenhorsman/kata-deploy-containerd-config-fix gha: kata-deploy: Revert containerd config break	2023-12-20 12:58:56 +01:00
stevenhorsman	ee5fa08a27	Revert "kata-deploy: Use tomlq to configure containerd" This reverts commit `dd9f5b07b9`. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-12-20 09:10:43 +00:00
stevenhorsman	9e718b4e23	gha: kata-deploy: Add containerd status check After kata-deploy has installed, check that the worker nodes are still in Ready state and don't have a containerd://Unknown container runtime version, indicating that container isn't working, to ensure that we didn't corrupt the containerd config during kata-deploy's edits. Fixes: #8678 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-12-20 09:10:43 +00:00
Archana Shinde	7e5868a55f	Merge pull request #8588 from amshinde/runtime-rs-update-readme runtime-rs: Update readme to indicate cloud-hypervisor support	2023-12-19 22:09:14 -08:00
Dan Mihai	8aa390279e	tests: retry connection to pod SSH server To become more resilient against these kinds of errors: deployment.apps/confidential-unencrypted created pod/confidential-unencrypted-c5fdd6964-rrb6q condition met ssh: connect to host 10.42.0.109 port 22: Connection refused Fixes: #8687 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-20 02:48:05 +00:00
GabyCT	5504176e9a	Merge pull request #8699 from GabyCT/topic/fixconfidentialscript tests: k8s: Fix indentation in confidential common script	2023-12-19 16:01:28 -06:00
Dan Mihai	6cea8a5f2a	Merge pull request #8697 from microsoft/danmihai1/runk tests: additional run-runk logging	2023-12-19 11:27:29 -08:00
Dan Mihai	551a50cd72	tests: additional run-runk logging Add logging to run-runk, for debugging possible failures. Fixes: #8696 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-19 14:08:01 +00:00
Hyounggyu Choi	540a2a7fb1	runtime: Allow no initrd path for IBM Z Secure Execution This is to reintroduce a configuration rule for IBM Z Secure Execution, where no initrd path should be configured. For the TEE of interest, only a kernel image should be specified with `confidential_guest=true`. Fixes: #8692 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-19 11:21:16 +01:00
Xuewei Niu	ec30d5a9a8	Merge pull request #8700 from justxuewei/dbs-ut dragonball: Trigger unit tests of dbs_* subcrates by `make test`	2023-12-19 17:51:20 +08:00
Xuewei Niu	039fe7f391	dragonball: Trigger unit tests of dbs_* subcrates by `make test` `make SUPPORT_VIRTUALIZATION=1 test` iterates through all subcrates and runs their tests. Plus, this patch fixes some issues about unit tests: - Feed too many parameters to `I8042Device::new()`. - Virtqueue checks have been introduced since `virtio-queue v0.7.0`. - GHA might have no access to `/var/tmp` dir on runner. Fixes: #8690 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-19 16:22:37 +08:00
Hyounggyu Choi	ceea8882db	Merge pull request #8672 from BbolroC/introduce-vsock-device-init runtime-rs: Separate init_config() from new() for struct VsockDevice	2023-12-18 22:04:37 +01:00
Gabriela Cervantes	1469a5efca	tests: k8s: Fix indentation in confidential common script This PR fixes the indentation of the confidential common script for kubernetes tests. Fixes #8698 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-18 20:25:06 +00:00
Chelsea Mafrica	312475508a	Merge pull request #8682 from cmaf/static-checks-update-loc ci: Use static checks from kata repo for lib functions	2023-12-18 09:53:01 -08:00
Hyounggyu Choi	3cd0cc1388	runtime-rs: Separate init_config() from new() for struct VsockDevice As a follow-up for #8516, guest_cid and vhost_fd are not necessarily initialised via new(). Instead, the fields should be initialised later when they are really used to construct hypervisor's parameters. This commit is to separate init_config() from new() to initialise guest_cid and vhost_fd and leave only the assignment of id for the existing function. Fixes: #8671 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-18 16:36:09 +01:00
Greg Kurz	2987d3eeb5	Merge pull request #8341 from jongwu/fix_cpushares agent: correct CPUShares and CPUWeight value	2023-12-18 15:40:04 +01:00
James O. D. Hunt	3c49120d2f	Merge pull request #8641 from jodh-intel/kata-ctl-add-cfg-file-cli-option kata-ctl: Add option to dump config files	2023-12-18 11:54:19 +00:00
Greg Kurz	1cfcc80018	Merge pull request #8664 from amshinde/remove-ignore-paths-ga github-actions: Remove ignore paths for required CI checks	2023-12-18 12:49:21 +01:00
Chelsea Mafrica	b785ef96ec	docs: Change location of static checks script We now use the static checks script from the main kata containers repo and not the tests repo; update documentation to reflect this. Fixes #8681 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-12-15 17:13:02 -08:00
Chelsea Mafrica	bfb756199f	ci: Use static checks from kata repo for lib functions Change the two functions in lib.sh to use the static checks script from the kata containers repo instead of tests. Remove cloning the repo from these functions since we don't need it anymore. Leave these two functions because the document checking one may be used locally and the static checks one is called from the virtcontainers Makefile. Fixes #8681 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-12-15 17:08:33 -08:00
Archana Shinde	510bc36a77	github-actions: Remove ignore paths for required CI checks If a PR contains files from the ignore-paths, these actions do not run as intended. However, the actions are marked as required. And there does not seem to be a way to mark these as non-required in that case. As a result a PR containing the files from the ignore-paths remains stalled. Hence remove the ignore-paths until github provides a way to mark actions that are skipped due to ignore-paths as non-required/passed. Fixes: #8663 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-12-15 15:12:20 -08:00
Liu Wenyuan	61fe20cf9a	gha: Fix some of gha metrics failure for StratoVirt Update the Speed & Density metric tests baseline for StratoVirt and re-enable them, and skip other metric tests temporarily. Fixes: #8656 Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2023-12-15 17:45:01 +08:00
Zhongtao Hu	0f80dc636c	Merge pull request #6876 from openanolis/memory_hotlug runtime-rs: support Memory hotplug	2023-12-15 14:28:35 +08:00
Zhongtao Hu	9a37e77f2a	runtime-rs: check the update memory size check the update memory size greater than default max memory size Fixes:#6875 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-12-15 11:25:34 +08:00
Zhongtao Hu	6039417104	runtime-rs: add default_maxmemory in config file add default_maxmemory in config file Fixes:#6875 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-12-15 10:25:20 +08:00
Zhongtao Hu	8d9fd9c067	runtime-rs: support memory resize Fixes:#6875 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-12-15 10:25:13 +08:00
Zhongtao Hu	81e55c424a	runtime-rs: add resize_memory trait for hypervisor Fixes: #6875 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-12-15 10:25:03 +08:00
Zhongtao Hu	d428a3f9b9	runtime-rs: get guest memory details get memory block size and guest mem hotplug probe Fixes:#6356 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-12-15 10:22:37 +08:00
GabyCT	4a49dd73db	Merge pull request #8676 from GabyCT/topic/fixins tests: k8s: Fix indentation in setup script	2023-12-14 13:57:47 -06:00
GabyCT	7a606a19c4	Merge pull request #8659 from GabyCT/topic/improvecleanuplatency metrics: Improve latency network cleanup	2023-12-14 13:57:28 -06:00
GabyCT	0831529279	Merge pull request #8644 from GabyCT/topic/updadockerresint metrics: Update TensorFlow ResNet50 Int8 Dockerfile	2023-12-14 13:56:41 -06:00
Jianyong Wu	58e88d9469	agent: correct CPUShares and CPUWeight value If cgroup driver is systemd, CPUShares, for cgroup v1, should be at least 2 [1] and CPUWeight for cgroup v2, should be at least 1 [2]. Fixes: #8340 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> [1] `d19434fbf8/src/basic/cgroup-util.h (L122)` [2] `d19434fbf8/src/basic/cgroup-util.h (L91)`	2023-12-15 02:04:31 +08:00
Steve Horsman	04de6eb4fd	Merge pull request #8674 from ChengyuZhu6/fix_statis_check static-checks: Add some dependencies to static checks for CoCo features	2023-12-14 16:47:01 +00:00
Greg Kurz	1bd9c1b4de	Merge pull request #8589 from wvell/patch-1 Remove warning for cgroupsv2 only operating systems	2023-12-14 17:37:59 +01:00
Gabriela Cervantes	c92b14da97	tests: k8s: Fix indentation in setup script This PR fixes the indentation of the kubernetes setup script. Fixes #8675 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-14 16:26:22 +00:00
Amulya Meka	ac7b3d4735	Merge pull request #8667 from Amulyam24/workflow gha: add a post cleanup script for cri-containerd ppc64le workflow	2023-12-14 21:52:54 +05:30
Alex.Lyn	c7c7632203	Merge pull request #8620 from Apokleos/enhance-directv-using-csi runtime-rs: Enhancement of DirectVolume when using a dedicated CSI	2023-12-14 22:59:09 +08:00
ChengyuZhu6	dfad0e6622	.github: fix the failure without devicemapper for host sharing fix error when running checks and tests: error: failed to run custom build command for `devicemapper-sys v0.1.5` fatal error: 'libdevmapper.h' file not found thread 'main' panicked at 'Could not generate dm.h bindings: ClangDiagnostic("dm.h:2:10: fatal error: 'libdevmapper.h' file not found\n")', /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/devicemapper-sys-0.1.5/build.rs:24:10 stack backtrace: 0: rust_begin_unwind at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/panicking.rs:593:5 1: core::panicking::panic_fmt at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/panicking.rs:67:14 2: core::result::unwrap_failed at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/result.rs:1651:5 3: core::result::Result<T,E>::expect 4: build_script_build::main 5: core::ops::function::FnOnce::call_once note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace. warning: build failed, waiting for other jobs to finish... make: *** [../../utils.mk:177: standard_rust_check] Error 101 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-12-14 20:47:47 +08:00
ChengyuZhu6	983479748f	.github: fix error when making checks for CoCo guest pull Fix error when making checks: ``` error: failed to run custom build command for `image-rs v0.1.0 (https://github.com/confidential-containers/guest-components?tag=v0.8.0#e849dc89)` Caused by: process didn't exit successfully: `/home/runner/work/kata-containers/kata-containers/src/ agent/target/release/build/image-rs-fd932206d09362b7/build-script-build` (exit status: 101) --- stdout cargo:rerun-if-changed=./protos/getresource.proto cargo:rerun-if-changed=./protos --- stderr thread 'main' panicked at 'Could not find `protoc` installation and this build crate cannot proceed without this knowledge. If `protoc` is installed and this crate had trouble finding it, you can set the `PROTOC` environment variable with the specific path to your installed `protoc` binary.If you're on debian, try `apt-get install protobuf-compiler` or download it from https://github.com/protocolbuffers/protobuf/releases ``` Fixes #8673 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-12-14 20:47:42 +08:00
alex.lyn	aa42f0a03f	runtime-rs: Enhancement of DirectVolume when using CSI. We use a matching direct-volume path to determine whether an OCI mount is a DirectVolume. However, we should handle the case where no match is found appropriately. In that case, the OCI mount is treated as a non-DirectVolume type instead of causing a failure. Fixes: #8619 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-14 18:19:03 +08:00
alex.lyn	80d631ee84	runtime-rs: Add attribute serde rename to each field of DirectVolume. The DirectVolume structure in runtime-rs differs from the one in kata-runtime, so there is no unified handling method for DirectVolumeMountInfo and MountInfo. We should align the two by simply adding the attribute #[serde(rename="x")] to each field in DirectVolumeMountInfo. Fixes: #8619 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-14 18:18:40 +08:00
Xuewei Niu	7f611dfe84	Merge pull request #8609 from justxuewei/runtime-rs-vhost-net dragonball: Use vhost-net device by default	2023-12-14 16:33:29 +08:00
Amulyam24	0db820fa01	gha: add a post cleanup script for cri-containerd ppc64le workflow This PR identifies and adds an action to cleanup the ppc64le self hosted runner. Fixes: #8666 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-12-14 13:46:47 +05:30
Hyounggyu Choi	fbc04460f6	Merge pull request #8649 from BbolroC/put-pre-action-gha-s390x GHA: Put all the preliminary steps into pre-action for s390x	2023-12-14 07:16:17 +01:00
Xuewei Niu	82fde4431e	dragonball: Set default queue config for vhost-net device Dragonball sets a default queue config in the case of `None`. The queue_size and num_queues of vhost-net are set to `Some(0)` by default. Therefore, we might get an invalid queue config. This patch fixes this issue. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-14 11:18:33 +08:00
Xuewei Niu	c11b066728	runtime-rs: Use vhost-net device by default This patch sets vhost-net as the default networking backend. It allows users to set `disable_vhost_net` to `true` to re-enable the virtio-net backend. Plus, which backend to use is a matter for the hypervisor; runtime-rs will no longer need to know that. Fixes: #8608 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-14 11:18:26 +08:00
Chelsea Mafrica	6c2e2a9120	Merge pull request #8635 from cmaf/migrate-static-checks-gha static-checks: Direct Makefile to use new static checks	2023-12-13 16:00:16 -08:00
Gabriela Cervantes	8151117f73	metrics: Improve latency network cleanup This PR improves the latency network cleanup by removing the pods even if the test fails. Fixes #8658 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-13 17:56:01 +00:00
Fabiano Fidêncio	a998e89bcf	Merge pull request #8639 from fidencio/topic/kata-deploy-use-tomlq-to-configure-containerd kata-deploy: Use `tomlq` to configure containerd	2023-12-13 14:11:45 +01:00
Hyounggyu Choi	05e278de5b	GHA: Put all the preliminary steps into pre-action for s390x This is to introduce a pre-action to all the workflows for building artifacts. The action could take care of tasks such as cleaning up files and reinstalling packages, which prevents a workflow from getting affected by the environment. This also includes the removal of the step `Adjust a permission for repo`, because it could be incorporated into the action. Fixes: #8648 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-13 13:24:40 +01:00
Chao Wu	dfaf006fcc	Merge pull request #8564 from openanolis/chao/add_pci_root_bus_device dragonball: add pci root bus and root device	2023-12-13 17:57:16 +08:00
Fabiano Fidêncio	7ad873cf29	kata-deploy: Simplify shim configuration We never have to add a configuration for the "default" case, as we're already creating the runtime class pointing to what should be the "default" handler. This helps to simplify the logic by quite a lot. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-13 10:52:54 +01:00
Fabiano Fidêncio	e618949937	kata-deploy: Remove useless comment from CRI-O drop-in The comment adds absolutely nothing to the runtime handler added, and it'd make our life slightly harder to properly say which VMM is being used when setting the default `kata` handler. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-13 10:49:52 +01:00
Fabiano Fidêncio	dd9f5b07b9	kata-deploy: Use tomlq to configure containerd This saves us a lot of trouble on properly sed'ing content that may or may not be in the containerd configuration file. Fixes: #8638 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-13 10:49:49 +01:00
Fabiano Fidêncio	4f01f294bb	kata-deploy: Install `tomlq` to the base image This will help us to have an easier time playing with the containerd configuration, instead of having to sed the **** out of it, which is super error prone. `tomlq` is a tool that comes from https://github.com/kislyuk/yq, and that depends on `jq` to do the toml parsing / editing. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-13 10:49:07 +01:00
James O. D. Hunt	d7c6219dfe	Merge pull request #8630 from jodh-intel/runtime-rs-ch-set-state-on-vm-stop runtime-rs: ch: Change state when VM stopped	2023-12-13 09:26:30 +00:00
Xuewei Niu	855adbc63b	Merge pull request #8634 from justxuewei/disable-packed-vq dragonball: Disable packed virtqueue for vhost-user devices	2023-12-13 17:03:05 +08:00
wvell	af4622fcc1	docs: Remove warning for cgroupsv2 only operating systems Removes warning for cgroupsv2 as it is not needed anymore according to #6259. Fixes #8650 Signed-off-by: wvell <w.vellema@slash2.nl>	2023-12-13 09:18:39 +01:00
Chelsea Mafrica	b46cb22270	static-checks: Direct Makefile to use new static checks Direct the Makefile to use the static checks script in the tests directory of the main Kata Containers repo so it is run in GHA. Fixes #8595 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-12-12 16:43:35 -08:00
Chelsea Mafrica	63636b869c	static-checks: Update copyright dates Some copyright dates were not updated with the most recent changes to code; update them. Fixes #8595 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-12-12 16:34:06 -08:00
Chelsea Mafrica	b11c772865	static-checks: Change dir for building tools Change directory for running make due to local errors when building with make -C. Fixes #8595 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-12-12 16:34:06 -08:00
James O. D. Hunt	2a518f0898	runtime-rs: ch: Change state when VM stopped Make the CH (Cloud Hypervisor) `stop_vm()` method check the VM state before attempting to stop the VM, and update the state once the VM has stopped. This avoids the method failing if called multiple times which will happen if the workload exits before the container manager requests that the container stop. This change ensures the CH driver finishes cleanly. Fixes: #8629. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-12 18:25:20 +00:00
Fabiano Fidêncio	39f5cea3b1	kata-deploy: Fix k0s cri notation comment We can safely assume we're using the newer notation, not the older one. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-12 18:20:18 +01:00
Gabriela Cervantes	23f76653e5	metrics: Update command to run the tensorflow int8 benchmark This PR updates the command to run the tensorflow resnet50 int8 benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-12 16:24:09 +00:00
Gabriela Cervantes	8fd5ef7fb7	metrics: Update TensorFlow ResNet50 Int8 Dockerfile This PR updates the TensorFlow ResNet50 Int8 Dockerfile to use the proper python version for kata metrics. Fixes #8643 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-12 16:20:56 +00:00
James O. D. Hunt	1195692d3c	runtime-rs: ch: Move state handling to top-level APIs Move the state setting to the `Hypervisor` trait calls. This makes the code clearer. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-12 15:25:27 +00:00
James O. D. Hunt	5637f11a8c	kata-ctl: Add option to dump config files Add a `--show-default-config-paths` command line option for parity with `kata-runtime`. Note that this requires the `KataCtlCli.command` to be optional so that the user can run simply: ```bash $ kata-ctl --show-default-config-paths ``` ... without also specifying a (sub-)command. Fixes: #8640. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-12 14:20:04 +00:00
Chelsea Mafrica	a9d360728e	static-checks: Fix directory for github labels Fix paths for yqdir (where the install_yq.sh script currently is) so that static checks can run without error. Fixes #8595 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-12-12 02:16:35 -08:00
Xuewei Niu	86918e91b3	dragonball: Disable packed virtqueue for vhost-user devices The layout of packed virtqueue isn't supported by `Endpoint::negotiate()`. Communication between device and driver will fail because the virtqueue cannot be parsed if we don't disable the packed feature. This patch fixes this issue. Fixes: #8633 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-12 17:24:20 +08:00
Chao Wu	b079e1aabc	dragonball: add pci root bus and root device In order to follow up the PCI implementation in Dragonball, we need to add PCI root device and root bus support. The root device is a pseudo PCI root device that manages access to the PCI configuration space. The root bus mainly emulates the PCI root bridge and also creates the PCI root bus with the given bus ID on the PCI root bridge. fixes: #8563 Signed-off-by: Gerry Liu <gerry@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Shifang Feng <fengshifang@linux.alibaba.com> Signed-off-by: Yang Su <yang.su@linux.alibaba.com> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Signed-off-by: Xin Lin <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-12 11:43:14 +08:00
GabyCT	ee74fca92c	Merge pull request #8617 from GabyCT/topic/enabletestnerdctl tests: nerdctl: Enable nerdctl tests for cloud hypervisor runtime-rs	2023-12-11 14:09:58 -06:00
David Esparza	584a26dab0	Merge pull request #8542 from dborquez/metrics_fix_deployment_cleaning metrics: cleans k8s iperf deployment when the test finishes.	2023-12-11 13:14:39 -06:00
Chao Wu	198e4adcb1	Merge pull request #8599 from openanolis/chao/fix_cargo_fmt dragonball: add --all for fmt ci	2023-12-12 00:20:21 +08:00
GabyCT	43410e1918	Merge pull request #8560 from GabyCT/topic/enablek8srs gha: k8s: Add cloud-hypervisor (runtime-rs) support	2023-12-11 09:42:49 -06:00
Hyounggyu Choi	ea2a0dc69d	Merge pull request #7769 from BbolroC/opa-multiarch rootfs: build OPA binary from source for ppc64le and s390x	2023-12-11 15:25:33 +01:00
Chao Wu	52f7a40e4e	dragonball: add --all for fmt ci Right now, the cargo fmt check in Dragonball only tests with the default features, not all features. This leaves some code unchecked by the fmt tool. This PR adds the --all option for the Dragonball CI and also fixes some code that was missed by cargo fmt --all. fixes: #8598 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-11 20:54:25 +08:00
Hyounggyu Choi	375c787e09	rootfs: build OPA binary from source for ppc64le and s390x This PR is to build a binary for OPA from source code for ppc64le and s390x. Fixes: #7616 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-11 12:59:48 +01:00
Hyounggyu Choi	16e2a50d17	Merge pull request #8624 from BbolroC/fix-runtime-class-check-qemu-se GHA: Fix kata-deploy-runtime-classes-check for kata-qemu-se	2023-12-11 12:58:00 +01:00
James O. D. Hunt	2a35541af7	Merge pull request #8592 from jodh-intel/static-checks-try-multiple-user-agents CI: static-checks: Try multiple user agents	2023-12-11 11:52:29 +00:00
Hyounggyu Choi	28c3e0e5f0	GHA: Fix kata-deploy-runtime-classes-check for kata-qemu-se This is to fix an error on kata-deploy-runtime-classes-check for kata-qemu-se. Fixes: #8623 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-11 10:30:00 +01:00
Hyounggyu Choi	b469dbf92f	Merge pull request #8622 from BbolroC/hotfix-k3s-kubectl-version GHA: Use --client=true for k3s kubectl version	2023-12-11 10:00:16 +01:00
Hyounggyu Choi	40f0c8fbb7	GHA: Use --client=true for k3s kubectl version This is to fix a broken usage for `k3s kubectl version` by switching an option `--short` to `--client=true`. Fixes: #8621 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-11 08:26:39 +01:00
Chao Wu	df7f416cb8	Merge pull request #8566 from liubogithub/liubo/dev/panic_fix runtime-rs: fix panic when hypervisor mismatches with configuration	2023-12-10 21:33:59 +08:00
Gabriela Cervantes	1662a3e859	common: Add cloud hypervisor in enabling hypervisor function This PR adds the cloud hypervisor in the enabling hypervisor function. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-08 21:32:00 +00:00
Chelsea Mafrica	1c42d94550	Merge pull request #6826 from gabevenberg/log-parser-rs kata-ctl: Moved log-parser-rs into kata-ctl	2023-12-08 11:33:09 -08:00
James O. D. Hunt	5d085a3042	CI: static-checks: Try multiple user agents Make the URL checker cycle through a list of user agent values until we hit one the remote server is happy with. This is required since, unfortunately, we really, really want to check these URLs, but some sites block clients based on their `User-Agent` (UA) request header value. And of course, each site is different and can change its behaviour at any time. Our strategy therefore is to try various UA's until we find one the server accepts: - No explicit UA (use `curl`'s default) - Explicitly no UA. - A blank UA. - Partial UA values for various CLI tools. - Partial UA values for various console web browsers. - Partial UA for Emacs's built-in browser. - The existing UA which is used as a "last ditch" attempt where the UA implies multiple platforms and browser. > Notes: > > - The "partial UA" values specify the UA "product" but not the > UA "product version" (we specify `foo` and not `foo/1.2.3`). We do > this since most sites tested appear to not care about the version. > This is as expected given that the version is strictly optional (see `[]`). > > - We now log all errors and display an error summary if none of the UAs > worked, in addition to the simple list of the URLs we believe to be > invalid. This should make future debugging simpler. `[]` - https://www.rfc-editor.org/rfc/rfc9110#section-10.1.5 Fixes: #8553. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 18:02:41 +00:00
James O. D. Hunt	3174c18772	docs: Remove problematic URL Removed the Azure Portal URL (https://portal.azure.com) since this causes problems with our static checks script: that URL returns HTTP 403 ("Forbidden") when queried using command-line tools like `curl(1)`, which is used by the static check script. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 17:11:20 +00:00
James O. D. Hunt	3779261a99	docs: Fix whitespace Remove some extraneous whitespace. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 17:11:20 +00:00
James O. D. Hunt	613def0328	CI: static-checks: Move curl to a separate function Split the call to `curl` in the URL checker out into a new `run_url_check_cmd()` function to make `check_url()` slightly clearer. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 17:11:20 +00:00
James O. D. Hunt	6d859f97ee	CI: static-checks: Lint fixes Declare and then define a couple of variables separately. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 17:11:20 +00:00
James O. D. Hunt	efa8e6547c	CI: static-checks: Check params have a value Check that the `check_url()` parameters have a value. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 17:11:20 +00:00
James O. D. Hunt	563ea020b0	CI: static-checks: Fold long line Break up a long line a little to make it easier to read. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 17:11:20 +00:00
James O. D. Hunt	3ad43df946	CI: static-checks: Improve markdown checker test Only attempt to build the markdown checker if it doesn't already exist. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 17:11:20 +00:00
Liu Bo	bf97051f11	runtime-rs: fix panic when hypervisor mismatches with configuration If a wrong configuration.toml file is used by accident, the runtime-rs binary could panic because of unwrap(). This fixes the panic by returning errors instead of calling unwrap(). fixes: #8565 Signed-off-by: Liu Bo <liub.liubo@gmail.com>	2023-12-08 08:56:23 -08:00
Zvonko Kaiser	9d38f01c2f	Merge pull request #8612 from BbolroC/introduce-secret-inheritance-s390x GHA: make secrets inherited for build-kata-static-tarball-s390x	2023-12-08 17:32:47 +01:00
Gabriela Cervantes	f3eeab10ab	tests: nerdctl: Enable nerdctl tests for cloud hypervisor runtime-rs This PR enables the nerdctl tests for cloud hypervisor runtime-rs. Fixes #8616 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-08 16:12:36 +00:00
Hyounggyu Choi	636eef8907	GHA: make secrets inherited for build-kata-static-tarball-s390x This is to make GHA secrets inherited for the workflow titled `build-kata-static-tarball-s390x` to configure an environment variable `CI_HKD_PATH` for a `build-asset-boot-image-se` step. Fixes: #8611 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-08 13:55:45 +01:00
Chao Wu	5054e59ccb	Merge pull request #8429 from adamqqqplay/support-vhost-user-fs dragonball: introduce vhost-user-fs device	2023-12-08 17:20:52 +08:00
Hyounggyu Choi	588f639a69	Merge pull request #6755 from BbolroC/add-se-artifacts-to-main packaging: Add IBM Z SE artifacts to main	2023-12-08 05:17:38 +01:00
Gabe Venberg	69fdd05ce5	kata-ctl: Moved log-parser-rs into kata-ctl Log-parser-rs was always intended to become a sub-functionality of kata-ctl, but it was useful to develop it and initially merge it as a standalone program, and migrate it to a subcommand later. Fixes #6797 Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>	2023-12-07 21:35:28 -06:00
David Esparza	b2577000e7	metrics: Expose iperf3 pods over a k8s networks. A prerequisite for measuring kata network bandwidth is to run the iperf3 tool at the transport layer provided by a k8s service, which exposes a network that clients inside the cluster can use to contact Pods in the service. Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-12-07 18:07:05 -06:00
David Esparza	a062ba166b	metrics: cleans k8s iperf deployment when the test finishes. This PR fixes small issues like: 1. Cleaning up the k8s environment by removing the iperf test implementation even when the test fails. 2. Checks if the workload returned a result before generating an empty results json file as it was being done. 3. Removes the redundancy of calls to functions that process subtests and should compose the results json file only when all results are ready and not before. 4. The tcp service manifest was added to the server deployment which targets TCP port 5201. Fixes: #8534 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-12-07 18:02:39 -06:00
Archana Shinde	a5105b4227	Merge pull request #8582 from amshinde/runtime-rs-tryfrom-blkconfig Implement and use try_from for DiskConfig	2023-12-07 15:02:00 -08:00
Archana Shinde	458e91b289	runtime-rs: Update readme to indicate cloud-hypervisor support Since cloud-hypervisor is no longer built as an optional feature, let's mention cloud-hypervisor in the list of hypervisors supported by runtime-rs. Fixes: #8587 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-12-07 14:59:43 -08:00
GabyCT	0e0a7d9410	Merge pull request #8604 from GabyCT/topic/enablenerdctlrs gha: nerdctl: Enable cloud hypervisor runtime-rs for nerdctl CI	2023-12-07 14:35:26 -06:00
Hyounggyu Choi	3fab1690a4	local-build: make strip support for cross-compilation This is to adjust a name of the binary `strip` to a target architecture for cross-compilation. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-07 20:05:40 +01:00
Hyounggyu Choi	f38c7f14c5	gha: remove build redundancy of kernel and rootfs-initrd It is to remove the build redundancy of `kernel` and `rootfs-initrd` by making `boot-image-se` built based on them at the second build stage. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-07 20:05:40 +01:00
Hyounggyu Choi	31db56207b	local-build: add support for key verification for IBM Secure Execution This is to make `build_se_image.sh` incorporate the key verification originally supported by `genprotimg`. It can be achieved by specifying two environment variables called `SIGNING_KEY_CERT_PATH` and `INTERMEDIATE_CA_CERT_PATH`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-07 20:05:40 +01:00
Hyounggyu Choi	52bdc87fe9	local-build: make kernel parameters configurable This is to make kernel parameters configurable during the secure image build by adding an environment variable SE_KERNEL_PARAMS. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-07 20:05:40 +01:00
Hyounggyu Choi	9ceb2c27e0	local-build: consider cross-compilation env This is to make a base builder image build genprotimg without a package manager under the cross-compilation environment. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-07 20:05:40 +01:00
David Esparza	298be4aa1c	Merge pull request #8594 from GabyCT/topic/updatedockerfilet metrics: Update TensorFlow ResNet FP32 dockerfile	2023-12-07 11:14:48 -06:00
Gabriela Cervantes	ce694b905b	tests: Fix indentation of gha-run script This PR fixes the indentation of gha run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-07 16:56:19 +00:00
Gabriela Cervantes	33b300431e	tests: Enable but do not run k8s tests for cloud hypervisor This PR enables but does not run the k8s tests for cloud hypervisor for runtime-rs. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-07 16:39:15 +00:00
Gabriela Cervantes	acee3d8438	gha: k8s: Add cloud-hypervisor (runtime-rs) support This PR adds the Cloud Hypervisor driver, integrated with the runtime-rs, as part of the kubernetes tests. Fixes #8559 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-07 16:33:59 +00:00
Gabriela Cervantes	50a5fa9a65	tests: Enable but do not run the nerdctl tests for cloud hypervisor This PR enables but does not run the nerdctl tests for cloud hypervisor runtime-rs until we find out how stable they are. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-07 16:29:51 +00:00
Gabriela Cervantes	e70b2ea95d	gha: nerdctl: Enable cloud hypervisor runtime-rs for nerdctl CI This PR enables the cloud hypervisor runtime-rs for the nerdctl gha CI. Fixes #8603 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-07 16:24:36 +00:00
Hyounggyu Choi	ad6aab9918	Merge pull request #8601 from BbolroC/conflict-handling-for-self-hosted-runners GHA: remove GITHUB_WORKSPACE when workflow fails due to merge conflict	2023-12-07 12:17:31 +01:00
Hyounggyu Choi	0d5a970e54	GHA: remove GITHUB_WORKSPACE when workflow fails due to merge conflict It is to remove a GITHUB_WORKSPACE directory for self-hosted runners when a workflow fails due to the merge conflict. This will prevent the subsequent workflows from getting stuck in the same situation. Fixes: #8600 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-07 10:25:57 +01:00
Greg Kurz	501910d743	Merge pull request #8509 from zvonkok/stable-overlay deployment: Add stable overlay for kata-deploy.yaml	2023-12-07 09:43:41 +01:00
Huang Jianan	5629b7454f	dragonball: support vhost-user-fs in device manager This patch implements the virtio-fs device used for filesystem sharing and heavily based on the vhost-user protocol. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com> Signed-off-by: Qinqi Qu <quqinqi@linux.alibaba.com>	2023-12-07 11:59:07 +08:00
Archana Shinde	a661ac3a0e	runtime-rs: Implement and use try_from for DiskConfig Implement try_from trait function to convert runtime-rs BlockConfig to cloud-hypervisor DiskConfig. This can allow for code reuse in the future. Fixes: #8581 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-12-06 12:10:34 -08:00
Fabiano Fidêncio	c14e3096c8	Merge pull request #8580 from amshinde/runtime-rs-clh-network-hotplug runtime-rs: add network hotplug for clh	2023-12-06 20:50:04 +01:00
Gabriela Cervantes	56dddab04f	metrics: Update command to run tensorflow resnet fp32 benchmark This PR updates the command needed to run the tensorflow benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-06 17:02:10 +00:00
Gabriela Cervantes	62fdebeeb5	metrics: Update TensorFlow ResNet FP32 dockerfile This PR updates the python version for the TensorFlow ResNet FP32 dockerfile so the benchmark can run without issues. Fixes #8593 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-06 16:53:21 +00:00
GabyCT	3d149d3455	Merge pull request #8578 from GabyCT/topic/fixlinkconfig docs: Update config containerd url link	2023-12-06 10:40:29 -06:00
Zvonko Kaiser	16380558e0	deployment: Create a stable overlay for kata-deploy Fixes: #8508 Create a stable overlay for kata-deploy.yaml so we do not have to maintain two files, only one. Single source for both. This is also preparation for the helm-overlay. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-12-06 14:23:22 +00:00
Huang Jianan	2a1fc29e84	dragonball: add unit test for vhost-user-fs Add some test cases for vhost-user-fs function. Signed-off-by: Beiyue <beiyue@linux.alibaba.com> Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com>	2023-12-06 10:43:24 +08:00
Huang Jianan	d6cfbe9436	dragonball: support vhost-user-fs This patch implements the virtio-fs device used for filesystem sharing and heavily based on the vhost-user protocol. This vhost-user-fs device defines 5 parameters: - path: vhost-user socket path - tag: mount tag used from the guest to mount the filesystem - req_num_queues: number of request virtqueues - queue_size: depth of each virtqueue - cache_size: cache window size for dax This device needs to be defined before the VM instance is started, which can be done through the dbs-cli tool with --fs option: --fs '{ "sock_path":"/path/to/virtiofs.socket", "tag":"myfs", "num_queues":1, "queue_size":1024, "cache_size":0, "thread_pool_size":1, "cache_policy":"auto", "writeback_cache":true, "no_open":true, "xattr":true, "drop_sys_resource":false, "mode":"vhostuser", "fuse_killpriv_v2":true, "no_readdir":false, }' Fixes: #8428 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com>	2023-12-06 10:43:17 +08:00
Archana Shinde	955dec06da	runtime-rs: add network hotplug for clh This is required for clh to work with nerdctl and docker. This fixes the issues seen with nerdctl while starting a container. However, container exit with docker is still broken due to an unrelated issue. Fixes: #8579 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-12-05 15:29:53 -08:00
Fabiano Fidêncio	b056683b7a	Merge pull request #8436 from Lu-Biao/main image-builder: bugfix incorrect partition location	2023-12-06 00:10:06 +01:00
Fabiano Fidêncio	2cd003156e	Merge pull request #8573 from fidencio/topic/gha-add-a-timeout-for-tests gha: basic-ci: Add a timeout for the tests	2023-12-05 22:20:49 +01:00
Fabiano Fidêncio	d149b9f9ca	Merge pull request #7231 from wainersm/measured_rootfs-improvements Build for measured rootfs improvements	2023-12-05 22:20:33 +01:00
Fabiano Fidêncio	f75f17c4ff	Merge pull request #8570 from fidencio/topic/gha-dragonball-enable-some-tests-but-do-not-run-them-yet gha: dragonball: Enable, but do not run, cri-containerd, stability, and devmapper tests	2023-12-05 20:00:24 +01:00
Jeremi Piotrowski	e2c6b8ae6e	Merge pull request #4743 from yuchen0cc/main mount: support checking multiple kinds of block device driver	2023-12-05 18:04:51 +01:00
Gabriela Cervantes	61b868692b	docs: Update config containerd url link This PR updates the config containerd url link in the containerd kata documentation. Fixes #8577 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-05 16:35:21 +00:00
Fabiano Fidêncio	05ce52d746	devmapper: dragonball: Enable, but do not run, the tests This will make the life easier for dragonball developers to properly enable the tests once the tests are ready. Fixes: #8569 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-05 15:29:23 +01:00
Fabiano Fidêncio	a8a156b1af	stability: dragonball: Enable, but do not run, the tests This will make the life easier for dragonball developers to properly enable the tests once the tests are ready. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-05 15:29:23 +01:00
Fabiano Fidêncio	16ad721eda	cri-containerd: dragonball: Enable, but do not run, the tests This will make the life easier for dragonball developers to properly enable the tests once the tests are ready. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-05 15:29:23 +01:00
James O. D. Hunt	d9daadf15c	Merge pull request #8558 from jodh-intel/load-config-improvement runtime-rs: Show config files attempted on config load failure	2023-12-05 11:48:42 +00:00
Greg Kurz	1650d02b91	Merge pull request #8516 from Apokleos/vsock-dev move vsock device into device manager	2023-12-05 11:28:37 +01:00
James O. D. Hunt	93c0fc2ad3	Merge pull request #8551 from amshinde/runtime-rs-setns-clh runtime-rs: Launch cloud-hypervisor in given netns	2023-12-05 10:18:34 +00:00
James O. D. Hunt	d627893975	runtime-rs: Show config files attempted on config load failure PR #8483 changed the location of the rust runtime config files to `/etc/kata-containers/runtime-rs/`. However, if you haven't updated your system to create that directory, attempting to create a container using the rust runtime was giving the following cryptic message (formatted for easier reading): ``` failed to handler message try init runtime instance Caused by: 0: load config 1: load toml config 2: entity not found ``` Now, the message is as follows (again, reformatted for easier reading): ``` failed to handle message try init runtime instance Caused by: 0: load config 1: load TOML config failed (tried [ \"/etc/kata-containers/runtime-rs/configuration.toml\", \"/usr/share/defaults/kata-containers/runtime-rs/configuration.toml\", \"/opt/kata/share/defaults/kata-containers/runtime-rs/configuration.toml\" ]) ``` Fixes: #8557. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-05 09:10:18 +00:00
James O. D. Hunt	45c0364d4c	runtime-rs: Fix typo in task service "failed to handler message" -> "failed to handle message". Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-05 09:10:18 +00:00
Fabiano Fidêncio	a14f2fc180	gha: runk: Fix typo in the test name tracing -> runk Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-05 09:44:42 +01:00
Fabiano Fidêncio	1a74142a16	gha: basic-ci: Add a timeout for the tests This will ensure no job will be stuck forever, as we've noticed with a few jobs already. Fixes: #8572 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-05 09:42:46 +01:00
GabyCT	e8b28fed2a	Merge pull request #8540 from GabyCT/topic/fixctrdoc docs: Update cri installation url link	2023-12-04 17:36:33 -06:00
Archana Shinde	2df8144cfe	runtime-rs: Launch cloud-hypervisor in given netns Launch cloud-hypervisor binary in the netns provided at the prepare_vm stage. Fixes: #6441 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-12-04 13:02:43 -08:00
Hyounggyu Choi	511dd5feac	local-build: add support to build IBM Z SE image This is to add an artifact for IBM Z SE(TEE) to main. Fixes: #6754 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-04 21:08:51 +01:00
Hyounggyu Choi	4de8ef3d18	local-build: add build target boot-image-se This is to add a build target boot-image-se for s390x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-04 21:08:51 +01:00
Hyounggyu Choi	a63a6959d1	local-build: install s390-tools in Dockerfile This is to install s390-tools including genprotimg during the docker build. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-04 21:08:51 +01:00
Hyounggyu Choi	6d0dabd81e	gha: build secure image for s390x release This is to add a build target boot-image-se with a host-key-document config for s390x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-04 21:08:51 +01:00
Hyounggyu Choi	bb1d4adaa9	config: add SE configuration This is to add SE configuration which is used by kata runtime. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-04 21:08:49 +01:00
Gabriela Cervantes	2b05029347	docs: Update cri installation url link This PR updates the cri installation url link for the containerd documentation. Fixes #8539 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-04 20:07:49 +00:00
Hyounggyu Choi	8de4241d3b	kata-deploy: add kata-qemu-se runtimeclass This is to increase resources for relaxing the limitation of hotplug for SE. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-04 21:06:53 +01:00
Hyounggyu Choi	9ede2bcd95	local-build: differentiate build targets based on architecture This is to rule out unnecessary build targets for s390x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-04 21:06:53 +01:00
GabyCT	1c00a9a6a9	Merge pull request #8524 from GabyCT/topic/addiperfinfo docs: Update iperf3 network documentation	2023-12-04 14:03:30 -06:00
GabyCT	1b204cc3cb	Merge pull request #8550 from GabyCT/topic/enableclhstability gha: Add cloud runtime rs as part of the stability tests	2023-12-04 11:37:58 -06:00
Gabriela Cervantes	dfc07d1c72	gha: stability: Add cloud-hypervisor (runtime-rs) support This PR adds the Cloud Hypervisor driver, integrated with the runtime-rs, as part of the stability tests. Fixes #8462 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-04 15:32:29 +00:00
Fabiano Fidêncio	8d7e0f7721	Merge pull request #8556 from fidencio/topic/kernel-add-tdx-guest-driver kernel: Add CONFIG_TDX_GUEST_DRIVER to the tdx.conf	2023-12-04 15:13:57 +01:00
James O. D. Hunt	e4aebb4560	Merge pull request #8549 from jodh-intel/tdx-no-root libs: protection: x86_64: drop root requirement for querying	2023-12-04 13:03:10 +00:00
Chao Wu	1550ee6767	Merge pull request #8480 from openanolis/chao/add_dbs_pci dragonball: init dbs-pci lib with pci bus & pci conf	2023-12-04 18:08:40 +08:00
Fabiano Fidêncio	03c3f4275e	kernel: Add CONFIG_TDX_GUEST_DRIVER to the tdx.conf The driver enables the userspace interface to communicate with the TDX module to request the TDX guest details, like the attestation report. Fixes: #8555 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-04 10:25:59 +01:00
Biao Lu	b816dca3ed	image-builder: fix incorrect part start position The 'part_start' of image and dax_image should specify exactly the same location. According to the parted documentation, to specify the location exactly, the units of start and end should be MiB. https://www.gnu.org/software/parted/manual/parted.html#IEC-binary-units Fixes: #8435 Signed-off-by: Biao Lu <biao.lu@intel.com>	2023-12-04 17:20:26 +08:00
Chao Wu	52fd57e49a	Merge pull request #8301 from Apokleos/do-direct-volume runtime-rs: Enhancing DirectVolMount Handling with Patching Support	2023-12-04 16:49:46 +08:00
James O. D. Hunt	7beab11d9e	Merge pull request #8547 from jodh-intel/unbreak-logger libs:logging: Fix logger	2023-12-04 08:38:03 +00:00
alex.lyn	0fabfa336d	runtime-rs: bring support for legacy vsock device. Bring support for legacy vsock and add Vsock to the ResourceConfig enum type, and add the processing flow of the Vsock device to the prepare_before_start_vm function. Fixes: #8474 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-04 15:54:51 +08:00
alex.lyn	6c08cf35d5	runtime-rs: Introduce prepare_vm_socket_config to VirtSandbox. Introduce prepare_vm_socket_config to VirtSandbox for vm socket config, including Vsock and Hybrid Vsock. Use the capabilities() trait of the hypervisor to get the vm socket supported in VMM. Fixes: #8474 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-04 15:54:50 +08:00
alex.lyn	60f88da5e1	runtime-rs: add Capability of HybridVsockSupport for Hypervisor. Add Cap of HybridVsockSupport for hypervisors CLH and Dragonball which use hybrid-vsock, default for Qemu, which uses legacy vsock. Fixes: #8474 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-04 15:54:50 +08:00
alex.lyn	c5178dd258	runtime-rs: Introduce Capability of HybridVsockSupport. Introduce HybridVsock Cap to judge which kind of vm socket will be supported by the Hypervisor. Use `is_hybrid_vsock_supported` to tell whether a hypervisor supports hybrid-vsock; if not, it supports legacy vsock. Fixes: #8474 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-04 15:54:29 +08:00
James O. D. Hunt	e1caca3e41	kata-ctl: Remove root requirement for "env" Remove the redundant `kata-ctl` `root` check when running the `env` command. This check duplicated the `GuestProtection` check, and that check is now no longer necessary anyway. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-01 15:55:45 +00:00
James O. D. Hunt	f05ada592f	libs: protection: x86_64: drop root requirement for querying It is no longer necessary to be `root` to query the guest protection (TDX) on `x86_64` systems, so drop the requirement. > Note: > > This change drops the `nix` `Uid` import required for the `root` check. > But at the same time it adds it for PPC64le since that implementation of > `available_guest_protection()` needs it and it was previously missing. Fixes: #8548. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-01 15:55:21 +00:00
Fabiano Fidêncio	852021e416	Merge pull request #8483 from fidencio/topic/move-rust-config-files-to-subdir-based-on-jodh-approach build/kata-deploy: Move rust runtime config files to runtime-rs directory -- based on #8445	2023-12-01 16:22:51 +01:00
James O. D. Hunt	f9f1d3a071	libs:logging: Fix logger PR #8311 inadvertently broke the logging since no log messages below the `Info` level are logged now, regardless of the requested log level. Resolve the issue by storing the requested log level in the `RuntimeComponentLevelFilter` and using that level in the `log()` function, rather than hard-coding `Info` as the default where no entry is found in the `FILTER_RULE` hashmap. Fixes: #8546. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-01 12:21:20 +00:00
yuchen.cc	1cd1558a92	mount: support checking multiple kinds of block device driver Device mapper is the only supported block device driver so far, which seems limiting. Kata Containers can work well with other block devices. It is necessary to enhance supporting of multiple kinds of host block device. Fixes #4714 Signed-off-by: yuchen.cc <yuchen.cc@alibaba-inc.com>	2023-12-01 11:59:30 +08:00
Chelsea Mafrica	818b8f93b1	Merge pull request #8288 from cmaf/migrate-static-checks Migrate static checks	2023-11-30 17:44:16 -08:00
Chelsea Mafrica	207a7fef90	Merge pull request #7815 from cmaf/runtime-rs-ch-vsock runtime-rs: Add Hybrid VSOCK device handling for CH	2023-11-30 12:22:36 -08:00
GabyCT	2bd21f7831	Merge pull request #8531 from GabyCT/topic/fixiperfli metrics: Fix iperf parallel bandwidth limit	2023-11-30 13:47:00 -06:00
Chao Wu	b3da71f21e	dragonball: init dbs-pci lib with pci bus & pci conf This commit inits dbs-pci lib for Dragonball to use. It currently contains two implementations: 1. PCI configuration space 2. PCI bus More info on the design & behavior of those two features can be found in the README of dbs-pci. fixes: #8479 Signed-off-by: Gerry Liu <gerry@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Shifang Feng <fengshifang@linux.alibaba.com> Signed-off-by: Yang Su <yang.su@linux.alibaba.com> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Signed-off-by: Xin Lin <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-11-30 23:40:26 +08:00
Dan Mihai	38f24c41c0	Merge pull request #8271 from microsoft/danmihai1/exec-test-failure tests: more k8s-exec-rejected debug output	2023-11-30 07:11:01 -08:00
Greg Kurz	48e5596186	Merge pull request #8456 from cheriL/8447/alpine_bash osbuilder: add pkg bash for alpine	2023-11-30 13:43:48 +01:00
Steve Horsman	c6110284d5	Merge pull request #8520 from stevenhorsman/hypervisor-ttrpc runtime: Update hypervisor generated code	2023-11-30 10:01:56 +00:00
Amulya Meka	3d5db65b2e	Merge pull request #8526 from Amulyam24/workflow-ppc gha: fix artefacts build on ppc64le	2023-11-30 15:00:06 +05:30
Fabiano Fidêncio	80fcc56cef	Merge pull request #8528 from fidencio/topic/stop-building-and-shipping-log-parser-rs tools: Stop building / shipping log-parser-rs	2023-11-30 09:14:10 +01:00
Fabiano Fidêncio	9b30d97885	Merge pull request #8533 from fidencio/topic/fix-invalid-cpu-topology-for-tdx Revert "runtime: confidential: Do not set the max_vcpu to cpu"	2023-11-30 09:06:45 +01:00
Amulyam24	6a922f0e37	gha: fix artefacts build on ppc64le Add step in the right place to prepare the runner for the builds/tests. Fixes: #8525 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-11-30 09:50:47 +05:30
soup	811ec07359	osbuilder: add pkg bash for alpine The bash component is required in the guest for debug console to work properly. Fixes: #8447 Signed-off-by: soup <lqh348659137@outlook.com>	2023-11-30 09:42:39 +08:00
Fabiano Fidêncio	f15e16b692	Revert "runtime: confidential: Do not set the max_vcpu to cpu" This reverts commit `b0157ad73a`. ``` commit `b0157ad73a` Refs: 3.3.0-alpha0-124-gb0157ad73 Author: Fabiano Fidêncio <fabiano.fidencio@intel.com> AuthorDate: Fri Aug 11 14:55:11 2023 +0200 Commit: Fabiano Fidêncio <fabiano.fidencio@intel.com> CommitDate: Fri Nov 10 12:58:20 2023 +0100 runtime: confidential: Do not set the max_vcpu to cpu We don't have to do this since we're relying on the `static_sandbox_resource_mgmt` feature, which gives us the correct amount of memory and CPUs to be allocated. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> ``` This commit was removing a requirement that was made previously, but due to the SMP issue we're facing with the QEMU used for TDX (see commit d1b54ede290e95762099fff4e0bcdad10f816126), QEMU will fail to start due to: ``` Invalid CPU topology: product of the hierarchy must match maxcpus: sockets (1) dies (1) * cores (1) * threads (1) != maxcpus (240)" ``` This has no effect on the SEV / SNP workflow and hopefully we'll be able to re-revert this soon enough, when this gets solved on the QEMU side. Last but not least, this is not a "clean" revert as we're using conf.NumVCPUs() instead of conf.NumVCPUs, to ensure we're dealing with uint32. Fixes: #8532 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-30 00:41:27 +01:00
Fabiano Fidêncio	1284b4e80d	tools: Stop building / shipping log-parser-rs This is a commit that's a pre-req for #6826, as that PR will merge log-parser-rs into kata-ctl, but that will result in a CI breakage. So, let's deal with the CI changes here, thanks to GHA and our favourite `pull_request_target` event, unblocking that PR to be merged. Fixes: #6797 (not really, but related). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-30 00:32:10 +01:00
Gabriela Cervantes	37633d3cc2	metrics: Fix iperf parallel bandwidth limit This PR fixes the iperf parallel bandwidth limit for the kata metrics CI. Fixes #8530 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-29 19:59:45 +00:00
Dan Mihai	96deea52f2	tests: more k8s-exec-rejected debug output Print more information useful for debugging. Also, use a separate YAML file for this test, instead of reusing someone else's file. Fixes: #8270 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-11-29 18:05:15 +00:00
stevenhorsman	47b8c3181f	runtime: remote hypervisor updates to ttrpc - Update the remote hypervisor code to match the re-genned code for the ttrpc Hypervisor Service Fixes: #8519 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-11-29 18:04:40 +00:00
stevenhorsman	613c75ba8c	runtime: Update hypervisor generated code Update to use ttrpc_out instead of grpc_out Fixes: #8519 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-11-29 18:04:40 +00:00
GabyCT	1f1e5377e5	Merge pull request #8497 from GabyCT/topic/removemetricsstratovirt gha: Disable stratovirt for gha metrics	2023-11-29 11:16:53 -06:00
Fabiano Fidêncio	8fd39d11c4	tests: Adapt `enable_hypervisor` to the runtime-rs config location change As the configuration files for the runtime-rs based drivers are now placed in a different location than the golang ones, we should adapt this script accordingly. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-29 14:51:35 +01:00
Fabiano Fidêncio	38183acbcb	tests: Use `kata-ctl` instead of `kata-runtime` for runtime-rs `kata-ctl` is the tool for runtime-rs, and it should be used instead of `kata-runtime`. `kata-ctl` requires sudo, and that's the reason it's also been added as part of the calls. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-29 14:51:35 +01:00
Fabiano Fidêncio	a5a73a11cb	tests: Replace `kata-runtime kata-env` by `kata-runtime env` `kata-runtime env` is an alias for `kata-runtime kata-env`, and calling it with the `env` parameter allows us to easily extend the scripts to use `kata-ctl` instead of `kata-runtime` when dealing with runtime-rs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-29 14:51:31 +01:00
Chelsea Mafrica	05efb23261	tests: update go.mod and go.sum Generate a go.sum file for tests. Fixes #8187 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-11-28 17:40:41 -08:00
Fabiano Fidêncio	30acb5a0c0	tests: nydus: Adapt the default config file for runtime-rs based drivers As we've done some changes in the runtime-rs based drivers to install their configuration into a different location, this should also be reflected as part of this test. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-28 20:37:59 +01:00
Chelsea Mafrica	6d9cb9325d	tests: update scripts for static checks migration Updates to scripts for static-checks.sh functionality, including common functions location, the move of several common functions to the existing common.bash, adding hadolint and xurls to the versions file, and changes to static checks for running in the main kata containers repo. The changes to the vendor check include searching for existing go.mod files but no other changes to expand the test. Fixes #8187 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-11-28 11:13:55 -08:00
Chelsea Mafrica	66f3944b52	tests: move github-labels to main repo Move tool as part of static checks migration. Fixes #8187 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com> Signed-off-by: Derek Lee <derlee@redhat.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Signed-off-by: Graham Whaley <graham.whaley@intel.com> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Marco Vedovati <mvedovati@suse.com> Signed-off-by: Peng Tao <bergwolf@hyper.sh> Signed-off-by: Shiming Zhang <wzshiming@foxmail.com> Signed-off-by: Snir Sheriber <ssheribe@redhat.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:13:55 -08:00
Chelsea Mafrica	7f3c12f1dd	tests: move spell check tool to main repo Move tool as part of static checks migration. Fixes #8187 Signed-off-by: Bo Chen <chen.bo@intel.com> Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com> Signed-off-by: Dan Middleton <dan.middleton@intel.com> Signed-off-by: Derek Lee <derlee@redhat.com> Signed-off-by: Eric Ernst <eric.ernst@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Signed-off-by: Graham Whaley <graham.whaley@intel.com> Signed-off-by: Hui Zhu <teawater@antfin.com> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Jimmy Xu <xjmmyshcn@gmail.com> Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com> Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com> Signed-off-by: Shiming Zhang <wzshiming@foxmail.com> Signed-off-by: Snir Sheriber <ssheribe@redhat.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:13:55 -08:00
Chelsea Mafrica	8ad433d4ad	tests: move markdown check tool to main repo Move the tool as a dependency for static checks migration. Fixes #8187 Signed-off-by: Bin Liu <bin@hyper.sh> Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Julio Montes <julio.montes@intel.com>	2023-11-28 11:13:55 -08:00
Chelsea Mafrica	eaa6b1b274	tests: move static checks and dependencies from tests Move static checks scripts and dependencies from tests to kata-containers repo. Fixes #8187 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com> Signed-off-by: Archana Shinde <archana.m.shinde@intel.com> Signed-off-by: Bin Liu <bin@hyper.sh> Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com> Signed-off-by: Dan Middleton <dan.middleton@intel.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Derek Lee <derlee@redhat.com> Signed-off-by: Dov Murik <dovmurik@linux.ibm.com> Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com> Signed-off-by: Graham Whaley <graham.whaley@intel.com> Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com> Signed-off-by: Jon Olson <jonolson@google.com> Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com> Signed-off-by: Julio Montes <julio.montes@intel.com> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com> Signed-off-by: Marco Vedovati <mvedovati@suse.com> Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com> Signed-off-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com> Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: Shiming Zhang <wzshiming@foxmail.com> Signed-off-by: Snir Sheriber <ssheribe@redhat.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com> Signed-off-by: Xu Wang <xu@hyper.sh> Signed-off-by: Yang Bo <bo@hyper.sh> Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-11-28 11:13:55 -08:00
Fabiano Fidêncio	61aa84b158	Revert "tests: k8s: Allow passing rust-runtime env var to kata-deploy" This reverts commit `44899d4cdf`, as we've decided to keep both golang and rust runtime installable and usable at the same time. The decision of having both runtimes installable and usable will help users to test and easily catch any possible differences between those runtimes, helping us to get on par with both implementations. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-28 18:02:07 +01:00
James O. D. Hunt	158ca17ae7	kata-deploy: Add cloud-hypervisor Now that we have a separate Cloud Hypervisor configuration file for the rust runtime, add it to the kata-deploy. See: https://github.com/kata-containers/kata-containers/pull/8250 Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-28 18:02:06 +01:00
Fabiano Fidêncio	d4e00238ab	kata-deploy: Improve the logic for linking to the rust runtime This change for now doesn't do much, apart from making it easier to expand which runtimes should be linked to the runtime-rs containerd shim binary. Also, this matches the logic used for the config files. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-28 18:01:27 +01:00
James O. D. Hunt	fc28deee0e	kata-deploy: Use rust runtime config files in runtime-rs directory Update `kata-deploy` to modify the rust runtime configuration files in their new `runtime-rs/` directory. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-28 18:01:25 +01:00
Gabriela Cervantes	9166d0aabb	docs: Update iperf3 network documentation This PR updates the iperf3 network documentation to include the parallel bandwidth. Fixes #8523 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-28 15:59:38 +00:00
Wainer dos Santos Moschetta	48bdca4c49	tests/k8s: add k8s-measured-rootfs.bats Implements the following test case: Scenario: Check incorrect hash fails Given I have a version of kata installed that has a kernel with the initramfs built and config with rootfs_verity.scheme=dm-verity rootfs_verity.hash=<incorrect hash of rootfs> set in the kernel_params When I try and create a container a basic pod Then The pod doesn't run And Ideally we'd get a helpful message to indicate why Currently on CI only qemu-tdx is built with measured rootfs support in the kernel, so the test is restricted to that runtimeclass. Fixes #7415 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:54 -03:00
Wainer dos Santos Moschetta	1eae657b91	tests/k8s: add set_node() to lib.sh Use this new function to set the node where the pod should be scheduled to. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	c6075c8627	tests/k8s: add setup common Bring the setup_common() from CCv0 branch test's integration/kubernetes/confidential/tests_common.sh. It should be used to reduce boilerplate in the setup() of the tests. Unlike the original code, this won't export the `test_start_time` variable as it wouldn't be accurate to grab logs from the worker nodes due to a date/time mismatch between the machine running the tests and the worker node. The function exports the `node` variable which holds the name of a random node which has kata installed. Apart from that, it exports the `node_start_time` which captures the date/time when the test started, relative to the `node`. Tests that should inspect the logs can schedule pods/resources to the `node` and use `node_start_time` as the reference value to grep the logs. Fixes #7590 Co-authored-by: stevenhorsman <steven@uk.ibm.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	220a2d9a15	tests/k8s: add assert_logs_contain() to lib.sh Bring the assert_logs_contain() from CCv0 branch tests' integration/kubernetes/confidential/lib.sh. Introduced the print_node_journal() which uses `kubectl debug` to print the systemd's journal of a k8s's node. Fixes #7590 Co-authored-by: stevenhorsman <steven@uk.ibm.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	9a9c7a5c6f	tests/k8s: add set_metadata_annotation() to lib.sh This new function allows adding annotations to the metadata section of a yaml configuration file. Co-authored-by: Ryan Savino <ryan.savino@amd.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	a13eecf7f3	runtime(-rs): add clean-generated-files target The new clean-generated-files make target allows for removing the generated files (including the configuration.toml files). The tools/packaging/static-build/shim-v2/build.sh script now uses that target to always force the re-generation of those files. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	36ea1b8ee7	tests/k8s: add new_pod_config() to lib.sh Copied the new_pod_config() and pod-config.yaml.in from CCv0 branch tests' integration/kubernetes/confidential/tests_common.sh and fixtures. Unlike the original version, new_pod_config() now gets the runtimeclass by parameter as the RUNTIMECLASS environment variable seems not broadly used on main branch's CI. The pod-config.yaml.in was changed as the diff shows below. In particular the imagePullSecrets was removed to avoid it throwing a warning on the pod's log. ``` --- a/tests/integration/kubernetes/runtimeclass_workloads/pod-config.yaml.in +++ b/tests/integration/kubernetes/runtimeclass_workloads/pod-config.yaml.in @@ -5,12 +5,10 @@ apiVersion: v1 kind: Pod metadata: - name: busybox-cc + name: test-e2e spec: runtimeClassName: $RUNTIMECLASS containers: - - name: nginx + - name: test_container image: $IMAGE - imagePullPolicy: Always - imagePullSecrets: - - name: cococred \ No newline at end of file + imagePullPolicy: Always \ No newline at end of file ``` Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com> Co-authored-by: Megan Wright <Megan.Wright@ibm.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	428daf9ebc	tests/k8s: add utilities functions for the tests The following functions were copied from CCv0's branch test's integration/kubernetes/confidential/lib.sh. I made only small refactorings (shortened their names and delinted shellcheck warnings): - k8s_delete_all_pods_if_any_exists() - k8s_wait_pod_be_ready() - k8s_create_pod() - assert_pod_fail() Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com> Co-authored-by: Jordan Jackson <jordan.jackson@ibm.com> Co-authored-by: Megan Wright <Megan.Wright@ibm.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com> Co-authored-by: Wang, Arron <arron.wang@intel.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	ba4f806c30	initramfs: re-wrote devices checking on init.sh Re-wrote the logic of init.sh to follow the rules: * the root device MUST exist always because it will be either mounted or verified (then mounted) * if rootfs verifier is enabled then the hash device MUST exist. Avoid the case where dm-verity is set but the hash device does not exist and so the verification is silently skipped Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	72ef82368c	shim-v2: ensure root hash exist when measured rootfs When measured rootfs is enabled, the shim-v2 build should find the guest rootfs hash file; otherwise it might (silently) generate configuration files with an empty hash. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	1465e58854	kernel: ensure initramfs exist when measured rootfs The KATA_BUILD_CC variable plus the existence (or not) of the initramfs were used to determine whether to build the kernel for measured rootfs or not. Currently the MEASURED_ROOTFS variable is used to trigger the feature build, and when it is activated the build should expect the initramfs to exist. In other words, this changes the kernel build so that if `MEASURED_ROOTFS=yes` then the initramfs file must exist and be found. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	4dbba5215f	shim-v2: moved measured rootfs logic to its builder Moved the measured rootfs logic from kata-deploy-binaries.sh to the shim-v2's builder script so that the former gets less bloated with component-specific code. Fixes #6674 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	34be78df19	kernel: moved measured rootfs logic to its builder Moved the measured rootfs logic from kata-deploy-binaries.sh to the kernel's builder script so that the former gets less bloated with component-specific code. Fixes #6674 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	3f16d29593	kernel: measured rootfs as argument to build-kernel.sh By convention the caller of tools/packaging/kernel/build-kernel.sh changes the script behavior by passing arguments, whereas for measured rootfs it has used an environment variable (MEASURED_ROOTFS). This refactors the script so that the caller must now pass the "-m" argument to enable the build of the kernel with measured rootfs support. Fixes #6674 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:51 -03:00
Fabiano Fidêncio	80860478bf	runtime-rs: Remove the golang config paths As the configuration files are different, we can safely remove those as any new installation of the binary should also bring in the new configurations. This makes things less error-prone in the future, as we're ensuring that the rust runtime will only be reading the rust configuration files. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-28 15:16:53 +01:00
James O. D. Hunt	b86ab5aa21	runtime-rs: Update list of config paths to check Update the `DEFAULT_RUNTIME_CONFIGURATIONS` list to include a number of rust runtime specific paths to try to load before checking the "traditional" (golang) runtime configuration paths. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-28 15:16:53 +01:00
James O. D. Hunt	89ef464b7c	build: Install rust config files to runtime-rs directory Install the rust runtime configuration files to a `runtime-rs/` directory to distinguish them from the golang config files (which may have a different syntax). The default values mean that the rust config files are now installed to `/opt/kata/share/defaults/kata-containers/runtime-rs/` rather than `/opt/kata/share/defaults/kata-containers/`. See: https://github.com/kata-containers/kata-containers/issues/6020 Fixes: #8444. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-28 15:16:53 +01:00
alex.lyn	fe68f25bea	runtime-rs: enhancement of vfio volume. Reimplement vfio volume into direct_volume and do alignment of rawblock/spdk volume. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-28 10:08:05 +08:00
alex.lyn	e3fd403126	runtime-rs: enhancement of spdk volume. (1) Add enum DirectVolumeType for direct volumes. (2) Reimplement spdk volume into direct_volume and do alignment of rawblock volume. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-28 10:08:05 +08:00
alex.lyn	f973729029	runtime-rs: Enhancing DirectVolMount Handling for current Infra. The current infra (K8S, CSI, CRI, Containerd) for Kata containers is unable to properly handle direct volumes, resulting in the need for workarounds like searching/comparison and then patching up the volume type. In this commit, the handling method is reimplemented to support raw block volumes whose backend may be a rawdisk or a file in another format. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-28 10:08:05 +08:00
alex.lyn	e3becea566	runtime-rs: add support kata/multi-containers sharing one vfio volume. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-28 10:07:23 +08:00
Steve Horsman	891f488ee3	Merge pull request #8501 from Amulyam24/containerd-tests gha: add cri-containerd workflow for ppc64le	2023-11-27 17:22:59 +00:00
James O. D. Hunt	45cc417a4e	Merge pull request #8461 from jodh-intel/update-codeowners CODEOWNERS: Expand scope	2023-11-27 15:38:39 +00:00
Fabiano Fidêncio	bb4c51a5e0	Merge pull request #8494 from ChengyuZhu6/kata_virtual_volume runtime: Pass `KataVirtualVolume` to the guest as devices in go runtime	2023-11-27 16:02:28 +01:00
Steve Horsman	bee6fba5c7	Merge pull request #8459 from Amulyam24/workflow-1 github: add workflows for building and publishing kata artefacts on ppc64le	2023-11-27 14:31:20 +00:00
Amulyam24	754aec02c3	gha: add cri-containerd workflow for ppc64le This PR adds workflow to run containerd tests on Power as a part of CI migration. Fixes: #8500 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-11-27 17:58:58 +05:30
alex.lyn	6af0592274	runtime-rs: Add vsock device in device manager. (1) Implement Device Trait for vsock device. (2) add vsock device in device manager. Fixes: #8474 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-27 15:23:18 +08:00
alex.lyn	1a6b45d3b7	runtime-rs: Reintroduce Vsock and add it to the DeviceType enum As the vsock device will be used in Qemu or other VMMs, Vsock is reintroduced to the DeviceType enum. Fixes: #8474 Signed-off-by: Pavel Mores <pmores@redhat.com> Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-27 15:12:44 +08:00
alex.lyn	e31dbc94a5	runtime-rs: remove vhost_fd from VsockConfig and make it cloneable. It is currently difficult to clone VsockConfig because the vhost fd is managed implicitly within runtime-rs. This responsibility should be delegated to the VMM (especially QEMU) child process, as it is not a core responsibility of runtime-rs. We'll remove the member vhost_fd from VsockConfig and make the VsockConfig/VsockDevice Cloneable. Fixes: #8474 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-27 15:11:21 +08:00
alex.lyn	eb90962b27	runtime-rs: introduce a new function generate_vhost_vsock_cid. Introduce a new function generate_vhost_vsock_cid to generate a guest CID and set the guest CID for the vsock fd. This commit introduces no functional change; the code is just split out of the previous VsockDevice::new(). Fixes: #8474 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-27 15:06:58 +08:00
alex.lyn	b952c5c5ce	runtime-rs: add support kata/multi-containers sharing one spdk volume. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-25 21:13:03 +08:00
alex.lyn	17d2d465d1	runtime-rs: re-organize the volumes with adding new direct_volumes. Add a new direct_volumes directory containing the spdk, rawblock and vfio volumes. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-25 21:04:55 +08:00
alex.lyn	6731466b13	runtime-rs: set a standard NotFound when direct volume path not found. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-25 19:51:12 +08:00
alex.lyn	d23867273f	runtime-rs: split the block volume into block and rawblock volume (1) rawblock volume is directvol mount type. (2) block volume is based on the bind mount type. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-24 23:30:30 +08:00
Amulyam24	ae2c0c5696	github: add workflows for building and publishing kata artifacts on ppc64le Adds workflows for building kata static tarball and releasing it. Fixes: #8458 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-11-24 15:53:38 +05:30
ChengyuZhu6	5318afe273	runtime: support to create VirtualVolume rootfs storages 1) Creating storage for all `io.katacontainers.volume=` messages in rootFs.Options, and then aggregates all storages into `containerStorages`. 2) Creating storage for other data volumes and push them into `volumeStorages`. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-11-23 23:22:55 +08:00
ChengyuZhu6	0b4f7c2ee7	runtime: redefine and add functions to handle VirtualVolume to storage 1) Extract function `handleBlockVolume` to create Storage only. 2) Add functions to handle KataVirtualVolume device and construct corresponding storages. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-11-23 23:07:32 +08:00
ChengyuZhu6	bd099fbda9	runtime: extend SharedFile to support multiple storage devices To enhance the construction and administration of `Katavirtualvolume` storages, this commit expands the 'sharedFile' structure to manage both rootfs storages(`containerStorages`) including `Katavirtualvolume` and other data volumes storages(`volumeStorages`). NOTE: `volumeStorages` is intended for future extensions to support Kubernetes data volumes. Currently, `KataVirtualVolume` is exclusively employed for container rootfs, hence only `containerStorages` is actively utilized. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-11-23 23:05:14 +08:00
ChengyuZhu6	e4f33ac141	runtime: add functions to create devices in KataVirtualVolume The snapshotter will place `KataVirtualVolume` information into 'rootfs.options' and commence with the prefix 'io.katacontainers.volume='. The purpose of this commit is to transform the encapsulated KataVirtualVolume data into device information. Fixes: #8495 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Feng Wang <feng.wang@databricks.com> Co-authored-by: Samuel Ortiz <sameo@linux.intel.com> Co-authored-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-11-23 23:05:13 +08:00
Dan Mihai	756022787c	Merge pull request #8239 from Sumynwa/sumsharma/fix_configmap_update_propagation runtime: Fix configmap/secrets updates with FS sharing disabled	2023-11-23 06:50:53 -08:00
Chelsea Mafrica	98aa291c9e	runtime-rs: Add Hybrid VSOCK device handling for CH Update cloud hypervisor implementation to allow hybrid vsock device to be handled. Fixes #6692 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-11-22 14:42:09 -08:00
Gabriela Cervantes	8839ca93ba	gha: Disable stratovirt for gha metrics This PR disables the stratovirt for gha metrics. Fixes #8496 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-22 16:17:31 +00:00
briwan01	231b9dfd9d	runtime-rs/clh: Fix unable to boot container In the case of Cloud Hypervisor running on arm64 architecture, only arm AMBA UART (pl011) is supported as the TTY. Consequently, when enabling Hypervisor debug mode, it's essential to configure the console as "ttyAMA0" rather than "ttyS0". Fixes: #8381 Signed-off-by: briwan01 <brian.wang@arm.com>	2023-11-22 17:52:11 +08:00
GabyCT	358f32e8bb	Merge pull request #8467 from GabyCT/topic/fixresult metrics: Fix result finding in tensorflow benchmark	2023-11-21 13:41:46 -06:00
Fabiano Fidêncio	45a41c3431	Merge pull request #8481 from ChengyuZhu6/guest-kernel kernel: backport erofs patch to 6.1.52 guest kernel	2023-11-21 12:22:24 +01:00
Fabiano Fidêncio	8425c78c91	Merge pull request #8476 from fidencio/topic/gha-pass-rust-runtime-to-kata-deploy tests: k8s: Allow passing rust-runtime env var to kata-deploy	2023-11-21 11:09:01 +01:00
Chao Wu	6a6c3c53b5	Merge pull request #8450 from adamqqqplay/vhost-user-general dragonball: add vhost-user connection management logic	2023-11-21 16:05:17 +08:00
ChengyuZhu6	6de01eacfd	kernel: backport erofs patch to 6.1.52 guest kernel Backport the erofs patch from linux kernel to solve the error #8083 Fixes: #8083 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-11-21 15:22:40 +08:00
Amulyam24	d8a8cc4491	tools: install oras from source on ppc64le Since the release is not yet out for ppc64le, build oras from source and use it. Fixes: #8458 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-11-21 11:38:20 +05:30
Amulyam24	08f3603123	tools: fix static build of qemu and shimv2 on ppc64le - statically linked qemu requires slof.bin to run, hence remove it from blacklist - By default, initrd is used for Power, modify the configuration.toml accordingly Fixes: #8458 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-11-21 11:38:20 +05:30
Alex.Lyn	4fd2914a33	Merge pull request #7932 from Apokleos/wrap-virtiofs-in-dm runtime-rs: bringing virtio-fs device in device-manager	2023-11-21 13:48:15 +08:00
Huang Jianan	a9571398a6	dragonball: add test utils for vhost-user The test utils will be used by the upcoming feature tests: vhost-user-net, vhost-user-blk and vhost-user-fs. Signed-off-by: Beiyue <beiyue@linux.alibaba.com> Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com>	2023-11-21 09:51:56 +08:00
Qinqi Qu	a6a399d5bc	dragonball: add vhost-user connection management logic The vhost-user connection management logic will be used by the upcoming features: vhost-user-net, vhost-user-blk and vhost-user-fs. Fixes: #8448 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Qinqi Qu <quqinqi@linux.alibaba.com> Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com>	2023-11-21 09:51:48 +08:00
Fabiano Fidêncio	9445a967b6	Merge pull request #8471 from ChengyuZhu6/kata-virtual-volume runtime: Introduce `KataVirtualVolume` structure into go runtime	2023-11-20 21:58:27 +01:00
Fabiano Fidêncio	8002de895a	Merge pull request #8439 from fidencio/topic/kata-manager-install-a-given-kata-tarball utils: kata-manager: Allow installing kata from a given tarball	2023-11-20 20:02:25 +01:00
Wainer Moschetta	728565d1e4	Merge pull request #7046 from stevenhorsman/remote-hypervisor-cherry-picks CC: Remote hypervisor merge to main	2023-11-20 15:22:37 -03:00
Chao Wu	5ee8829700	Merge pull request #8451 from openanolis/chao/pci	2023-11-21 00:29:22 +08:00
Fabiano Fidêncio	41f3f6f93e	Merge pull request #8465 from justxuewei/rename-virtio dragonball: Uniform the spelling of Virtio	2023-11-20 16:31:33 +01:00
Hyounggyu Choi	506b127df8	Merge pull request #8478 from BbolroC/set-default-allowed_hypervisor_annotations kata-deploy: Set a default value for ALLOWED_HYPERVISOR_ANNOTATIONS	2023-11-20 15:39:56 +01:00
alex.lyn	fe62e656a7	runtime-rs: Name the ShareFs Mount Option type more accurately Fixes: #7915 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-20 20:05:50 +08:00
alex.lyn	856315ff87	runtime-rs: bringing virtio-fs device in device-manager It mainly focuses on two parts: (1) redesign the ShareFsConfig with ShareFsMountConfig The device mount operation must depend on the fact that sharefs device exists, so re-design the structure of ShareFsConfig and move the ShareFsMountConfig into it with Option type, which describes the relation between ShareFsConfig and ShareFsMountConfig. (2) move virtiofs into device manager Currently, virtio-fs is still outside of the device manager. To enhance the device manager, bring the virtio-fs device into device-manager for unified management Fixes: #7915 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-20 20:04:47 +08:00
Chao Wu	b3318e59eb	Merge pull request #8332 from Apokleos/bugfix-directvol-multicontainers runtime-rs/bugfix: kata pod with multi-containers sharing one direct volume	2023-11-20 19:37:58 +08:00
Hyounggyu Choi	c489f1f504	kata-deploy: Set a default value for ALLOWED_HYPERVISOR_ANNOTATIONS As a follow-up PR for #8404, this is to set a default value for an environment variable `ALLOWED_HYPERVISOR_ANNOTATIONS`. This will prevent a pod launching without an explicit configuration for the variable from getting into a `CrashLoop` state. Fixes: #8477 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-11-20 12:33:34 +01:00
Chao Wu	ee55897827	fmt: refactor in pci & balloon 1. merge hashmap get logic according to Xuewei suggestion. 2. do cargo fmt Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-11-20 17:53:51 +08:00
Chao Wu	baf3db9e6e	Dragonball: add PCI bus and PCI interrupt support in mptable Spec In order to support PCI VFIO functionality in Dragonball, we should first add PCI bus and PCI device Interrupt information in Dragonball mptable setup process. This patch adds: 1. pci_legacy_irqs transferred to setup_mptable function. 2. pci bus support in mptable mem 3. pci interrupt support in mptable mem fixes: #8449 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-11-20 17:53:51 +08:00
Xuewei Niu	c305634b4e	dragonball: Uniform the spelling of Virtio The changes are: - VirtIoError -> VirtioError - VirtIoResult -> VirtioResult - VirtIoDevice -> VirtioDevice Fixes: #8464 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-20 17:00:58 +08:00
Fabiano Fidêncio	44899d4cdf	tests: k8s: Allow passing rust-runtime env var to kata-deploy This will be used for selecting the correct runtimes and runtimeclasses to be deployed with kata-deploy. Fixes: #8475 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-20 09:13:05 +01:00
ChengyuZhu6	1353b14e6c	runtime: Add KataVirtualVolume struct in runtime Add the corresponding data structure in the runtime part according to kata-containers/kata-containers/pull/7698. Fixes: #8472 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-11-19 13:30:32 +08:00
Greg Kurz	110574353d	Merge pull request #8345 from beraldoleal/issues/8343 Fixes make check errors	2023-11-17 17:38:29 +01:00
Gabriela Cervantes	37916e7a58	metrics: Fix result finding This PR fixes the result finding for the general throughput for the tensorflow benchmark. Fixes #8466 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-17 15:59:51 +00:00
stevenhorsman	ebf9d2725a	kata-deploy: Add remote shim - Add remote to the list of shims in kata-deploy and kata-cleanup Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-11-17 13:38:49 +00:00
Fabiano Fidêncio	d5cf169adf	kata-deploy: Add missing kata-remote runtimeclass It's CCv0 specific for now, and it's needed as the Operator is now delegating the runtimeclass creation to the kata-deploy daemonset. Fixes: #7550 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `2df6cb7609`)	2023-11-17 13:34:40 +00:00
Pradipta Banerjee	39e8c84269	runtime: Add support for key annotations to remote hyp In order to support different pod VM instance type via remote hypervisor implementation (cloud-api-adaptor), we need to pass machine_type, default_vcpus and default_memory annotations to cloud-api-adaptor. The cloud-api-adaptor then uses these annotations to spin up the appropriate cloud instance. Reference PR for cloud-api-adaptor https://github.com/confidential-containers/cloud-api-adaptor/pull/1088 Fixes: #7140 Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com> (based on commit `004f07f076`)	2023-11-17 13:33:27 +00:00
Yohei Ueda	2910e333a8	runtime: Use static resource in remote hypervisor This patch updates the template configuration file for the remote hypervisor to set static_sandbox_resource_mgmt to be true. The remote hypervisor uses the peer pod config to determine the sandbox size, so requires this to be set to true by default. Fixes: #6616 Signed-off-by: Yohei Ueda <yohei@jp.ibm.com> (based on commit `938447803b`)	2023-11-17 13:33:27 +00:00
stevenhorsman	26d56678a9	config: Add initial remote hypervisor config - Remote hypervisor template config - Add annotation enablement for machine_type, default_memory and default_vcpus for flexible instance types Fixes: #6349 Signed-off-by: stevenhorsman <steven@uk.ibm.com> (based on commits `7c9a791d67` and `335a456425`)	2023-11-17 13:33:24 +00:00
stevenhorsman	ad63439a3e	runtime: Update the remote hypervisor config Add the SELinux setting to ensure it is passed through to the remote hypervisor Fixes: #5936 Signed-off-by: stevenhorsman <steven@uk.ibm.com> (based on commit `3ef2fd1784`)	2023-11-17 13:32:52 +00:00
Lei Li	50e0d43dad	runtime: Support privileged containers in peer pod VM This patch fixes the issue of running containers with privileged as true. See the discussion at this URL for the details. https://github.com/confidential-containers/cloud-api-adaptor/issues/111 Signed-off-by: Lei Li <cdlleili@cn.ibm.com> (based on commit `c3e6b66051`)	2023-11-17 13:32:52 +00:00
Yohei Ueda	57d4dd8e57	runtime: Support the remote hypervisor type This patch adds the support of the remote hypervisor type. Shim opens a Unix domain socket specified in the config file, and sends TTRPC requests to an external process to control sandbox VMs. Fixes #4482 Co-authored-by: Pradipta Banerjee <pradipta.banerjee@gmail.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Signed-off-by: Yohei Ueda <yohei@jp.ibm.com> (based on commit `f9278f22c3`)	2023-11-17 13:32:49 +00:00
Yohei Ueda	8ac9a22097	runtime: Add hypervisor proto to support peer pod VMs This patch adds a protobuf definition of the remote hypervisor type. Signed-off-by: Yohei Ueda <yohei@jp.ibm.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> (based on commit `150e8aba6d`)	2023-11-17 13:31:09 +00:00
Fabiano Fidêncio	f8322ffad2	Merge pull request #7796 from WenyuanLau/7794/StratoVirt_VMM_support StratoVirt: add support for a lightweight VMM StratoVirt in Kata	2023-11-17 10:53:17 +01:00
Fabiano Fidêncio	d6d9b45007	Merge pull request #7931 from BbolroC/migrate-to-gha-s390x tests|gha: add containerd and k8s tests for s390x	2023-11-17 10:24:14 +01:00
Sumedh Alok Sharma	4aaf54bdad	runtime: Fix configmap/secrets update propagation with FS sharing disabled This PR fixes update propagation for k8s configmaps, secrets, etc. when filesystem sharing is disabled. The commit introduces the changes below, with some limitations: - creates new timestamped directory in guest - updates the '..data' symlink - creates user visible symlinks to newly created secrets. - Limitation: The older timestamped directory and stale user visible symlinks exist in guest due to missing DELETE api in agent. Fixes: #7398 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2023-11-17 13:01:23 +05:30
Hyounggyu Choi	0c7aa1f307	gha: Set nightly test for s390x to 5 UTC This is to push back the time for the s390x nightly test to 5 a.m. UTC. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-11-17 05:47:44 +01:00
Hyounggyu Choi	ffe1ea52cf	tests|gha: add containerd and k8s tests for s390x As part of the CI migration, this PR is to add workflows for containerd and k8s for s390x. Fixes: #7930 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-11-16 18:14:26 +01:00
GabyCT	8586308dcd	Merge pull request #8453 from GabyCT/topic/udpreadme metrics: Add iperf udp information to README	2023-11-16 10:38:56 -06:00
GabyCT	494174a98e	Merge pull request #8421 from GabyCT/topic/enablestressng tests: Enable stressng scalability test	2023-11-16 10:25:05 -06:00
James O. D. Hunt	4a4fc9c648	CODEOWNERS: Expand scope Improve the `CODEOWNERS` file by specifying more groups. Since GitHub automatically checks the `CODEOWNERS` file when a PR is created and adds all matching groups as reviewers for the PR, this may help reduce the PR backlog since the right people will be alerted and requested to review the PR. That should improve the quality of reviews (and thus the quality of the landed code). It may also have a positive effect on PR velocity. Note: This PR combines the other `CODEOWNERS` files so we have a single, visible, top-level file. See: https://github.com/kata-containers/community/issues/253 Fixes: #3804. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-16 16:09:20 +00:00
Fabiano Fidêncio	10996f3bbb	Merge pull request #8460 from ldoktor/artifacts gha: Keep kata tarballs for 15 days	2023-11-16 13:56:25 +01:00
Liu Wenyuan	c77e990c3e	tests: Enable tests for StratoVirt hypervisor This commit enables StratoVirt hypervisor to be tested in kata GHA, including k8s, metrics, cri-containerd, nydus and so on. Meanwhile, adding some unit tests for StratoVirt to make sure it works. Fixes: #7794 Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2023-11-16 20:47:26 +08:00
Liu Wenyuan	14d8790d83	kata-deploy: Add StratoVirt support to deploy process Allow kata-deploy process to pull StratoVirt from release binaries, and add them as a part of kata release. Fixes: #7794 Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2023-11-16 20:47:26 +08:00
Liu Wenyuan	9542211e71	configuration: add configuration for StratoVirt hypervisor. Add configuration-stratovirt.toml.in to generate the StratoVirt configuration, and parser to deliver config to StratoVirt. Fixes: #7794 Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2023-11-16 20:47:26 +08:00
Liu Wenyuan	561c85be54	build: Makefile for StratoVirt hypervisor Add support for building StratoVirt hypervisor, including x86_64 and arm64. Fixes: #7794 Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2023-11-16 20:47:26 +08:00
Liu Wenyuan	26966c8469	virtcontainers: Add StratoVirt as a supported hypervisor Initial support of the MicroVM machine type of StratoVirt hypervisor for the kata go runtime. Fixes: #7794 Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2023-11-16 20:47:24 +08:00
Fabiano Fidêncio	edb791315e	Merge pull request #7987 from BbolroC/nightly-ci-s390x tests|gha: add nightly tests for s390x	2023-11-16 11:45:32 +01:00
Lukáš Doktor	8959e3ca05	gha: Keep kata tarballs for 15 days these tarballs are useful for debugging and re-running jobs, keep them for 15 days. Fixes: #8000 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2023-11-16 10:35:20 +01:00
Gabriela Cervantes	9cc6908b09	stability: Update stressng to run on the gha This PR updates the stressng test to run on the gha for kata CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-15 19:34:36 +00:00
Gabriela Cervantes	9d8eb298c3	metrics: Add iperf udp information to README This PR adds the iperf udp information to the network README for the kata metrics CI. Fixes #8452 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-15 15:22:06 +00:00
Gabriela Cervantes	4b7854b668	stability: Add missing dependencies This PR adds missing dependencies to run stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-15 14:51:14 +00:00
Gabriela Cervantes	79177bb9cb	tests: Enable stressng scalability test This PR enables the stressng scalability test for kata CI. Fixes #8420 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-15 14:51:14 +00:00
Xuewei Niu	f18794d880	Merge pull request #8426 from justxuewei/vhost-rm-virtio-net dragonball: Remove vhost-net dependency on virtio-net	2023-11-15 10:39:27 +08:00
alex.lyn	ba632ba825	runtime-rs: kata with multi-containers sharing one direct volume When multiple containers in a kata pod share one direct volume, it's important to make sure that the corresponding block device is only mounted once in the guest. This means that there should be only one mount entry for the device in the mount information. Fixes: #8328 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-15 10:37:01 +08:00
alex.lyn	d7594d830c	runtime-rs: correct the path from cid to device_id. When a direct volume is used by multiple containers in Kata, Generating many shared paths with cids will cause IO error as the result of one direct volume mounts more than once. To correct it, use the device_id instead of cid which ensures that the guest only mounts the FS once. Fixes: #8328 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-15 10:30:39 +08:00
Fabiano Fidêncio	906f6b7380	Merge pull request #8431 from UiPath/fix-vsock-packets-drop kernel: Fix vsock packets drop when the driver initializes	2023-11-14 18:52:53 +01:00
Fabiano Fidêncio	1699b84f13	utils: kata-manager: Remove $enable_debug from the install_kata call This was added as part of `d4d65bed38`, but install_kata has never actually used the passed enable_debug var. With this in mind, let's just remove it. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-14 17:34:03 +01:00
Fabiano Fidêncio	38d2edd83b	utils: kata-manager: Allow installing kata from a given tarball With this change, we give the users the chance to try kata-containers with their own pre-built tarball. This will become very useful in the CI context, as we won't be downloading a specific version of kata-containers, but rather installing whatever was built in previous steps of the CI pipeline. Fixes: #8438 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-14 17:34:01 +01:00
Fabiano Fidêncio	fd9b6d6837	Merge pull request #7623 from fidencio/topic/runtime-improve-vcpu-allocation-on-host-side runtime: Improve vCPU allocation for the VMMs	2023-11-14 14:10:54 +01:00
Alexandru Matei	bfd1ce30e1	kernel: Fix vsock packets drop when the vsock driver starts The virtio vsock driver has a small window during initialization where it can silently drop replies to connection requests. Because no reply is sent, kata waits for 10 seconds and in the end it generates a connection timeout error in HybridVSockDialer. Fixes: #8291 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2023-11-14 11:02:52 +02:00
Xuewei Niu	49c2e6e23c	dragonball: Remove vhost-net dependency on virtio-net This patch removes the vhost-net dependency on virtio-net for the dbs-virtio-devices crate. The vhost-net feature can then be enabled without enabling the virtio-net device, error, etc. Fixes: #8423 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-14 15:35:10 +08:00
Fabiano Fidêncio	dffc6f611c	Merge pull request #8432 from justxuewei/rm-ci-docker-and-nerdctl gha: Remove docker and nerdctl tests from ci.yaml	2023-11-14 08:34:18 +01:00
alex.lyn	4d65c2e8a2	runtime-rs: introduce `update_device` in trait Hypervisor Introduce the `update_device` method in the Hypervisor trait to enable device updates for VMMs. It will initially be used for virtiofs mount operations. Fixes: #7915 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-14 11:56:36 +08:00
Xuewei Niu	481486c6d5	gha: Remove docker and nerdctl tests from CI Two workflows, run-nerdctl-tests-on-garm.yaml and run-docker-tests-on-garm.yaml, were removed in commit `b481d39`. However, they are still referenced by the CI workflow, which breaks the CI. This patch removes those files from ci.yaml. Fixes: #8433 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-14 10:44:14 +08:00
Fabiano Fidêncio	c858ea1460	Merge pull request #8174 from fidencio/topic/re-revert-8115 ci: Re-add tracing tests and move docker/nerdctl to the basic-ci-amd64.yaml file	2023-11-13 18:19:40 +01:00
James O. D. Hunt	a781ce33b0	Merge pull request #8383 from jodh-intel/kata-manager-add-list-option utils: kata-manager: Add option to list versions	2023-11-13 16:18:36 +00:00
David Esparza	98ec34b04c	Merge pull request #8338 from dborquez/improve_metrics_init_environment metrics: Fix function that completely stops kata containers before running a test	2023-11-13 09:35:27 -06:00
Fabiano Fidêncio	b481d396fc	gha: Move docker / nerdctl content to the basic-ci-amd64 file There's no need to keep those as separate files, and having those in the basic-ci-amd64.yaml file actually helps us avoid the undocumented GHA limitation on the number of files imported. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-13 15:34:00 +01:00
Fabiano Fidêncio	3c735c236d	ci: tracing: Adapt to basic-ci-amd64.yaml Peng Tao made this move as part of `1280f85343`, and here we're simply adjusting to the move. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-13 15:27:39 +01:00
Fabiano Fidêncio	ee17fe9d20	Revert "gha: ci: Revert tracing test PR to unbreak CI" This reverts commit `e9bd852113`.	2023-11-13 15:27:39 +01:00
James O. D. Hunt	4d5b23b73a	Merge pull request #8419 from jodh-intel/2023-11-10-fix-tdx runtime-rs: ch: Fix TDX	2023-11-13 11:58:16 +00:00
James O. D. Hunt	7f666f783d	runtime-rs: ch: Fix TDX PR #8311 inadvertently broke the runtime-rs / Cloud Hypervisor TDX handling. It also introduced unrecoverable failure scenarios. Hence, replace slow, fallible regex matching in logging fast path with single pass non-failing multi-string log level matching. Also, added a unit test for `parse_ch_log_level()`. Fixes: #8418. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-13 08:49:47 +00:00
Xuewei Niu	0a9125e629	Merge pull request #7675 from justxuewei/vhost-net	2023-11-12 20:38:18 +08:00
Xuewei Niu	d1deaf0538	dragonball: Minor changes for a comment from Bian - Add feature control for InsertNetworkDevice. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-12 14:14:10 +08:00
Xuewei Niu	e4f83e27c4	dragonball: vhost-net set_offload with acked features set_offload() for tap devices depends on acked features. Signed-off-by: Helin Guo <helinguo@linux.alibaba.com> Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-12 14:10:39 +08:00
Xuewei Niu	6cd572dbbb	dragonball: Minor changes for Chao's comments - Remove two panic statements from InsertNetworkDevice test. - Rename `NUM_QUEUES` to `DEFAULT_NUM_QUEUES`, `QUEUE_SIZE` to `DEFAULT_QUEUE_SIZE` for vhost-net and virtio-net. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-12 14:10:39 +08:00
Xuewei Niu	dcdf3c6556	runtime-rs: Supply missing fields of NetworkConfig `test_networkconfig_to_netconfig` from clh depends on `NetworkConfig` which has some new fields in this PR. Therefore, this commit gives the test missing fields. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-12 14:10:39 +08:00
Xuewei Niu	58e9709c1f	dragonball: Changes for ZizhengBian's comments - Dragonball's vhost-net feature does not depend on virtio-net feature. - Remove `TapError` from dbs-virtio-devices's Error, and add `VirtioNet` and `VhostNet` two fields. - Downgrade visibility of two fields of `VhostNetDeviceMgr` from `pub(crate)`. - File an issue to record a todo for network rate limiter. - Print internal errors with `{0:?}`. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-12 14:10:33 +08:00
Fabiano Fidêncio	849253e55c	tests: Add a simple test to check the VMM vcpu allocation As we've done some changes in the VMM vcpu allocation, let's introduce basic tests to make sure that we're getting the expected behaviour. The test consists in checking 3 scenarios: * default_vcpus = 0 | no limits set * this should allocate 1 vcpu * default_vcpus = 0.75 | limits set to 0.25 * this should allocate 1 vcpu * default_vcpus = 0.75 | limits set to 1.2 * this should allocate 2 vcpus The tests are very basic, but they do ensure we're rounding things up to what the new logic is supposed to do. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-10 18:26:01 +01:00
Fabiano Fidêncio	5e9cf75937	vc: utils: Rename CalculateMilliCPUs() to CalculateCPUsF() With the change done in the last commit, instead of calculating milli cpus, we're actually converting the CPUs to a fraction number, a float. Let's update the function name (and associated vars) to represent that change. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-10 18:26:01 +01:00
Fabiano Fidêncio	e477ed0e86	runtime: Improve vCPU allocation for the VMMs First of all, this is a controversial piece, and I know that. In this commit we're trying to make a less greedy approach regarding the amount of vCPUs we allocate for the VMM, which will be advantageous mainly when using the `static_sandbox_resource_mgmt` feature, which is used by the confidential guests. The current approach we have basically does: * Gets the amount of vCPUs set in the config (an integer) * Gets the amount of vCPUs set as limit (an integer) * Sum those up * Starts / Updates the VMM to use that total amount of vCPUs The fact we're dealing with integers is logical, as we cannot request 500m vCPUs to the VMMs. However, it leads us to, in several cases, be wasting one vCPU. Let's take the example that we know the VMM requires 500m vCPUs to be running, and the workload sets 250m vCPUs as a resource limit. In that case, we'd do: * Gets the amount of vCPUs set in the config: 1 * Gets the amount of vCPUs set as limit: ceil(0.25) * 1 + ceil(0.25) = 1 + 1 = 2 vCPUs * Starts / Updates the VMM to use 2 vCPUs With the logic changed here, what we're doing is considering everything as float till just before we start / update the VMM. So, the flow described above would be: * Gets the amount of vCPUs set in the config: 0.5 * Gets the amount of vCPUs set as limit: 0.25 * ceil(0.5 + 0.25) = 1 vCPU * Starts / Updates the VMM to use 1 vCPU In the way I've written this patch we introduce zero regressions, as the default values set are still the same, and those will only be changed for the TEE use cases (although I can see firecracker, or any other user of `static_sandbox_resource_mgmt=true` taking advantage of this). There's, though, an implicit assumption in this patch that we'd need to make explicit, and that's that the default_vcpus / default_memory is the amount of vcpus / memory required by the VMM, and absolutely nothing else. Also, the amount set there should be reflected in the podOverhead for the specific runtime class. One other possible approach, which I am not that much in favour of taking as I think it's less clear, is that we could actually get the podOverhead amount, subtract it from the default_vcpus (treating the result as a float), then sum up what the user set as limit (as a float), and finally ceil the result. It could work, but IMHO this is less clear, and less explicit on what we're actually doing, and how the default_vcpus / default_memory should be used. Fixes: #6909 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2023-11-10 18:25:57 +01:00
Fabiano Fidêncio	8d958b8c47	Merge pull request #8406 from microsoft/danmihai1/policy-doc docs: add agent policy documentation	2023-11-10 17:19:04 +01:00
James O. D. Hunt	f588d31324	Merge pull request #8374 from jodh-intel/kata-manager-check-dl-url-count utils: kata-manager: Ensure only one download URL	2023-11-10 13:19:07 +00:00
Fabiano Fidêncio	b0157ad73a	runtime: confidential: Do not set the max_vcpu to cpu We don't have to do this since we're relying on the `static_sandbox_resource_mgmt` feature, which gives us the correct amount of memory and CPUs to be allocated. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-10 12:58:20 +01:00
Steve Horsman	b23952c852	Merge pull request #8309 from gkurz/update-release-process-doc Update release process documentation	2023-11-10 09:44:18 +00:00
James O. D. Hunt	0ead018d0a	utils: kata-manager: Add Docker details to list output Add Docker version details to the output of the list versions CLI option. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-10 09:19:56 +00:00
James O. D. Hunt	be3044fd01	utils: kata-manager: Add option to list versions Add a command-line option to list the installed and available versions of Kata and containerd. Fixes: #8355. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-10 09:19:56 +00:00
James O. D. Hunt	9969f5a94a	utils: kata-manager: Make test container name more unique Rather than creating a container called `test-kata`, prefix with the script name to make it a bit "more unique" and less likely for users to have an existing container with the test container name. The new test container name is `kata-manager-sh-test-kata`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-10 09:19:56 +00:00
James O. D. Hunt	436d7d1275	utils: kata-manager: Improve usage message Update the usage to show that the latest Kata version can also be queried using `kata-ctl`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-10 08:29:14 +00:00
James O. D. Hunt	1625a5ce48	utils: kata-manager: Improve version check Update `github_get_latest_release()` to use `sort -V` rather than sub-sorting on the major, minor and patch level version number elements. The new approach is safer and more accurate. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-10 08:29:14 +00:00
James O. D. Hunt	c72a27e219	utils: kata-manager: Ensure only one download URL Add an extra sanity check to ensure that only a single download URL is found for the specified release version. Fixes: #8364. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-10 08:27:23 +00:00
James O. D. Hunt	839f6c3d44	utils: kata-manager: Improve info messages Improve some of the information messages a little by adding more detail and quoting file names. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-10 08:27:20 +00:00
Archana Shinde	21e45bebc8	Merge pull request #8376 from fidencio/topic/kata-manager-add-support-for-docker-installation kata-manager: Add support for Docker CLI installation	2023-11-09 22:11:50 -08:00
Chao Wu	a62fb83c91	Merge pull request #8169 from openanolis/chao/fix_typo_shm runtime-rs: fix a typo in shm	2023-11-10 14:00:11 +08:00
Chao Wu	820b578aa3	Merge pull request #8370 from gaohuatao-1/bugfix agent: update AGENT_THREADS metrics value	2023-11-10 13:16:29 +08:00
gaohuatao	78df1bb851	agent: update AGENT_THREADS metrics value Fixes: #8369 Signed-off-by: gaohuatao <gaohuatao@bytedance.com>	2023-11-10 10:39:57 +08:00
Chao Wu	afb002c25c	runtime-rs: fix a typo in shm is_shim_volume should be is_shm_volume in shm_volume mod. fixes: #8168 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-11-10 10:36:58 +08:00
Fabiano Fidêncio	2b937400fe	Merge pull request #8404 from fidencio/topic/kata-deploy-allow-users-to-enable-hypervisor-annotations kata-deploy: Allow users to set hypervisor annotations	2023-11-09 17:44:52 +01:00
Dan Mihai	bc49c553ef	docs: add agent policy documentation Add initial agent policy documentation. Fixes: #7671 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-11-09 16:43:00 +00:00
Fabiano Fidêncio	5d10aed9ba	kata-manager: Make containerd_config a global var As "/etc/containerd/config.toml" is used from more than one place, let's just make it a global var. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-09 13:47:52 +01:00
Fabiano Fidêncio	66d1b2c173	kata-manager: Add support for docker installation Add support for also installing the Docker CLI, giving users the chance to try Kata Containers with docker in the same way we provide users the chance to try Kata Containers with `ctr`. Fixes: #8357 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-09 13:47:52 +01:00
Fabiano Fidêncio	1a81989d20	tests: k8s: Use the "ALLOWED_HYPERVISOR_ANNOTATIONS" The current kata-deploy code has been doing a `sed` to add allowed hypervisor annotations, so CBL mariner can be tested with their own kernel and initrd. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-09 13:42:31 +01:00
Fabiano Fidêncio	023c4a17cf	kata-deploy: Allow users to set hypervisor annotations Currently the only way one can specify allowed hypervisor annotations is during build time, which is a big issue for users grabbing kata-deploy as we provide. Fixes: #8403 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-09 13:42:31 +01:00
Fabiano Fidêncio	0352f1e029	kata-manager: Allow passing a specific tool to test_installation Right now we're only testing with `ctr` and there's no change in behaviour with this commit. However, allowing to pass a tool to run the tests with gives us an easier time when expanding kata-manager to support, for instance, docker and nerdctl. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-09 11:24:37 +01:00
Fabiano Fidêncio	50df1129ea	Merge pull request #8411 from fidencio/topic/fix-k3s-deployment gha: Fix regex used to get kubectl version from the k3s version	2023-11-09 10:44:34 +01:00
Fabiano Fidêncio	455b7bf776	gha: k3s: Avoid unnecessary escape There's no reason to escape the first + on the +k3s[0-9]\+ regex, as shown here: ```sh ubuntu@k3s:~$ /usr/local/bin/k3s kubectl version --short 2>/dev/null | \ grep "Client Version" | \ sed \ -e 's/Client Version: //' \ -e 's/+k3s[0-9]\+//' v1.27.7 ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-09 08:42:25 +01:00
Fabiano Fidêncio	e7890ee8f6	gha: Fix regex used to get kubectl version from the k3s version It seems that with the new k3s release, they've bumped their kubectl version from x.y.z+k3s1 to x.y.z+k3s2. Let's ensure our regexp is more generic and future proof for such changes. Fixes: #8410 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-09 07:08:02 +01:00
Archana Shinde	1611723465	Merge pull request #8379 from likebreath/1103/clh_v36.0 Upgrade to Cloud Hypervisor v36.0	2023-11-08 21:10:41 -08:00
Archana Shinde	268d4d622f	Merge pull request #8389 from justxuewei/vm-capable-test runtime: Fix TestCheckHostIsVMContainerCapable instability issue	2023-11-08 12:14:04 -08:00
Archana Shinde	92a517156c	Merge pull request #8367 from amshinde/add-nerdctl-ipvlan-test network: Fix network hotplug for ipvlan and macvlan endpoints for qemu and add tests	2023-11-08 11:45:13 -08:00
Chelsea Mafrica	83e731328f	Merge pull request #8023 from cmaf/runtime-rs-ch-pause-resume runtime-rs: Update status for pause and resume	2023-11-08 11:34:47 -08:00
Hyounggyu Choi	84b5618733	tests|gha: add internal nightly tests for s390x This is to add a workflow for internal nightly tests for s390x in Jenkins. Fixes: #7986 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-11-08 16:07:41 +01:00
Xuewei Niu	acd9057c7b	runtime: Fix TestCheckHostIsVMContainerCapable instability issue TestCheckHostIsVMContainerCapable removes sysModuleDir to simulate a case where the kernel modules are not loaded. However, checkKernelModules() executes modprobe <module> if a module is not found in that directory, so loading those modules must be denied temporarily. Fixes: #8390 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 22:40:08 +08:00
Fupan Li	100a73d2fd	Merge pull request #7531 from justxuewei/device-cgroup agent: Restrict device access at upper node of container's cgroup	2023-11-08 22:01:48 +08:00
Chao Wu	4435c1efd7	Merge pull request #8386 from jodh-intel/runtime-rs-ch-tidy-up runtime-rs: ch: Simplify VSOCK error handling	2023-11-08 17:31:40 +08:00
Xuewei Niu	023d8dc01e	agent: Changes according to Pan's comments - Disable device cgroup restriction while pod cgroup is not available. - Remove blacklist-related names and change whitelist-related names to allowed_all. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:08 +08:00
Xuewei Niu	136fb76222	tests: Add an integration test for device cgroup `TestDeviceCgroup` is added to cri-containerd's integration tests. The test launches two containers. Each container has a block device. It checks the validity of the device cgroup. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:07 +08:00
Xuewei Niu	b5f3a8cb39	agent: Fix container launching failure with systemd cgroup The FSManager of the systemd cgroup manager is responsible for setting up the cgroup path. Container launching fails if the FSManager is in read-only mode. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:07 +08:00
Xuewei Niu	6477825195	agent: Minor changes according to Zhou's comments The changes include: - Change to debug logging level for resources after they are processed. - Remove a todo for pod cgroup cleanup. - Add an anyhow context to `get_paths_and_mounts()`. - Remove code which denies access to VMROOTFS since it won't take effect. If blacklist mode is in use, the VMROOTFS will be denied by default. Otherwise, device cgroups won't be updated in whitelist mode. - Add a unit test for `default_allowed_devices()`. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:07 +08:00
Xuewei Niu	cec8044744	agent: Make devcg_info optional for LinuxContainer::new() runk is a standard OCI runtime that isn't aware of the concept of a sandbox. Therefore, the `devcg_info` argument of `LinuxContainer::new()` does not need to be provided. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:07 +08:00
Xuewei Niu	ef4c3844a3	agent: Restrict device access at upper node of container's cgroup The goal is to guarantee that containers cannot escape to access extra devices, such as the VM rootfs. Assume that there is a cgroup such as `/A/B`, where `B` is the container cgroup and `A` is what we call the pod cgroup. No matter what permissions are set for the container (`B`), the permission of `A` is always `a : rwm`. As a result, containers could acquire permission to access other devices in the VM that do not belong to them. To set the devices cgroup properly, the pod cgroup must be set first and the container cgroup after it. The `Sandbox` has a new field, `devcg_info`, to save cgroup states. To avoid setting the container cgroup too early, initialization must be done carefully. `inited`, one of the states, is a boolean that indicates whether the pod cgroup is initialized. If it is not, the pod cgroup is created first and given default permissions. After that, the pause container cgroup is created and inherits the permissions from the pod cgroup. If whitelist mode, which allows containers to access all devices in the VM, is enabled, device resources from the OCI spec are ignored. This feature does not support systemd cgroup or cgroup v2, since: - The systemd cgroup implementation in the agent does not support the devices subsystem so far, see: https://github.com/kata-containers/kata-containers/issues/7506. - Cgroup v2's device controller depends on eBPF programs, which are outside the scope of cgroup. Fixes: #7507 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:07 +08:00
Archana Shinde	c075fa6817	tests: Add test with nerdctl to verify macvlan support Add test to verify kata supports macvlan networks. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-11-07 10:13:51 -08:00
Archana Shinde	07db673eb9	tests: Add test with nerdctl to verify ipvlan support Add test to verify kata supports ipvlan networks. This test can be bit tricky as it requires knowledge about host interfaces to be used as a master for the ipvlan network. However, with github actions, we can assume interface called eth0 to be present on the host and functioning. Fixes: #8366 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-11-07 10:13:51 -08:00
Archana Shinde	a6272733e7	network: Fix network hotplug for ipvlan and macvlan endpoints. Since moving from network coldplug to hotplug, the only case verified was veth endpoints. Support for network hotplug for ipvlan and macvlan was broken/not added. Fix it. Fixes: #8391 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-11-07 10:13:51 -08:00
James O. D. Hunt	59d0d4caff	runtime-rs: ch: Simplify VSOCK error handling Remove the redundant `VmConfigError::EmptyVsockSocketPath` error from the Cloud Hypervisor config crate since this scenario is already handled by the `VsockConfigError::NoVsockSocketPath` error. Fixes: #8385. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-07 17:45:38 +00:00
James O. D. Hunt	bdb83f8282	runtime-rs: ch: Remove unused function Remove the redundant `parse_mac()` function: this was never used and we already have an implementation in `crates/resource/src/network/utils/mod.rs`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-07 17:45:38 +00:00
Wainer Moschetta	949ac4d810	Merge pull request #8217 from beraldoleal/issues/8216 tests: fixes permission denied when running test	2023-11-07 12:25:23 -03:00
Wainer Moschetta	7f5d70f48b	Merge pull request #8061 from beraldoleal/gogo-removal-v3 Updating containerd to a GogoProtobuf free version	2023-11-07 12:18:50 -03:00
Xuewei Niu	8ea87405ed	runtime-rs: Remove virtio config from Backend Virtio-net and vhost-net share a common virtio config, and vhost-user-net uses another config, named `VhostUserConfig`. Thus, the virtio config could be added into `NetworkConfig` instead of `Backend`. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-07 19:35:02 +08:00
Xuewei Niu	ad66378bf5	runtime-rs: Move Dragonball stuff out of device drivers Move Dragonball struct conversions out of device drivers to keep the drivers neutral. The conversions include `NetworkBackend` to `DragonballNetworkBackend` and `NetworkConfig` to `DragonballNetworkConfig`. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-07 19:35:02 +08:00
Xuewei Niu	3e0614cdf0	dragonball: Minor changes to comments Changes include: - Merge `VhostNetDeviceError` import item. - Replace if with match in `add_vhost_net_device()` Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-07 19:35:02 +08:00
Xuewei Niu	a047331a34	runtime-rs: Network config distinguishes backends Network backends determine the virtio dataplane implementations. Common protocols include virtio-net, vhost-net and vhost-user-net, etc. Network config has a new field named `backend` to specify which protocol to use. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-07 19:35:02 +08:00
Xuewei Niu	9203371833	dragonball: Introduce vhost-net device PLEASE NOTE THAT this pull request just implements vhost-net support for Dragonball, and the adaptation for Runtime-rs. This pull request DOESN'T provide an option to configure which backend to use. To sum up, virtio-net as the default backend is the only choice for the user so far. This pull request introduces a vhost-net device for Dragonball. In addition, this pull request includes changes to Runtime-rs to improve network configuration abilities. The Dragonball part implements a vhost-net device and a vhost-net device manager, named `VhostNetDeviceMgr`, to manage the vhost-net device. `NetworkInterfaceConfig` is introduced as a high-level abstraction for network config. Dragonball is then able to distinguish network backends, e.g. virtio-net, vhost-net, vhost-user-net(WIP), etc. The Runtime-rs part adds support for multiple network backends as well. `NetworkConfig` has a couple of new fields, like `backend`, `use_shared_irq`, etc. Dragonball's network config structs implement the `From` trait, which allows them to be converted conveniently from Runtime-rs's network config. Fixes: #7674 Signed-off-by: Eric Ren <renzhen@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-07 19:35:02 +08:00
Greg Kurz	b27b4ce104	doc: No longer release the test repository Now that most of the test repository got migrated to the main Kata repository, it is no longer needed to tag the test repository when doing a release. Update the documentation accordingly by dropping all references to the test repository and only mention the Kata repository. Fixes #8302 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-11-07 10:28:43 +01:00
Greg Kurz	af2d897fb1	doc: Release now uses the official GitHub CLI The hub tool is deprecated. Releases are now based on the official gh CLI. A notable improvement : when properly setup (see [1]), gh allows to directly use HTTPS with one's GitHub credentials, instead of having to setup proper SSH access for pushes to the repo. Adjust the documentation accordingly. Fixes #8302 [1] https://docs.github.com/en/github-cli/github-cli/quickstart#prerequisites Signed-off-by: Greg Kurz <groug@kaod.org>	2023-11-07 10:22:54 +01:00
Greg Kurz	2af9419fa4	doc: No longer run kata-deploy test when releasing This is already tested by CI for every PR. Drop this step from the release process documentation. Fixes #8302 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-11-07 10:19:32 +01:00
Beraldo Leal	dd530ba8ee	tests: fixes AMD errors TestCheckHostIsVMContainerCapable is failing on AMD machines. kata-check_amd64_test.go:96 has no AMD modules, also getCPUType is missing. Fixes #8384. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:59 +00:00
Beraldo Leal	7641c19f74	runtime: bump containerd for gogo deprecation This update includes necessary changes due to the version bump of containerd and its dependencies. It's part of a broader initiative to phase out gogo protobuf, which has been deprecated, and to align with the current supported libraries. Fixes #7420. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:59 +00:00
Beraldo Leal	16fa2c39e6	protocols: replace gogo/types.Empty and Any by Google versions. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:58 +00:00
Beraldo Leal	c61f4a8592	protocols: remove unused fieldpath option The +fieldpath option, specific to gogoprotobuf, enabled dynamic field access in protobuf messages, allowing nested fields to be accessed via string paths. This change is part of a larger effort to transition to the official Go protobuf library for better maintainability and community support. Upon review, no instances of dynamic field access were found in the codebase, confirming that the feature is not in use. By removing this unused feature, we simplify the build process and make it easier to complete the transition away from gogoprotobuf. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:58 +00:00
Beraldo Leal	c87bc60ea0	protocols: removing unused mappings Those mappings are not used by our .proto files and there is no difference between .pb.go files generated. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:58 +00:00
Beraldo Leal	c5d845b30a	agent: updating Cargo.lock files Probably previous changes missed updating Cargo.lock. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:58 +00:00
Beraldo Leal	5d88c78a6e	protocols: generating agent.pb.go `a3b003c345` modified agent but agent.pb.go was not updated. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:58 +00:00
David Esparza	28e7b3467b	metrics: improving stop and remove running containers This PR makes the change to using the SIGKILL signal instead of SIGTERM to force stop each kata component before start running any metric test. Fixes: #8336 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-11-06 09:54:32 -06:00
Archana Shinde	3b2fb6a604	Merge pull request #8284 from amshinde/runtime-rs-update-device-pci-info runtime-rs: update device pci info for vfio and virtio-blk devices	2023-11-06 01:09:20 -08:00
Archana Shinde	036b7787dd	runtime-rs: Use PCI path from hypervisor for vfio devices Remove earlier functionality that tries to assign PCI path to vfio devices from the host assuming pci slots to start from 1. Get this from the hypervisor instead. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-11-05 21:59:44 -08:00
Archana Shinde	c3ce6a1d15	runtime-rs: Provide PCI path to the agent for virtio-block If PCI path for block device is not empty for a block device, use that as identifier for agent instead of virt path which is valid only for mmio devices. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-11-05 21:59:44 -08:00
Archana Shinde	a2bbbad711	runtime-rs: change hypervisor add_device trait to return device copy Block(virtio-blk) and vfio devices are currently not handled correctly by the agent as the agent is not provided with correct PCI paths for these devices. The PCI paths for these devices can be inferred from the PCI information provided by the hypervisor when the device is added. Hence changing the add_device trait function to return a device copy with PCI info potentially provided by the hypervisor. This can then be provided to the agent to correctly detect devices within the VM. This commit includes implementation for PCI info update for cloud-hypervisor for virtio-blk devices with stubs provided for other hypervisors. Removing Vsock from the DeviceType enum as Vsock currently does not implement the Device Trait, it has no attach and detach trait functions among others. Part of the reason is because these functions require Vsock to implement Clone trait as these functions need cloned copies to be passed down the hypervisor. The change introduced for returning a device copy from the add_device hypervisor trait explicitly requires a device to implement Copy trait. Hence removing Vsock from the DeviceType enum for now, as its implementation is incomplete and not currently used. Note, one of the blockers for adding the Clone trait to Vsock is that it currently includes a file handle which cannot be cloned. For Clone and Device Traits to be implemented for Vsock, it requires an implementation change in the future for it to be cloneable. Fixes: #8283 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-11-05 21:59:44 -08:00
Bo Chen	071667f1ca	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v35.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Fixes: #8378 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-11-03 10:47:06 -07:00
Bo Chen	d1163141b9	versions: Upgrade to Cloud Hypervisor v36.0 Details of this release can be found in our roadmap project as iteration v36.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #8378 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-11-03 10:46:56 -07:00
Fabiano Fidêncio	0aac3c76ee	Merge pull request #8365 from fidencio/topic/kata-manager-restrict-containerd-versions-to-be-used kata-manager: Accept only "lts" or "active" as containerd versions	2023-11-03 11:54:05 +01:00
Fabiano Fidêncio	8b4fc847d7	kata-manager: Accept only "lts" or "active" as containerd versions kata-manager is a very nice tool, but we shouldn't be trying to take care of "everything" in "all possible scenarios", and we should focus on installing Kata Containers dependencies that are supported. With this in mind, let's limit a little bit the scope of which versions of containerd can be installed, limiting it to "active" and "lts", which will then install the latest version of those "flavours". The default value will always be "lts" as that's supposed to be the stable one. NOTE: This is a breaking change, as it changes the behaviour of what the script takes in its `-c` parameter. I'm assuming here we're safe to do so as the majority of the users should / would only be using the full installation by default. Fixes: #8356 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-03 10:30:37 +01:00
Fabiano Fidêncio	d395ae8198	Merge pull request #8368 from fidencio/topic/gha-stale-fixes gha: stale: Fix typo and allow manually triggering it	2023-11-03 10:07:56 +01:00
Fabiano Fidêncio	994615ca28	gha: stale: Allow manually triggering it This will help us to avoid waiting till the next time cron would trigger the action to test Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-03 08:17:48 +01:00
Fabiano Fidêncio	6abcf03611	gha: stale: Fix typo action -> actions This is causing the following error: ``` Unable to resolve action action/stale, repository not found ``` Fixes: #8347 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-03 08:15:18 +01:00
Steve Horsman	a7a14e33d8	Merge pull request #8285 from sazzy4o/patch-1 Docs: Fix Dragonball link	2023-11-02 17:54:47 +00:00
Fabiano Fidêncio	37233622da	kata-manager: Ensure we run apt-get update before apt-get install As that's an operation that can easily fail, and it's quite simple / cheap for us to run it, let's just do it and avoid the failure. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-02 14:14:32 +01:00
Fabiano Fidêncio	d547798284	Merge pull request #7057 from brianwang12/kata-manager-fix kata-manager: Fix deployment of containerd on architectures other than amd64.	2023-11-02 14:14:18 +01:00
Fabiano Fidêncio	8905286767	Merge pull request #8348 from fidencio/topic/gha-add-stale-action-for-PRs gha: Add workflow to close stale PRs	2023-11-02 11:34:35 +01:00
Fabiano Fidêncio	abec287058	gha: Add workflow to close stale PRs Our goal, as discussed in the Architecture Committee meeting held on October 31st, 2023, is to take a more aggressive action on issues and PRs that have been opened for a long time. This commit is the very first step, and it's only targeting PRs. What this action will do is: * Mark all the PRs that have no activity for more than 180 days, starting from May 1st, 2023, as stale. * A message will be added, letting the contributor know that they can simply comment on the PR in order to make it "not stale". * If there's no activity on the PR for 7 days, the PR will be automatically closed. Fixes: #8347 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-02 09:19:44 +01:00
briwan.wang	437db15916	kata-manager: Fix Multi-Arch deployment for containerd Fix: Kata-Manager fails to retrieve the correct containerd string name for architectures other than amd64. Update the 'github_get_release_file_url()' function to make it compatible with different architecture expressions, e.g. aarch64/arm64 or x86_64/amd64, allowing it to acquire the correct URL addresses. Fixes: #7071 Signed-off-by: briwan.wang <briwan.wang@arm.com>	2023-11-02 06:12:04 +00:00
Archana Shinde	004646162e	Merge pull request #8308 from gkurz/fully-drop-hub release: Fully migrate from hub to gh	2023-11-01 22:46:44 -07:00
Peng Tao	b3dbd4f1c7	Merge pull request #8351 from amshinde/update-agent-cargo-lock cargo: Agent cargo.lock updated	2023-11-02 11:31:24 +08:00
Archana Shinde	58b4d1a264	cargo: Agent cargo.lock updated The Cargo.lock for agent needs to be updated to include "safe-path" dependency. Fixes: #8350 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-11-01 11:54:33 -07:00
Fabiano Fidêncio	40cc397218	Merge pull request #8255 from cmaf/migrate-checks-fixes-links docs: Fix broken links	2023-11-01 14:46:30 +01:00
Beraldo Leal	afec54799e	libs: fixes dereferenced reference make check is giving us the following error: error: this expression creates a reference which is immediately dereferenced by the compiler. Fixes #8344 Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-10-31 15:55:32 -04:00
Beraldo Leal	c57df607ad	libs: fixes comparison to empty slice Make check gives us an "error: comparison to empty slice". Fixes #8343 Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-10-31 15:51:03 -04:00
Greg Kurz	d20b7381f0	release: Drop obsolete comment in workflow file This comment belongs to the hub tool that got sunset by `710eb8ab9d`. Just drop it. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-10-31 16:03:12 +01:00
Greg Kurz	6236fa4617	release: Drop build_hub helper Not used anymore. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-10-31 15:28:57 +01:00
Greg Kurz	bc4c66caaf	release: Migrate tag_repos.sh to GitHub CLI The hub tool is deprecated. Convert this script to use the official GitHub CLI gh instead of hub. A typical gh setup is able to access repos using HTTPS along with GitHub credentials. It is only needed to patch the remote url when using SSH. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-10-31 15:11:28 +01:00
Greg Kurz	e331102ba3	release: Migrate update-repository-version.sh to GitHub CLI The hub tool is deprecated. Convert this script to use the official GitHub CLI gh instead of hub. A couple of adjustments had to be made : - the notes.md temporary file is moved to ${tmp_dir} in order to silence gh, otherwise it complains about an untracked file, - title of a PR no longer goes to the notes.md file since gh requires the title to be passed with a dedicated --title option. Fixes #8303 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-10-31 15:10:50 +01:00
Greg Kurz	b83a7149ee	release: Introduce helper to get GitHub CLI If gh isn't installed already, download it from GitHub. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-10-31 15:09:24 +01:00
Fabiano Fidêncio	53cda12a71	Merge pull request #8311 from TimePrinciple/log-system-enhancement runtime-rs: Log system enhancement	2023-10-31 10:14:41 +01:00
Greg Kurz	ceeabe3714	release: Allow to test release scripts with an alternate repo We don't want to mess with the official repo when testing a change in the release scripts. Adapt `update-repository-version.sh` to be able to use an alternate repo just like `tag_repos.sh` already does. This means that the following command : $ OWNER="$SOME_ORG" ./update-repository-version.sh -p "$NEW_VERSION" "$BRANCH" will only create a PR in this repo : http://github.com/$SOME_ORG/kata-containers.git Signed-off-by: Greg Kurz <groug@kaod.org>	2023-10-31 09:49:27 +01:00
Archana Shinde	148c565b2f	Merge pull request #8289 from BbolroC/skip-create-tmpfs-s390x agent: Skip flaky create_tmpfs on s390x	2023-10-30 22:26:28 -07:00
Ruoqing He	4ad2cfe0c2	runtime-rs: Log system enhancement By modifying RuntimeLevelFilter drain to improve logging control, enabling isolation of change effect of the loggers between components, tuning clh logs to be logged according to their log levels given by cloud-hypervisor. Fixes: #8310 Signed-off-by: Ruoqing He <linuxwatcher@outlook.com>	2023-10-31 04:57:46 +00:00
David Esparza	2a17d3889e	Merge pull request #8334 from amshinde/ipvlan-nerdctl-fix network: Fix network attach for ipvlan and macvlan	2023-10-30 16:00:32 -06:00
David Esparza	5573705800	Merge pull request #8202 from dborquez/enable_fio_checkmetrics Enable fio checkmetrics	2023-10-30 15:55:37 -06:00
David Esparza	c232869af9	metrics: removes double-quotes in checkmetrics when parsing results This PR removes double quotes in jq output to return raw strings as input of checkmetrics tool. Fixes: #8331 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-30 09:43:03 -06:00
David Esparza	c42a2f2eda	metrics: increase the number of attempts to stop kata This PR increases the number of attempts to stop kata components when it is required usually before starting a metrics test. Fixes: #8307 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-30 09:43:03 -06:00
David Esparza	1626253d9e	metrics: FIO ci test enablement This PR enables the new FIO test based on the containerd client which is used to track the I/O metrics in the kata-ci environment. Additionally this PR fixes the parsing of results. Fixes: #8199 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-30 09:42:54 -06:00
David Esparza	873386a349	metrics: update iodepth and job size fio parameters to improve workload This PR updates the values of the fio parameters for iodepth requests and for the number of jobs, in order to increase the number of sequential operations. Additionally, it adds the list of packages needed to parse the results. Fixes: #8198 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-30 08:43:06 -06:00
James O. D. Hunt	d93275224b	Merge pull request #8323 from jodh-intel/utils-kata-manager-fix-version-checks utils: kata manager: Fix version checks	2023-10-30 12:25:51 +00:00
Chao Wu	7d26604061	Merge pull request #7831 from lisongqian/feat/dragonball_trace dragonball: add tracing feature for dragonball	2023-10-30 17:27:30 +08:00
James O. D. Hunt	d7e410ad2b	Merge pull request #8314 from jodh-intel/kata-ctl-show-confidential-guest kata-runtime/kata-ctl: Add security details to output	2023-10-30 07:41:22 +00:00
Songqian Li	2f533c3003	dragonball: add tracing feature for dragonball This PR adds the tracing capability for dragonball and it depends on the tracing::Subscriber of the upper layer. Fixes: #7249 Signed-off-by: Songqian Li <mail@lisongqian.cn>	2023-10-28 19:52:24 +08:00
Chao Wu	f1f4410537	Merge pull request #7695 from lisongqian/feat/legacy_metrics dragonball: add metrics support for legacy device	2023-10-28 16:48:57 +08:00
Archana Shinde	f53f86884f	network: Fix network attach for ipvlan and macvlan We used the approach of cold-plugging network interface for pre-shimv2 support for docker. Since the hotplug approach was not required, we never really got to implementing hotplug support for certain network endpoints, ipvlan and macvlan being among them. Since moving to shimv2 interface as the default for runtime, we switched to hotplugging the network interface for supporting docker and nerdctl. This was done for veth endpoints only. Implement the hot-attach apis for ipvlan and macvlan as well to support ipvlan and macvlan networks with docker and nerdctl. Fixes: #8333 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-10-27 21:42:37 -07:00
Peng Tao	52a014d9cd	Merge pull request #8033 from h56983577/6715/shared-mount agent: use open_tree()/move_mount() to set up bind mounts between containers directly.	2023-10-28 10:57:34 +08:00
Songqian Li	da77b19449	dragonball: output legacy device metrics to runtime Legacy device manager adds device metrics to METRICS when a device is created and removes metrics when a device is dropped. Fixes: #7248 Signed-off-by: Songqian Li <mail@lisongqian.cn>	2023-10-27 14:09:42 +08:00
Songqian Li	65213e9fbe	dragonball: unify the metric interface of legacy device Fixes: #7248 Signed-off-by: Songqian Li <mail@lisongqian.cn>	2023-10-27 14:09:42 +08:00
Chao Wu	b508091305	Merge pull request #8322 from wainersm/git_helper-fix tests/git-helper: cancel any previous rebase left halfway	2023-10-27 14:07:16 +08:00
Spencer von der Ohe	fee97e219c	docs: Fix Dragonball link Update dragonball link to be the current repo (from archived repo) Fixes #8324 Signed-off-by: Spencer von der Ohe <s.vonderohe40@gmail.com>	2023-10-26 21:12:31 -06:00
Archana Shinde	f5c17f89a3	Merge pull request #8250 from amshinde/runtime-rs-clh-config runtime-rs: Add default configuration file for cloud-hypervisor	2023-10-26 14:54:47 -07:00
Chelsea Mafrica	0608e20a01	docs: Fix broken links Update broken links so that static checks pass. Fixes #8254 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-10-26 10:17:01 -07:00
Chelsea Mafrica	4ede63fa4d	Merge pull request #8317 from cmaf/gha-spellcheck-reqs gha: add dependencies for spell checker	2023-10-26 10:11:26 -07:00
James O. D. Hunt	ae3ea1421d	utils: kata-manager: Fix containerd version check Containerd release files include the version number without a "v" prefix. However, the tag for the equivalent release does include it, so handle this distinction and also tighten up the Kata check by specifying an explicit version number in the regex. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-26 16:34:56 +01:00
James O. D. Hunt	346f195532	utils: kata-manager: Fix whitespace Use tabs consistently. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-26 16:06:51 +01:00
Wainer dos Santos Moschetta	0ce0abffa6	tests/git-helper: cancel any previous rebase left halfway On bare-metal machines the git tree might be left in an unstable state with a previous rebase left halfway. So let's attempt to abort any rebase first. Fixes #8318 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-26 11:50:12 -03:00
James O. D. Hunt	2ac7ac1dd2	utils: kata-manager: Fix "Cannot determine download URL" issue The archive names for x86_64 [Kata releases](https://github.com/kata-containers/kata-containers/releases) used to include the tag `x86_64`, but that has now been changed to `amd64`, which unfortunately broke `kata-manager.sh`: ``` kata-static-3.1.3-x86_64.tar.xz ~~~~~~ expected kata-static-3.2.0-alpha3-x86_64.tar.xz ~~~~~~ expected kata-static-3.2.0-alpha4-amd64.tar.xz ~~~~~ changed ``` Fixes: #8321. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-26 15:27:37 +01:00
James O. D. Hunt	59bd534827	utils: kata-manager: Lint fixes Improve the code by fixing some lint issues: - defining variables before using them. - Using `grep -E` rather than `egrep`. - Quoting variables. - Adding a check for invalid CLI arguments. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-26 15:24:46 +01:00
HanZiyao	a3b003c345	agent: support bind mounts between containers This feature supports creating bind mounts directly between containers through annotations. Fixes: #6715 Signed-off-by: HanZiyao <h56983577@126.com>	2023-10-26 16:34:50 +08:00
Archana Shinde	1b8ec08278	Merge pull request #8281 from amshinde/add-clh-config-kata-manager kata-manager: Add clh config to containerd config file	2023-10-25 13:44:53 -07:00
Chelsea Mafrica	c20aadd7a8	gha: add dependencies for spell checker In the migration from the tests repo to the kata containers repo we missed two hunspell dictionaries for static checks; add them. Fixes #8315 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-10-25 12:49:09 -07:00
James O. D. Hunt	d707fa2c0d	kata-runtime/kata-ctl: Add security details to output Add the hypervisor security details to the output of the `kata-runtime env` and `kata-ctl env` commands so the user can see, amongst other things, the value of `confidential_guest`. Fixes: #8313. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-25 16:34:42 +01:00
Chao Wu	29d863350f	Merge pull request #7697 from lisongqian/feat/balloon_metrics dragonball: add metrics support for balloon device	2023-10-25 02:42:14 -05:00
Fabiano Fidêncio	328ba0da99	Merge pull request #7647 from jongwu/use_pcie_virt AArch64: runtime: use pcie root port to do pci/pcie device hotplug	2023-10-25 09:17:13 +02:00
Archana Shinde	f99de4d5a1	runtime-rs: Make default kernel params as empty The default kernel params passed to any hypervisor except dragonball are empty. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-10-24 15:50:12 -07:00
Archana Shinde	a813012785	runtime-rs: Add default configuration file for cloud-hypervisor The config template file for clh is in the new format for runtime-rs. It is a result of merging the new format file and the options supported by cloud-hypervisor. Some config options from the golang runtime are missing as they may not be currently supported by the rust runtime. Examples of this are the selinux options and rate limiting options, as these are not currently supported or verified with the rust runtime. Fixes: #8249 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-10-24 15:17:24 -07:00
Chao Wu	43675bd485	Merge pull request #8294 from ZizhengBian/jason/for-master runtime-rs: fix a typo in device manager	2023-10-24 04:52:04 -05:00
Songqian Li	dce365d5b4	dragonball: add conditional compilation for BalloonDeviceMetrics Fixes: #7248 Signed-off-by: Songqian Li <mail@lisongqian.cn>	2023-10-24 13:33:39 +08:00
GabyCT	4c3a664358	Merge pull request #8278 from GabyCT/topic/udpparallel metrics: Add parallel udp iperf3 benchmark	2023-10-23 10:30:53 -06:00
Fabiano Fidêncio	a001021721	Merge pull request #8292 from fidencio/topic/release-ensure-gh-is-used-from-a-git-repo release: Always use actions/checkout to ensure we're in a git repo	2023-10-23 15:16:12 +02:00
Songqian Li	3819f0ee6f	dragonball: output balloon device metrics to runtime Balloon device manager adds balloon device metrics to METRICS when a device is created and removes metrics when a device is dropped. Fixes: #7248 Signed-off-by: Songqian Li <mail@lisongqian.cn>	2023-10-23 21:15:22 +08:00
Zizheng Bian	7d7c25c1d6	runtime-rs: fix a typo in device manager Fixes: #8293 Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>	2023-10-23 20:33:47 +08:00
Fabiano Fidêncio	c5cfad7023	actions: Move all the checkout actions to v4 It's been released for a while now, and we need to keep consistency in what we use. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-23 14:01:53 +02:00
Fabiano Fidêncio	b32c6bf805	release: Always use actions/checkout to ensure we're in a git repo Otherwise we'll face issues like: ``` Run tag=$(echo $GITHUB_REF \| cut -d/ -f3-) tag=$(echo $GITHUB_REF \| cut -d/ -f3-) tarball="kata-static-$tag-amd64.tar.xz" mv kata-static.tar.xz "$GITHUB_WORKSPACE/${tarball}" pushd $GITHUB_WORKSPACE echo "uploading asset '${tarball}' for tag: ${tag}" GITHUB_TOKEN=*** gh release upload "${tag}" "${tarball}" popd shell: /usr/bin/bash -e {0} ~/work/kata-containers/kata-containers ~/work/kata-containers/kata-containers uploading asset 'kata-static-3.3.0-alpha0-amd64.tar.xz' for tag: 3.3.0-alpha0 failed to run git: fatal: not a git repository (or any of the parent directories): .git ``` Fixes: #8286 (or better, just a follow up of that) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-23 14:00:39 +02:00
Fabiano Fidêncio	8fe88696c0	Merge pull request #8287 from fidencio/topic/release-use-gh-cli-instead-of-hub actions: release: Use GH cli instead of hub	2023-10-23 12:40:22 +02:00
Hyounggyu Choi	a0746c8d7b	agent: Skip flaky create_tmpfs on s390x This is to skip a flaky test `create_tmpfs()` on s390x until a root cause is identified and fixed. Fixes: #4248 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-10-23 11:22:14 +02:00
Fabiano Fidêncio	710eb8ab9d	actions: release: Use GH cli instead of hub hub is now deprecated, which has been causing issues with our release process. Let's move to the GH cli (https://cli.github.com/manual), and unblock this release. NOTE: This commit is purposefully not touching anywhere else hub is used, as that would require more time and investigation to do the switch, and right now we just want to unblock the release. Fixes: #8286 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-23 08:49:55 +02:00
Fabiano Fidêncio	74d4865189	Merge pull request #8275 from fidencio/topic/ci-adapt-kata-deploy-regex-on-repo-version-update release: Adapt the CIs using the kata-deploy image	2023-10-23 00:37:19 +02:00
Archana Shinde	d3250dff34	kata-manager: Add clh config to containerd config file kata-manager currently adds only the default config, which is currently qemu. Add config for clh as well to the containerd configuration. This should allow new users to get started with clh using kata-manager. Also add config related to enabling privileged_without_host_devices. It is always good to have this config enabled when users try to run privileged containers so that devices from the host are not inadvertently passed to the guest. Fixes: #8280 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-10-20 18:16:16 -07:00
Gabriela Cervantes	2d0518cbe6	metrics: Add parallel udp iperf3 benchmark This PR adds the parallel udp iperf3 benchmark for network metrics. Fixes #8277 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-20 19:54:06 +00:00
Dan Mihai	732fe163f3	Merge pull request #8229 from microsoft/danmihai1/no-config-toml-endpoints agent: no endpoint blocking from agent-config.toml	2023-10-20 11:30:43 -07:00
Fabiano Fidêncio	026f6a1a4c	release: Adapt the CIs using the kata-deploy image This is needed in order to properly run the CIs in branches that are not the main one, as the kata-deploy.yaml file on those branches does not have the `latest` tag, but rather the latest stable release. Fixes: #8274 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-20 18:59:14 +02:00
Fabiano Fidêncio	124f498830	Merge pull request #8266 from fidencio/3.3.0-alpha0-branch-bump # Kata Containers 3.3.0-alpha0	2023-10-20 17:40:44 +02:00
GabyCT	8486283012	Merge pull request #8247 from GabyCT/topic/iperfudp metrics: Add iperf udp benchmark	2023-10-20 09:21:37 -06:00
Fabiano Fidêncio	0fb69ddf6a	release: Kata Containers 3.3.0-alpha0 - kata-deploy-stable: Switch to using the ubuntu based payload - libs: protection: Fix typo in TDX output - ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat - tests: Enable agent stability test - docs: Fix paths to build kernel in SNP VMs documentation - runtime-rs: ch: Add TDX CH features check - runtime: Validate hypervisor section name in config file - tests: query data from the OPA service - release: tag_repos: Stop tagging the `tests` repo - metrics: fixes common.sh function to always return true - Memory footprint test removing trailing commas to make json results file valid - policy: allow access to ReseedRandomDev - runtime/kata-ctl: update dependencies - runtime-rs : fix Nydus support for runtime-rs + Dragonball - metrics: removal of reference in the documentation to the fio dax subtest. - runtime-rs: ch: Detect Intel TDX version - runitme-rs: use the same base64 as kata-runtime/direct-volume does - tests: Enable scability test for stability CI - runtime-rs: Add support for adding vfio device for cloud-hypervisor - tests: Enable soak parallel stability test - dragonball: vcpu metrics change to be recorded per vcpu - ci: k8s: adapt gha-run.sh to run locally - metrics: removes kata components and k8s deployment when test finishes - GHA: fix up referenced yaml exceeding 20 limit problem - gha: ci: Revert tracing test PR to unbreak CI - runtime-rs: ch: Enable feature - gha: ci: Port runk tests over - ci: gha: Port tracing tests over - Enable fio test using containerd client - gha: Add stability tests workflow for gha - gha: arm64: Ensure the builder is arm64-builder - kata-deploy: Build kata-agent as we build all the other components - versions: migrate out of k8s.gcr.io - doc: Update crictl pod-config - gha: Fix k0s deployment - tests: Add stability test for kata CI - docs: Update url in kata vra document - gpu: Adding CDI support for cold and hot-plug of VFIO devices - 
kata-deploy: build & ship the rust components from src/tools/ - metrics: Add latency value limits for kata CI - runtime: fix reading cgroup stats of sandboxes - Upgrade to Cloud Hypervisor v35.0 - ci: Port kata-monitor tests from Jenkins to GHA - metrics: Fix latency yamls path - metrics: Fix metrics README - metrics: Fix C-Ray documentation - runtime-rs: ch: Enable Intel TDX - ci: k8s: crio: Follow up patches to have CRI-O also working as part of our CI - metrics: Enable latency test in gha run script - local-build: Fix .docker ownership before build-payload - runtime-rs: Add network support for cloud-hypervisor - osbuild: Reduce guest components binary size with strip - gha: Add pandoc as a dependency for static checks - ci: rootfs-image build-asset is failing - feat(runtime-rs): introduce huge page mode to select VM RAM's backend - clh: Direct IO support for block devices - gha: Install hunspell for static checks - ci: Trigger payload-after-push on workflow_dispatch - ci: Actually enable the CRI-O tests - protocol: remove gogoprotobuff tests - ci: k8s: Also run tests with CRI-O - runtime: support kernel params including spaces - ci: kata-deploy: Fix runner name - metrics: Enable parallel bandwidth iperf limit - ci: kata-deploy: Enable all k8s flavours that we support - ci: Create clusters in individual resource groups - versions: Bump virtiofsd to v1.8.0 - clh: arm: Use static_sandbox_resource_mgmt=true - Bump nydus versions and update nydus tests - runtime/qemu: Rework QMP/HMP support - clh:arm64: use arm AMBA UART for hypervisor debug - ci: Use variable size of VMs depending on the tests running - ci: Rework static checks - runtime: incorrect handling of non-empty []Endpoint parameter in Remo… - ci: cache: Check the sha256sum of the components & fix ovmf-sev cache usage - ci: cache: Use the artefacts stored in ghcr.io/kata-containers/cached-artefacts/${component} - ci: Run some of the GARM tests in smaller instances - ci: Reduce the size of the AKS VMs - ci: 
cache: Allow pushing our artefacts to an OCI registry - metrics: Add iperf value for cpu utilization - ci: cache: Export env vars needed to use ORAS - gha: vfio: Import test script - tests: fix kernel and initrd annotations - metrics: Add iperf bandwidth value for kata metrics - metrics: Add Cassandra Metrics documentation - metrics: Remove warning from metrics documentation - ci: docker: nerdctl: Switch to tcp port 80 ping - runtime: Naming conflict of network devices - Remove gogoproto.nullable extension - metrics: Ensure docker is running in init_env - metrics: this PR skips the FIO test temprarily to fix issues - ci: Add a very basic nerdctl sanity test - runtime-rs: hypervisor: Remove debug kernel options - versions: Bump rust version - ci: Add a very basic docker sanity test - dragonball: fix for non-deterministic builds - runtime-rs: bring hybrid vsock devices in manager. - ci: use github.ref_name instead of $GITHUB_REF_NAME - ci: Add more target-branch related fixes - ci: Fix target-branch usage - agent: optimize the code of systemd cgroup manager - gha: Manually rebase PR atop of the target branch before testing - Update kernel to the latest LTS release (v6.1.52) and bring in erofs patches needed for the CC work - kata-deploy: Fix aarch64 image build - runtime: Fix more virtiofs args - kata-deploy: Switch to an alpine image - metrics: Use TensorFlow optimized image - metrics: fix FIO test initialization - ci: k8s: Add clean-up-garm argument for gha-run.sh - ci: k8s: Second round of fix-ups with the devmapper CI - metrics: re-enable memory-usage initialization step - Dragonball: optimize the placement of dbs-upcall features - ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml - ci: k8s: Add k8s devmapper tests (part 0) - kata-deploy: Create kata-static.tar with correct ownership - runtime: run prestart hooks before starting VM for FC - metrics: Add write 95 percentile FIO value - runtime: Allow virtio_fs_extra_args annotation - packaging: do not install 
docker-compose-plugin for s390x\|ppc64le - runtime-rs: Fix volumes and rootfs cleanup issues - metrics: Enable iperf benchmark on gha for kata metrics - CI: switch static-checks-dragonball CI machines to Azure - metrics: Add README for kata metrics report - osbuilder: Remove chcon operation for guest SELinux - kata-sys-util: protection: Update TDX checks - Improve the way to clean up storage devices for sandbox - agent: avoid possible leakage of storage device - tests: add policy to existing tests - gha: Rebase PR atop of the target branch before testing - versions: Update alpine to its 3.18 version - runtime: Fix data race in ioCopy - metrics: Add grabdata script for metrics report - Fixes tests on AMD machines - metrics: Enable FIO limits for kata metrics - metrics: Add metrics report script - metrics: Fix memory inside limits for kata metrics - metrics: fix parsing issue on memory-usage test - dragonball: vsock add fifo/pipe stream support for passed fd hybridSt… - tests: Add confidential test - tdx: Update the components needed for using the 6.2 kernel stack - tests: delete k8s deployment at the test's end - tests: use unique test name - runtime-rs: check peer close in log_forwarder - gha: Avoid "fail-fast" in tests that are known to be flaky - Refine storage device management for kata-agent - metrics: Remove unused variable in tensorflow nhwc script - kata-deploy: Don't try to remove /opt/kata - metrics: Add TensorFlow ResNet50 FP32 benchmark - gha: vfio: Run on Ubuntu 23.04 runner - kata-agent: use default filemode for block device when it is set to 0 - kata-types: introduce KataVirtualVolume to support nydus, direct volume and image pull - libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml - local-build: Remove GID before creating group - kata-deploy: Avoid failing on content removal - runtime: fix image and initrd assets handling - metrics: Add disk link to README - metrics: Fix FIO path - gha: capture additional kata-deploy output - 
metrics: Use function from metrics common in pytorch script - metrics: Enable kata runtime in K8s for FIO test. - metrics: Fix README for pytorch - metrics: Remove unused variable in tensorflow mobilenet script - rootfs: agent: Policy support with AGENT_INIT=yes - gha: k8s: kata-deploy: Move kata-deploy specific tests from integration/kubernetes to functional/kata-deploy - metrics: Fix check results for tensorflow benchmark - metrics: Add Tensorflow ResNet50 int8 benchmark - kata-deploy: Properly create default runtime class - agent: simplify error handling - metrics: Fix MobileNet help me description - gha: ci: Start running kata-deploy tests - runk: Modify kill command's error message for containerd tests - runtime-rs: add driver option - gha: cri-containerd: Enable tests - metrics: Rename tensorflow scripts - gha: tests: Add kata-deploy functional tests -- Part 1 - agent: runtime: add Agent Policy feature - runk: Support without pid ns - metrics: Add Cassandra Kubernetes benchmark for kata metrics - metrics: Add common functions to the common script - metrics: fix the loop used to stop kata components - docs: Remove installation step in virtcontainers doc - Propogate secrets, config maps etc into guest if sharedFS not available - kata-deploy: Preliminary k0s support - gha: static-checks: Move to the Azure instances - versions: Update firecracker version to 1.4.0 - agent: Allow clippy::redundant_clone in the unit tests - agent: avoid creating new `Vec` instances when easily avoidable - metrics: compute tensorflow statistics - metrics: Add network nginx benchmark - metrics: install kata once and run multiple checks - ci: unencrypted-image: Fix build context - ci: create-confidential-image: Add dependent actions - Follow up fixes for https://github.com/kata-containers/kata-containers/pull/7596 - tests: Create image that will be used in the unencrypted confidential tests - kata-deploy: Ensure we cover SHIMS / DEFAULT_SHIM as part of our tests - tests: upgrade bats 
version - Fix mimor bugs and improve coding stype of agent rpc/sandbox/mount - deps: Bump dependent crate versions - fix number of queues handling in dragonball share fs device - runtime-rs: Introduce directly attachable network - metrics: General improvements to mobilenet tensorflow test - gha: Add iperf network metrics - docs: Use control-plane term instead of master - agent: avoid unnecessary calls to `Arc::clone` - metrics: Add network latency test - Image pulling on the host - Use version 0.10.4 of `fuse-backend-rs` - kata-deploy: Use host's systemctl - release: Revert kata-deploy changes after 3.2.0-rc0 release - metrics: stop kata components before start a metric test. - runtime-rs: Add block device handling for cloud hypervisor `a93fdb014` kata-deploy-stable: Adapt to what we're using in the stable branch `36109da93` ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat `d01daf749` tests: Adjust timeout for agent stability test `9b14dda14` libs: protection: Fix typo in TDX output `0e0867f15` runtime-rs: ch: Add TDX CH features check `409eadddb` runtime-rs: ch: Improve readability of guest protection checks `82a0814fc` tests: Enable agent stability test `32be8e3a8` tests: query data from the OPA service `b81c0a669` tests: encode policy file during test `4f9681b41` metrics: fixes common.sh function to always return true `2ef2b2a6d` docs: Fix paths to build kernel in SNP VMs documentation `408b59c02` runtime-rs: fix bugs to support Nydus v5 `157caea9f` Revert "nydus: Temporarily skip tests on dragonball" `678fe3cd3` Dragonball: fix Nydus config serde problem `b6ec62138` policy: allow access to ReseedRandomDev `908519db9` metrics: skips docker restart when it is not installed or is masked. `c2763120a` metrics: removing trailing comma characters from json file. 
`3e8cf6959` runtime: Validate hypervisor section name in config file `ef6388e81` tests: Remove unused function from scability test `fbc8f8f46` scripts: Use install_yq from the `kata-containers` repo `65b1a2d27` release: tag_repos: Stop tagging / updating the `tests` repo `87b760f56` runtime-rs: ch: Detect Intel TDX version `73e81f5e3` runitme-rs: unify base64 encoding for direct-volume `c6463cb5a` tests: Fix path for versions yaml for soak parallel test `89c9454fc` metrics: removal of reference in the documentation to the dax test. `30ff58904` tests: Enable scability test for stability CI `8d6f7b909` runtime-rs: Add support for handling vfio device for cloud-hypervisor `e786b2b01` gha: Add install dependencies for stability tests `dbfe6512f` dragonball: vcpu metrics change to be recorded per vcpu `fa60fbe02` dragonball: METRICS is refactored to RwLock<DragonballMetrics> `500d1c5ce` kata-ctl: update rustls-webpki/webpki dependency `d7660d82a` runtime: unify gopkg.in/yaml.v3 to v3.0.1 `fc9a107e8` runtime: unify swag and testify dependency `79ebb959c` runtime: update runc dependency to v1.1.9 `7f3e8bd65` runtime: unify golang.org/x/text to v0.7.0 `df325ae37` runtime: update golang.org/x/net to v0.7.0 `bba34910d` metrics: stops kata components and k8s deployment when test finishes `84e3d884e` gha: Add general dependencies to stability tests `dec3951ca` tests: Add soak parallel stability test `0f04d527d` tests: Enable soak parallel test `e669282c2` ci: k8s: set KUBERNETES default value `c30c3ff18` tests: run k8s-volume on a given node `666993da8` tests: run k8s-file-volume on a given node `3a00fc910` tests: exec_host() now gets the node name `61c9c17bf` tests: add get_one_kata_node() to tests_common.sh `68f083c4d` ci: k8s: set KATA_HYPERVISOR default value `6677a61fe` ci: k8s: configurable deploy kata timeout `200e54292` ci: k8s: shellcheck fixes to gha-run.sh `4af78be13` kata-deploy: re-format kata-[deploy\|cleanup].yaml `d54e6d9cd` ci: k8s: run_tests() for kcli 
`c2ef1f0fb` ci: k8s: add deploy-kata-kcli() to gh-run.sh `d2be8eef1` ci: k8s: add cleanup-kcli() to gha-run.sh `cbb9aa15b` ci: k8s: set default image for deploy_kata() `89bef7d03` ci: k8s: create k8s clusters with kcli `954d40cce` gha: combine coco jobs into a single yaml `b60e0a9b5` gha: combine basic amd64 jobs into a single yaml `e9bd85211` gha: ci: Revert tracing test PR to unbreak CI `b8a46a4b8` runtime-rs: ch: Enable feature `0f2dc8c67` gha: Add containerd stability tests to ci yaml `da91c9df8` ci: Port runk tests to this repo `7f2377276` ci: Add placeholder for runk tests `9205acc3d` ci: Move tracing tests here `85d290a04` gha: Add stability gha run script `54f0c8f88` gha: Add stability tests workflow for gha `3bb2923e5` ci: Add placeholder for tracing tests `2c3bf406d` ci: Create a function to install docker `119f03de2` gha: arm64: Ensure the builder is arm64-builder `8c498ef5e` metrics: Use jq tool to pretty-print json metrics output `a2159a636` metrics: Enables FIO test for kata containers `70e7ec3e2` gha: Fix k0s deployment `560bbffb5` packaging: tools: Remove `set -x` leftover `18fa483d9` packaging: release: Mention newly added images `ca3b88837` packaging: tools: Fix container image env var name `5ca66795c` packaging: Allow passing the TOOLS_CONTAINER_BUILDER `02acef957` gha: Build the kata-agent as part of our workflows `5208386ab` packaging: Build the kata-agent `1727487ee` agent: Allow specifying DESTDIR and AGENT_POLICY via env vars `45c118883` packaging: Add get_agent_image_name() `0db8fb8f9` versions: migrate out of k8s.gcr.io `a1a054367` doc: Fix spelling `6339605a1` tests: Add general stability fixes `59ae24444` doc: Update crictl pod-config `fd19f4082` tests: Add agent stability test `215577032` tests: Add cassandra stress in stability tests `f2d3ea988` tests: Add stressng dockerfile for stability tests `6493aa309` tests: Add stressor CPU test for stability tests `ef68a3a36` metrics: Add stability test for kata CI `7c934dc7d` gpu: Fix 
cold-plug of VFIO devices `8d66ef518` metrics: Increase qemu jitter value `5600e28b5` metrics: Increase jitter value for clh `a6b1f5e21` ci: Build src/tools components as part of our tests / releases `501a168a8` kata-deploy: Build components from src/tools `6ef42db5e` static-build: Add scripts to build content from src/tools `4d08ec29b` packaging: Add get_tools_image_name() `98097c96d` packaging: Use git abbreviated hash `489caf1ad` ci: kata-monitor: Move tests over `a3fb067f1` ci: Add placeholder for kata-monitor tests `57cb4ce20` ci: Make install_kata aware of container engines `de1eeee33` ci: Create a generic install_crio function `64a200085` ci: Add install_cni_plugins helper `8132fe15c` ci: Modify containerd default config `8cb7df1be` metrics: Add checkmetrics for latency test `e90440ae2` metrics: Add qemu latency value limit `a74a8f8a9` metrics: Add latency value limits for kata CI `d7def8317` metrics: Fix general check static warnings `928553d1b` docs: Update url in kata vra document `b0a3293d5` runtime-rs: ch: Enable Intel TDX `523399c32` runtime-rs: ch: Add more consts `dea806581` runtime-rs: ch: Remove unused function `995f2c015` runtime-rs: ch: Only handle particular pending device types `b1b96a5c4` runtime-rs: ch: Remove erroneous "virtio-blk-mmio" check `9ac29b8d3` metrics: Add init_env function to latency test `dfd0c9fa9` runtime: clh: Re-generate the client code `8f9f087e3` versions: Upgrade to Cloud Hypervisor v35.0 `81c8babca` metrics: Fix latency yamls path `481573682` metrics: Fix C-Ray documentation `ef63d67c4` ci: crio: Trail '\r' from exec_host() output `74c12b292` ci: crio: Enable default capabilities `358dc2f56` kata-deploy: Fix CRI-O detection `ebaa4fa4c` ci: crio: Pass `-y` to apt `97e73b223` metrics: Fix spelling warnings `36c8cd6f1` metrics: Fix metrics README `15425a2b8` local-build: Fix .docker ownership before build-payload `13ca7d9f9` gha: Add pandoc as a dependency for static checks `08bc8e4db` metrics: Add latency benchmark for gha 
`6776b55d7` metrics: Enable latency test in gha run script `94e2ccc2d` runtime: fix reading cgroup stats of sandboxes `d507d189b` fc: Add support for noflush cache option `2ca781518` clh: Direct IO support for block devices `0c95697cc` ci: Trigger payload-after-push on workflow_dispatch `28cbc3b51` ci: rootfs-image build-asset is failing Fixes: #8027 `87a861648` gha: Install hunspell for static checks `8c3c50ca8` ci: Actually enable the CRI-O tests `3a6510ad6` osbuild: Reduce guest components binary size with strip `07a6e63a6` ci: k8s: rke2: Use sudo to call systemd `03b82e848` ci: k8s: Add a CRI-O test `d7105cf7a` ci: k8s: Add a method to install CRI-O `54c0a471b` ci: k8s: k0s: Allow passing parameters to the k0s installer `730ef5169` deps: updating dependencies `3a2c83d69` ci: kata-deploy: Fix runner name `82ff2db46` runtime: support kernel params including spaces `604a9dd67` protocol: remove gogoprotobuff tests `f7fa7f602` ci: Enable kata-deploy tests for all the supported k8s flavours `2c908b598` ci: kata-deploy: Add the ability to deploy rke2 `eaf616491` ci: kata-deploy: Add the ability to deploy k0s `001525763` ci: kata-deploy: Add deploy-k8s argument to gha-run.sh `bf2cb0228` ci: kata-deploy: Expland tests to run on k0s / rke2 `b12b9e188` ci: kata-deploy: Add placeholder for tests on GARM `9e1fb8a96` ci: kata-deploy: Export KUBERNETES env var `09cc0ed43` ci: Move deploy_k8s() to gha-run-k8s-common.sh `486fe14c9` ci: Properly set K8S_TEST_UNION `d9ef1352a` ci: Add first letter of the K8S_TEST_HOST_TYPE to resource group name `68267a399` ci: Create clusters in individual resource groups `9aa8d1c91` metrics: Add parallel bandwidth limit for qemu `44c7c082d` versions: Bump virtiofsd to v1.8.0 `af59d4bf4` metrics: Enable parallel bandwidth iperf limit `aba36ab18` nydus: Temporarily skip tests on dragonball `b8a8dfcd1` nydus: Use `kata-${KATA_HYPERVISOR}` instead of `kata` `f6df3d6ef` static-build: Fix arch error on nydus build `2f9c9e2e6` tests: nydus: Update 
nydus tests `c9a4e7e46` versions: Bump nydus and nydus-snapshotter to its latest release `b73bde320` gha: nydus: Populate run() `b3904a1a3` gha: nydus: Populate install_dependencies() `d2b3b67f5` gha: nydus: Actually install kata when `install-kata` is called `0ec00ad42` gha: nydus: Get rid of nydus{,-snapshotter} install from nydus_test.sh `568439c77` tests: nydus: Add timeout to the crictl calls `5ac3b76eb` tests: nydus: Add uid / namespace to the nydus container / sandbox `376574a16` tests: nydus: Decorate some calls with `sudo` `4290fd4b6` tests: nydus: Adapt "source ..." to GHA `a84efa3e8` tests: nydus: Adapt check to "clh" instead "cloud-hypervisor" `56a14b395` tests: common: Add install_nydus_snapshotter() `b6563783e` tests: common: Add install_nydus() `72599f191` clh: arm: Use static_sandbox_resource_mgmt=true `1f16b6627` runtime/qemu: Rework QMP/HMP support `8b1e9b0c7` ci: static-checks: Clean up static-checks job `2c5ca2eaf` ci: static-checks: Run tests depending on KVM `509c309ab` ci: static-checks: Move "sudo make test" to the new test matrix `4e963cedf` ci: static-checks: Move "make test" to the new test matrix `08f2e5ae0` runtime-rs: Ensure static-checks-build is a dep of `make test` `2bc3a616a` kata-ctl: Use `loop` instead of `kvm` module in tests `46daddc50` kata-ctl: Ensure GENERATED_CODE is a dep of `make test` `ec826f328` agent: Ensure GENERATED_CODE is a dep of `make test` `1d32410a8` ci: install_libseccomp: Do not depend on the tests repo `bf888b9a5` ci: static-checks: Move "make check" to the new test matrix `473ec8780` kata-ctl: Add `kata-types` to the Cargo.lock file `ea19549a9` kata-ctl: Ensure GENERATED_CODE is a dep of `make check` `e12577586` tests: install_rust: Also install clippy `e2c61a152` ci: static-checks: Move vendor check to its own job `6794d4c84` tests: Move install_rust.sh from the tests repo `e64508c30` tests: install_go: Remove tests repo dependency `11dff731b` tests: Move functions from kata_arch script here `75c974c80` 
ci: static-checks: Move kernel config check to its own job `9c233bb9e` test: Add test to verify try_from for clh Netconfig `c69a1e33b` ci: Use variable size of VMs depending on the tests running `9049d311d` runtime-rs: Add network support for cloud-hypervisor `eecd5bf2a` ci: cache: Fix ovmf-sev cache `86c41074b` ci: cache: Check the sha256sum of the component `460988c5f` ci: cache: Remove the script used to cache artefacts on Jenkins `4533a7a41` ci: cache: Also store the ${component} sha256sum `eccc76df6` ci: cache: Use the cached artefacts from ORAS `7f5e77bcb` kernel: enable Arm pl011 support `241c355e0` clh:arm64: use arm AMBA uart for hypervisor debug `094b6b2cf` ci: k8s: Temporarily disable tests that require a bigger VM instance `d0c257b3a` ci: cache: Push cached artefacts to ghcr.io `108f1b60d` kata-deploy: Generate latest_{artefact,image_builder} files `be2eb7b37` ci: cache: Install ORAS in the kata-deploy binaries builder container `fb24fb0dc` ci: k8s: devmapper: Use a smaller / cheaper VM instance `1daf02f5d` ci: nydus: Use a smaller / cheaper VM instance `e60d81f55` ci: nerdctl: Use a smaller / cheaper VM instance `4db416997` ci: docker: Use a smaller / cheaper VM instance `32841827b` ci: cri-containerd: Use a smaller / cheaper VM instance `92fff129f` ci: k8s: Don't set cpu limit request for k8s-inotofy test `faf98c062` ci: Reduce the size of the AKS VMs `adc18ecdb` ci: cache: For consistency, read all used env vars `c7a851efd` ci: cache: Pass the exposed env vars to the kata-deploy binaries in docker `6bd15a85d` ci: cache: Export env vars needed to use ORAS `cd4fd1292` metrics: Add iperf cpu utilization limit for qemu `df5cd10ea` metrics: Add iperf value for cpu utilization `a96050a7a` tests: Apply timeout to 'ctr t kill' `9d9303678` tests/vfio: Bump VM image to Fedora 38 `faee59b52` tests/vfio: Accept single device in vfio group for CLH `df3dc1105` tests/vfio: Get rid of sync's `7211c3dcc` gha: vfio: Set test timeout to 15m `1b02f89e4` packaging: 
kernel: Enable VIRTIO_IOMMU on x86_64 `3a1db7a86` runtime: clh: Support enabling iommu `9f1a42c6c` tests/vfio: Give commands 30s to execute `b46b0ecf8` tests/vfio: Configure a value for 'hot_plug_vfio' for both vmms `bfc93927f` runtime: Remove redundant check in checkPCIeConfig `7c4e73b60` runtime: Add test cases for checkPCIeConfig `fc51e4b9e` runtime: Check config for supported CLH (cold\|hot)_plug_vfio values `509771e6f` runtime: clh: Add hot_plug_vfio entry to config `5f6475a28` tests/vfio: Gather debug info and disable tdp_mmu `8fffdc81c` tests/vfio: Capture journal from vm `df815087e` tests/vfio: Change to get the test working in GHA `a92ddeea1` tests/vfio: Move dependency installation to gha-run.sh `5a551a85b` gha: vfio: Import jobs scripts from tests repo `49e2fa189` metrics: Increase jitter value for qemu `49234433a` metrics: Increase value limit for jitter in clh `813bfdec0` ci: docker: nerdtl: Use io.containerd.kata-${KATA_HYPERVISOR}.io `46bc0b1c0` ci: nerdctl: Create the containerd config `13968aa7f` ci: nerdctl: Switch to tcp port 80 ping `e0c811678` ci: docker: Switch to tcp port 80 ping `1636abbe1` runtime: issue with non-empty []Endpoint in RemoveEndpoints `0aa073967` metrics: Add iperf bandwidth value for qemu `c0ad91476` tests: fix kernel and initrd annotations `615c1cbf1` metrics: Add iperf bandwidth value for kata metrics `d53eb73ee` metrics: Ensure docker is running in init_env `ad08321b8` metrics: Add Cassandra Metrics documentation `a58ea6659` metrics: this PR skips the FIO test temprarily to fix issues `f536ef5ce` ci: docker: Also run the smoke test with runc `c83f167c5` ci: docker: Run the tests after the kata-static is created `12d833d07` ci: Add a very basic nerdctl sanity test `348b8644d` ci: Add a very basic docker sanity test `a75fd5eb8` runk: Fix rust unecessary mut error `a31c14517` kata-ctl: useless-vec warning `c8419fc3b` kata-ctl: Resolve non-minimal-cfg warning `3eaf68d95` agent-ctl: Allow clippy lint `1d8b78959` runtime-rs: Fix 
useless-vec warning `99f3d69e9` runtime-rs: Remove mut `16fbc27b0` dragonball: Allow ambiguous-glob-reexports `bbf191951` dragonball: Resolve non-minimal-cfg warning `75cfdd5d5` agent: config: Allow clippy lint `f3a0fd590` agent: config: Fix useles-vec warning `9e423bd3d` libs: Fix clippy unnecesary hashes error `444395050` versions: Bump rust version `a16b0962b` chore(cargo): update cargo lock `ca4b6b051` runtime: Naming conflict of network devices `202049f35` feat(runtime-rs): introduce huge page type to select VM RAM's backend `f811b064c` ci: use github.ref_name instead of $GITHUB_REF_NAME `6d795c089` ci: Add more target-branch related fixes `8509c3187` ci: Fix target-branch usage `060499dca` metrics: Remove warning from metrics documentation `c0f697fcc` runtime: Allow kernel_params annotation `b03e49794` dragonball: fix for non-deterministic builds `976d10150` runtime-rs: hypervisor: Remove debug kernel options `fde34610c` kernel: Add erofs patches needed for CC related work `dc6a4588a` versions: Bump kernel to the latest LTS release (6.1.52) `52f6449b7` kata-manager: Remove initcall_debug kernel option `8b4a0b368` kata-deploy: Remove curl after it's used `139c7f03a` kata-deploy: Fix aarch64 image build `470d06541` agent: optimize the code of systemd cgroup manager `bd24afcf7` gha: Manually rebase PR atop of the target branch before testing `72c510d05` runtime/virtiofsd: Drop all references to "--cache=none" `ead724bec` protocol: removing gogo.nullable feature `d8e4bb985` protocol: remove unused PROTO_FILE env `5e1106a77` protocol: remove unused import_path `87accaaec` protocol: use workdir during build `711a7ed96` protocol: remove mapping definitions `8db84c1bd` protocol: force GOPATH to be set `68156d77a` protocol: breaking lines to improve readability `670a8e9c7` kata-deploy: Switch to an alpine image `9d74b7ccc` k8s: ci: Skip "Pod quota" test with firecracker `f6cd3930c` ci: k8s: Remove useless skip statement from tests `3cc20b47a` ci: k8s: Also check for 
"fc" (for firecracker) `b5bad3cb0` ci: k8s: Add clean-up-garm argument for gha-run.sh `aaec5a09f` ci: k8s: devmapper tests should be using ubuntu 20.04 `27fa7d828` ci: k8s: Add a kata-deploy-garm target `fa62a4c01` ci: k8s: Export KUBERNETES env var `8c9380a79` ci: k8s: Install bats on GARM runners `3de23034f` ci: k8s: Wait some time after restarting k3s `adfea55b8` metrics: fix FIO test initialization `2df183fd9` ci: k8s: Append, instead of overwrite, the devmapper config `369a8af8f` ci: k8s: Decrease k3s sleep from 4 to 2 minutes `ada65b988` ci: k8s: Use vanilla kubectl with k3s `ad45ab5d3` ci: k8s: Ensure k3s is deploy with --write-kubeconfig-mode=644 `028a97e0d` ci: k8s: Use the proper command for sleep `3a427795e` metrics: Use TensorFlow optimized image `8d99972a8` ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml `deed1b927` Dragonball: optimize the placement of dbs-upcall features `0e8bd50cb` ci: k8s: Add k8s devmapper tests (part 0) `b28b54df0` ci: k8s: Add a function to configure devmapper for containerd `54f711721` ci: k8s: Add a function to deploy k3s `81536f21a` runtime/qemu: Pass "--xattr" to virtiofsd instead of "-o xattr" `b1dd09a4d` runtime: Allow virtio_fs_extra_args annotation `2efda20c7` packaging: do not install docker-compose-plugin for s390x\|ppc64le `438fbf966` metrics: Add write 95 percentile for FIO for qemu `024b4d2ff` metrics: Add write 95 percentile FIO value `e98e5cdea` metrics: Add checkmetrics to gha run script `c1edfe551` metrics: Add checkmetrics value for qemu for iperf `6a79ecedf` metrics: Add jitter value for clh `f609a9a75` metrics: Add test selector to iperf metrics `5b8db3042` metrics: Enable iperf benchmark on gha for kata metrics `60f733d30` CI: switch static-checks-dragonball CI machines to Azure `7870b33a2` runtime-rs: bring hybridVsock devices in manager. 
`18c94ebbe` kata-deploy: Create kata-static.tar with correct ownership `57e7bf14a` agent: refine StorageDeviceGeneric::cleanup() `53edb1937` agent: implement StorageDeviceGeneric::cleanup() `0c63453e2` types: make StorageDevice::cleanup() return possible error code `3a3d77b3b` agent: move StorageDeviceGeneric from kata-types into agent `b151cfd14` metrics: re-enable memory-usage initialization step `f3e1a6a94` osbuilder: alpine: Change mirror `ac612aef5` osbuilder: alpine: Match the version on versions.yaml `9cd706d1c` agent: avoid possible leakage of storage device `bf21411e9` tests: add policy to k8s tests `d0e061067` runtime: config: use the SEV initrd for SNP `67fed26f1` runtime: Use TDX image with in the qemu-tdx config `ac939c458` gha: Rebase atop of the target branch `82cd14ba3` versions: Update alpine to its 3.18 version `666882575` metrics: Add grabdata script for metrics report `c290eaed8` kata-sys-util: protection: Update TDX checks `d7a996c68` gha: Update to checkout@v3 action `c2ba29c15` runtime: Fix data race in ioCopy `211de08d9` osbuilder: Remove chcon operation for guest SELinux `9f21fa9b3` metrics: Add report generator link to general documentation `c0ed5ea0a` metrics: Add README for kata metrics report `a7b59a5bf` metrics: Add limit for 90 percentile for qemu value `99db6568e` metrics: Add limit for write 90 percentile value for clh `6e06392c5` metrics: Enable FIO limits for kata metrics `2e4c87472` runtime/vc: runPrestartHooks should ignore GetHypervisorPid failure `21204caf2` runtime: fail early when starting docker container with FC `32fd01371` runtime: run prestart hooks before starting VM for FC `00e7ffd98` tests: check vmx only on Intel machines `c8dd3c073` metrics: Fix memory footprint qemu limit `8877ec62f` metrics: Fix memory inside limits for kata metrics `80146f207` tests: Fixes cpuType check on AMD machines `7e364716d` metrics: Add test setup details to metrics report `17dc1b976` metrics: Add boot lifecycle times to metrics report 
`3b0d6538f` metrics: Add memory inside container to metrics report `79fbb9d24` metrics: Add scaling system footprint in metrics report `8e6d4e6f3` metrics: Add metrics reportgen `139ffd4f7` metrics: Add report file titles `878d1a2e7` metrics: Generate PNGs alongside the PDF report `fce248797` metrics: Add metrics report R files `08812074d` metrics: Add report dockerfile `69781fc02` metrics: Add metrics report script `e286e842c` tests: Expand confidential test to support TDX `e31f099be` tests: Expand confidential test to support SNP `c3b9d4945` tests: Add confidential test for SEV `538c965c2` metrics: fix parsing issue on memory-usage test `3818bf331` local-build: Remove $HOME/.docker/buildx/activity/default `d1b54ede2` qemu: tdx: Workaround SMP issue with TDX 1.5 `1e34220c4` qemu: tdx: Adapt to the TDX 1.5 stack `8115a0522` versions: tdx: Update Kernel to 6.2 + TDX `ec18180f3` versions: tdx: Update TDVF to the "edk2-stable202302" `9803b2428` versions: tdx: Update QEMU to v7.2 + TDX v1.10 `dffc16e5b` runtime-rs: check peer close in log_forwarder `aaa5ab126` agent: simplify storage device by removing StorageDeviceObject `fb49d5d7c` gha: Avoid "fail-fast" in tests that are known to be flaky `183f51d6f` tests: use unique test name `6a974679f` tests: delete k8s deployment at the test's end `32a778b6d` metrics: Remove unused variable in tensorflow nhwc script `d8f3ce649` kata-deploy: Don't try to remove /opt/kata `936e8091a` gha: vfio: Run on Ubuntu 23.04 runner `0e7248264` agent: move storage device related code into dedicated files `268e84655` runtime-rs: Fix volumes and rootfs cleanup issues `8f49ee33b` agent: refine storage related code a bit `60ca12ccb` agent: switch to new storage subsystem `fcbda0b41` kata-types: introduce StorageDevice and StorageHandlerManager `b03b1f613` agent: simplify the way to manage storage object `8392c71bf` sys-util: support more mount flags in parse_mount_options() `c00d8f3d4` agent: use create_mount_destination() from kata-sys-util 
`5e867f053` types: add more mount related constants `880e6c9a7` agent: use function from kata-sys-utils to reduce code `3b881fbc0` local-build: Remove GID before creating group `959ca4944` metrics: Add TensorFlow ResNet50 fp32 Dockerfile `4b7d72c4a` metrics: Add TensorFlow ResNet50 FP32 benchmark `5cba38c17` kata-deploy: Avoid failing on content removal `18d42da21` runtime/fc: fix image/initrd annotation handling `9fda7059a` runtime/clh: fix image/initrd annotation handling `1a0092d63` runtime/qemu: fix image/initrd annotation handling `22d8f335d` libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml `8afd158ce` metrics: Add disk link to README `40914b25d` kata-agent: use default filemode for block device when it is set to 0 `eee2ee6ee` metrics: Fix FIO path `39bc3488f` metrics: Use function from metrics common in pytorch script `400eb8874` gha: capture additional kata-deploy output `4aee3eade` kata-types: implement serde methods for KataVirtualVolume `b875e3932` kata-types: validate KataVirtualVolume object `fa2fdc105` kata-types: implement two conversion helpers for KataVirtualVolume `6326af20e` kata-types: introduce KataVirtualVolume `c8b43f8b3` metrics: Fix README for pytorch `fb571f8be` metrics: Enable kata runtime in K8s for FIO test. 
`cb056f8cb` rootfs: agent: Policy support with AGENT_INIT=yes `85c02828e` metrics: Update tensorflow name in gha run script `e8a511934` metrics: Fix check results for tensorflow benchmark `2d896ad12` gha: kata-deploy: Do the runtime class cleanup as part of the cleanup `4ffc2c86f` gha: kata-deploy: Add the first kata-deploy test `8616c050a` metrics: Remove unused variable in tensorflow mobilenet script `285e616b5` tests: common: Ensure test_type is used as part of the cluster's name `790bd3548` tests: commob: Don't fail if yq is not part of the cache `ce6adecd0` gha: kata-deploy: Add run-kata-deploy-tests.sh `cfc29c11a` gha: k8s: Stop running kata-deploy tests as part of the k8s suite `f4dd15286` tests: k8s: Call ensure_yq() in setup.sh `339569b69` kata-deploy: Properly create default runtime class `2a491e9b1` metrics: Fix MobileNet help me description `d19a75e80` gha: ci: Start running kata-deploy tests `d90f7ac68` runtime-rs: add unit test for block driver `e44919f0d` runtime-rs: add load_test_config for unit test `7f48a6937` runtime-rs: add driver option `bade6a5c3` docs: Fix TensorFlow word across the document `1a1b20776` docs: Add Tensorflow Resnet50 documentation `24baededc` metrics: Add Dockerfile for ResNet50 int8 `6d971ba8d` metrics: Add Tensorflow ResNet50 int8 benchmark `25d151bd1` runk: Modify kill command's error message for containerd tests `b3592ab25` gha: cri-containerd: Enable tests `84dd02e0f` gha: cri-containerd: Add timeout to the crictl calls on testContainerStop `b29782984` gha: cri-containerd: Show pod before deleting it `ae0930824` gha: cri-containerd: Print kata logs in case of error `6c8b2ffa6` gha: cri-containerd: Group containerd logs `9e898701f` gha: cri-containerd: Ensure RUNTIME takes KATA_HYPERVISOR into account `76dac8f22` agent: simplify error handling `18a7fd8e4` metrics: Rename tensorflow scripts `e55fa93db` tests: kata-deploy: Add placeholder for kata-deploy-tests-on-tdx `d9ee17aae` tests: kata-deploy: Add placeholder for 
kata-deploy-tests-on-aks `ab829d103` agent: runtime: add the Agent Policy feature `831e73ff9` tests: kata-deploy: Add functional/kata-deploy/gha-run.sh placeholder `af1b46bbf` tests: Add gha-run-k8s-common.sh `416445e7e` docs: Remove installation step in virtcontainers doc `72cbcf040` kata-deploy: Add k0s support `767434d50` metrics: fix the loop used to stop kata components #7629 `5d0f0d43c` metrics: Add cassandra statefulset yaml `c1dcc1396` metrics: Add cassandra service yaml `2297a0d1c` metrics: Add block loop pvc yaml for cassandra `e3d511946` metrics: Add block loop pv yaml for cassandra test `989027159` metrics: Add block loop pvc for cassandra test `349b89969` metrics: Add Cassandra Kubernetes benchmark for kata metrics `c52d09052` gha: static-checks: Move to the Azure instances `8815ed066` runtime: Remove config warnings `afe1a6ac5` agent: support copying of directories and symlinks `ab13ef87e` runtime: propagate configmap/secrets etc changes for remote-hyp `c074ec4df` runtime: Copy shared files recursively `fdcd52ff7` metrics: Add check containers are running in tensorflow mobilenet `36337ee14` metrics: Add check containers are up in tensorflow script `f700f9b0b` metrics: Remove unused variable in tensorflow script `833cf7a68` metrics: Add check containers are running function `918c78308` metrics: Add check containers are up in tensorflow mobilenet script `9d57a1fab` metrics: Use check containers are up in tensorflow script `1c84680d8` metrics: Add check containers are up in common script `d3e57cf45` metrics: Use collect_results function in tensorflow mobilenet test `286de046a` metrics: Remove collect results function definition `9879709aa` metrics: Add common functions to the common script `4746fa3da` docs: Specify supported Firecracker version using `versions.yaml` `cc922be5e` versions: Update firecracker version to 1.4.0 `39e67b06e` dragonball: vsock add fifo/pipe stream support for passed fd hybridStream `473b0d3a3` metrics: compute tensorflow 
statistics `03d1fa67b` ci: unencrypted-image: Fix build context `eb463b38e` ci: unencrypted-image: Don't fail to build on s390x `a2d731ad2` ci: create-confidential-image: Add dependent actions `d1a629622` metrics: Add nginx documentation to network README `498f7c054` metrics: Add nginx kubernetes yaml `f8a5255cf` metrics: Add network nginx benchmark `43fe5d1b9` ci: k8s: tees: Ensure PR_NUMBER is exported `54f6a7850` ci: {{ pr-number }} should be {{ inputs.pr-number }} `034d7aab8` tests: k8s: Ensure the runtime classes are properly created `fac8ccf5c` ci: Add build-and-publish-tee-confidential-unencrypted-image `ab5f603ff` ci: k8s: Add the image used for unencrypted confidential tests `1e8fe131b` k8s: tests: Take advantage of `SHIMS` and `DEFAULT_SHIM` env vars `729b2dd61` agent: avoid creating new `Vec` instances when easily avoidable `aeaec9dae` tests: upgrade bats version `e66496986` metrics: install kata once and run multiple checks `baabfa9f1` agent: refine implementation of mount related code `98ba211a3` agent: fix a bug in update_ephemeral_mounts() `5333618d7` agent: make add_storage() take &[Storage] instead of Vec<Storage> `37f34781d` agent: simplify function online_cpu_memory() `d3c542237` agent: refine style of code related to sandbox `71a9f6778` agent: avoid unwrap() in function do_remove_container() `84badd89d` agent: avoid clone objects when possible `b23c5ed15` deps: Bump dependent crate versions `863283716` metrics: General improvements to mobilenet tensorflow test `3c319d8d4` metrics: Add iperf to gha run script `5b5caf890` gha: Add iperf network metrics `66db5b535` metrics: Add latency test to network README `c36572418` agent: avoid unnecessary calls to `Arc::clone` `4fbe0a3a5` runtime: bind-mount mounted block device into container `7e1b1949d` runtime: add support for kata overlays `6c867d9e8` agent: add io.katacontainers.fs-opt.overlay-rw option `6163c3565` agent: skip mount options that start with "io.katacontainers." 
`b2ff97aa0` dragonball: use version 0.10.4 of `fuse-backend-rs` `845eeb4d7` agent: Allow clippy::redundant_clone in the unit tests `1163fc9de` release: Revert kata-deploy changes after 3.2.0-rc0 release `3958a39d0` runtime-rs: Introduce directly attachable network `1e15369e5` metrics: Improve naming testing containers in launch times test `5dbe88330` metrics: Clean kata components before start a metric test. `3b45060b6` metrics: Add latency server yaml `9bb8451df` metrics: Add latency client yaml `64fdb9870` metrics: Add network latency test `a81ad3b58` runtime-rs: Add block device handling in cloud hypervisor `3230dec95` kata-deploy: Use host's systemctl `1b21a4624` docs: Use control-plane term instead of master `28e5e9c86` runtime-rs: fix number of queues handling in dragonball share fs device `f1d8de9be` runk: Allow runk to launch a container without pid namespace Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-20 14:44:50 +02:00
Fabiano Fidêncio	f6e20ac230	Merge pull request #7195 from fidencio/topic/adapt-kata-deploy-stable-to-using-ubuntu kata-deploy-stable: Switch to using the ubuntu based payload	2023-10-20 14:42:04 +02:00
Fabiano Fidêncio	a93fdb014b	kata-deploy-stable: Adapt to what we're using in the stable branch This is basically to make sure that folks trying to use the kata-deploy script from the main branch, to deploy stable kata-deploy images, do not have a hard time. Fixes: #7194 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-20 12:58:42 +02:00
James O. D. Hunt	79ed501a20	Merge pull request #8258 from jodh-intel/protection-fix-tdx-typo libs: protection: Fix typo in TDX output	2023-10-20 08:36:22 +01:00
Dan Mihai	52aaf10759	agent: no endpoint blocking from agent-config.toml Remove the ability to block access to kata agent endpoints by using agent-config.toml. That functionality is now implemented using the Agent Policy feature (#7573). The CCv0 branch relied on blocking endpoints using agent-config.toml but will set-up an equivalent default policy file instead (#8219). Fixes: #8228 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-10-20 02:26:54 +00:00
Fabiano Fidêncio	468a3e4b53	Merge pull request #8260 from gkurz/fix-8259 ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat	2023-10-19 23:58:22 +02:00
GabyCT	5d6bdbd0a1	Merge pull request #8241 from GabyCT/topic/enableagenttest tests: Enable agent stability test	2023-10-19 14:12:49 -06:00
Greg Kurz	36109da93f	ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat Fixes #8259 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-10-19 21:53:23 +02:00
GabyCT	dc295600b8	Merge pull request #8157 from GabyCT/topic/fixsevdoc docs: Fix paths to build kernel in SNP VMs documentation	2023-10-19 11:42:03 -06:00
Gabriela Cervantes	d01daf749b	tests: Adjust timeout for agent stability test This PR adjusts the timeout for the agent stability test to run on the gha. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-19 16:55:23 +00:00
James O. D. Hunt	9b14dda147	libs: protection: Fix typo in TDX output Add the missing closing bracket to the output of the TDX details, so rather than: ```bash $ sudo kata-ctl env 2>/dev/null \| grep available_guest_protection available_guest_protection = "tdx (major_version: 1, minor_version: 0" : ^ : Missing ')' ! ``` ... we now have: ```bash $ sudo kata-ctl env 2>/dev/null \| grep available_guest_protection available_guest_protection = "tdx (major_version: 1, minor_version: 0)" : ^ : Aha! ``` Added a unit test for this scenario. Fixes: #8257. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-19 16:06:08 +01:00
James O. D. Hunt	9336e2e492	Merge pull request #8155 from jodh-intel/runtime-rs-check-ch-tdx-build-feature runtime-rs: ch: Add TDX CH features check	2023-10-19 14:13:08 +01:00
James O. D. Hunt	048cc70654	Merge pull request #8213 from jodh-intel/validate-hypervisor-cfg-name runtime: Validate hypervisor section name in config file	2023-10-19 07:40:58 +01:00
Dan Mihai	99db6dff24	Merge pull request #8230 from microsoft/danmihai1/opa-data tests: query data from the OPA service	2023-10-18 15:32:23 -07:00
James O. D. Hunt	0e0867f15d	runtime-rs: ch: Add TDX CH features check If you attempt to create a container (a TD) on a TDX system using a custom build of Cloud Hypervisor (CH) that was not built with the `tdx` CH feature, Kata will report the following, somewhat cryptic, CH error: ``` ApiError(VmBoot(InvalidPayload)) ``` Newer versions of CH now report their build-time features in the ping API response message so we now use that, if available, to detect this scenario and generate a user-friendly error message instead. This change improves the readability of `handle_guest_protection()` and adds a couple of additional tests for that method. Fixes: #8152. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-18 18:07:39 +01:00
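The check described in the commit above can be illustrated with a minimal sketch. It is written in Python for brevity rather than the Rust used by runtime-rs, and every name in it (`check_tdx_build_feature`, the `features` argument) is hypothetical. The only details taken from the commit message are that newer Cloud Hypervisor builds report their build-time features in the ping response and that a missing `tdx` feature should produce a readable error.

```python
from typing import Optional, Sequence


def check_tdx_build_feature(
    confidential_guest: bool,
    features: Optional[Sequence[str]],
) -> Optional[str]:
    """Return a user-friendly error string, or None if no problem is detected.

    `features` stands in for the build-time feature list reported by the
    hypervisor (an assumption about its shape); None models an older build
    that does not report features at all.
    """
    if not confidential_guest:
        # No TD is being created, so the TDX build feature is irrelevant.
        return None
    if features is None:
        # Older build: the feature list is unavailable, so nothing can be checked.
        return None
    if "tdx" not in features:
        return (
            "Cloud Hypervisor was not built with the 'tdx' feature; "
            "cannot create a confidential guest"
        )
    return None
```

The function returns `None` when the feature list is unavailable, mirroring the "if available" wording of the commit message: the check is best-effort and only fires when the hypervisor provides enough information.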
James O. D. Hunt	409eadddb2	runtime-rs: ch: Improve readability of guest protection checks Improve the way `handle_guest_protection()` is structured by inverting the logic and checking the value of the `confidential_guest` setting before checking the guest protection. This makes the code easier to understand. Notes: - This change also unconditionally saves the available guest protection (where previously it was only saved when `confidential_guest=true`). This explains the minor unit test fix. - This change also errors if the CH driver finds an unexpected protection (since only Intel TDX is currently tested). Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-18 18:06:02 +01:00
Greg Kurz	9863805752	Merge pull request #8201 from fidencio/topic/release-tag-repo-stop-tagging-the-tests-repo release: tag_repos: Stop tagging the `tests` repo	2023-10-18 18:10:39 +02:00
Gabriela Cervantes	a58afe70b8	metrics: Add iperf udp benchmark This PR adds the iperf UDP benchmark for bandwidth measurement for network metrics. Fixes #8246 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-18 15:52:03 +00:00
Jianyong Wu	f9c9d8f645	runtime: QemuVirt: hotadd virtio-mem dev to pcie root port Hotplug virtio-mem device to pcie root port for Qemu Virt. Fixes: #7646 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-10-18 06:35:57 +00:00
Jianyong Wu	ef18c9550c	runtime:qemuvirt: hotadd net dev to pcie root port Hotplug network device to pcie root port as this is the only way on QemuVirt. Fixes: #7646 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-10-18 06:35:57 +00:00
Jianyong Wu	f1aec98f9d	qemu/virt: use pcie_root_port to do device hotplug for virt ACPI PCI device hotplug is not supported on the QEMU virt machine. The only way to hotplug a PCI device is the native PCIe way, so PCIe root ports need to be created by default. The number of PCIe root ports depends on the following: 1. one port reserved for a network device by default; 2. the virtio-mem device; 3. enough ports for vhost-user-blk devices. Fixes: #7646 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-10-18 06:35:57 +00:00
Jianyong Wu	28a41e1d16	runtime: add a new API for Network interface Add GetEndpointsNum API for Network Interface to get the number of network endpoints. This is used to calculate the number of PCIe root ports for QemuVirt. Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-10-18 06:35:57 +00:00
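As a rough illustration of the root port accounting listed in the commit above, the sketch below adds up the three contributions. It is written in Python rather than the Go of the runtime, the function and parameter names are hypothetical, and the exact arithmetic (in particular taking the larger of one and the number of network endpoints) is an assumption rather than the runtime's actual formula.

```python
def pcie_root_port_count(
    network_endpoints: int,
    virtio_mem_enabled: bool,
    vhost_user_blk_devices: int,
) -> int:
    """Estimate how many PCIe root ports to create up front (illustrative only)."""
    # 1. At least one port is reserved for a network device by default.
    ports = max(1, network_endpoints)
    # 2. One more port if a virtio-mem device may be hotplugged.
    if virtio_mem_enabled:
        ports += 1
    # 3. One port per vhost-user-blk device.
    ports += vhost_user_blk_devices
    return ports
```

The ports have to be counted ahead of time because, per the commit message, native PCIe hotplug into a root port is the only hotplug path available on this machine type.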
Songqian Li	09d46450f1	dragonball: add metrics support for balloon device Fixes: #7248 Signed-off-by: Songqian Li <mail@lisongqian.cn>	2023-10-18 14:02:56 +08:00
Gabriela Cervantes	82a0814fc2	tests: Enable agent stability test This PR enables the agent stability test for stability gha CI. Fixes #8240 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-17 15:16:06 +00:00
Dan Mihai	32be8e3a87	tests: query data from the OPA service Add example for querying json data from the OPA service. Fixes: #8231 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-10-17 13:31:43 +00:00
David Esparza	d90d1c5c10	Merge pull request #8243 from dborquez/fix_systemctl_masked_query metrics: fixes common.sh function to always return true	2023-10-16 20:17:24 -06:00
Dan Mihai	b81c0a6693	tests: encode policy file during test Encode policy file during test - easier to understand than hard-coding the encoded file contents. Fixes: #8214 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-10-16 15:58:12 -07:00
David Esparza	4f9681b411	metrics: fixes common.sh function to always return true This PR corrects the init_env() helper function so that systemctl always returns true when enumerating masked services, preventing the test from failing. Fixes: #8242 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-16 15:57:57 -06:00
David Esparza	59e8b1d5a7	Merge pull request #8206 from dborquez/memory_footprint_test_removing_trailing_commas_to_make_json_results_file_valid Memory footprint test removing trailing commas to make json results file valid	2023-10-16 14:31:28 -06:00
Gabriela Cervantes	2ef2b2a6dc	docs: Fix paths to build kernel in SNP VMs documentation This PR corrects the paths used to set up, build, and install the kernel for SNP. Fixes #8156 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-16 20:09:02 +00:00
Fabiano Fidêncio	db37692f36	Merge pull request #8226 from microsoft/danmihai1/policy-typo policy: allow access to ReseedRandomDev	2023-10-16 19:17:31 +02:00
Peng Tao	45e82b6581	Merge pull request #8192 from bergwolf/github/deps runtime/kata-ctl: update dependencies	2023-10-16 16:39:17 +08:00
Chao Wu	44e602d69a	Merge pull request #8014 from openanolis/chao/fix_nydus_break runtime-rs : fix Nydus support for runtime-rs + Dragonball	2023-10-16 01:30:22 -05:00
Chao Wu	408b59c02c	runtime-rs: fix bugs to support Nydus v5 1. Enable virtio-fs-pro in Dragonball so that it can process the nydus backend registry. 2. Change the readonly setting in the rw layer's passthrough config to false so that the layer has accurate read-write ability. Fixes: #8013 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-10-16 10:22:21 +08:00
Chao Wu	157caea9fe	Revert "nydus: Temporarily skip tests on dragonball" This reverts commit `aba36ab188`. Fixes: #8013 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-10-16 10:22:21 +08:00
Chao Wu	678fe3cd31	Dragonball: fix Nydus config serde problem Since the Nydus snapshotter was updated in previous commits, the config passed through to Dragonball during mount_rafs is now RafsConfig instead of ConfigV2. Dragonball could only deserialize ConfigV2, so it would panic. We need to add support for RafsConfig. Fixes: #8013 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-10-16 10:22:21 +08:00
Dan Mihai	b6ec621389	policy: allow access to ReseedRandomDev Allow access to the ReseedRandomDev endpoint by default. Using false for ReseedRandomDevRequest was unintended. Fixes: #8225 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-10-13 21:18:27 +00:00
David Esparza	908519db9d	metrics: skips docker restart when it is not installed or is masked. To avoid errors when initializing the test environment, the kill_processes_before_start() helper function needs to verify that docker is installed before attempting to stop it. Fixes: #8218 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-13 18:02:00 +00:00
David Esparza	c2763120aa	metrics: removing trailing comma characters from json file. This PR removes trailing commas so that the json results file is valid. This PR also changes the way data results are collected by iterating through the array of memory values to calculate their average. Fixes: #8204 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-13 18:00:57 +00:00
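The two ideas in the commit above (a results file without trailing commas, and an average computed by iterating over the collected memory values) can be sketched as follows. The actual test is a shell script; this Python version is only an illustration, and the function name, field names, and units are hypothetical.

```python
import json
from statistics import mean
from typing import Iterable


def build_memory_results(memory_values_kb: Iterable[float]) -> str:
    """Serialize memory samples and their average as a valid JSON document."""
    values = list(memory_values_kb)
    if not values:
        raise ValueError("no memory samples were collected")
    results = {
        # Average computed over the whole array of samples.
        "average": {"Result": mean(values), "Units": "KB"},
        "samples": values,
    }
    # A JSON serializer separates elements itself, so no trailing comma
    # can appear after the last element of an array or object.
    return json.dumps(results)
```

Hand-assembled JSON that appends `,` after every element, including the last, is rejected by strict parsers, which is the failure mode the commit addresses.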
Beraldo Leal	5ef691528d	tests: fixes permission denied when running test After running cri-containerd/integration-tests twice we receive permission denied during containerd clean. Fixes: #8216 Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-10-12 19:23:40 +00:00
GabyCT	1974d13122	Merge pull request #8188 from dborquez/metrics_add_fio_readme.md metrics: removal of reference in the documentation to the fio dax subtest.	2023-10-12 10:53:55 -06:00
James O. D. Hunt	3e8cf6959c	runtime: Validate hypervisor section name in config file Previously, if you accidentally modified the name of the hypervisor section in the config file, the default golang runtime gives a cryptic error message ("`VM memory cannot be zero`"). This can be demonstrated using the `kata-runtime` utility program which uses the same golang config package as the actual runtime (`containerd-shim-kata-v2`): ```bash $ kata-runtime env >/dev/null; echo $? 0 $ sudo sed -i 's!^\[hypervisor\.qemu\]!\[hypervisor\.foo\]!g' /etc/kata-containers/configuration.toml $ kata-runtime env >/dev/null; echo $? VM memory cannot be zero 1 ``` The hypervisor name is now validated so that the behaviour becomes: ```bash $ kata-runtime env >/dev/null; echo $? 0 $ sudo sed -i 's!^\[hypervisor\.qemu\]!\[hypervisor\.foo\]!g' /etc/kata-containers/configuration.toml $ ./kata-runtime env >/dev/null; echo $? /etc/kata-containers/configuration.toml: configuration file contains invalid hypervisor section: "foo" 1 ``` Fixes: #8212. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-12 13:53:37 +01:00
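A minimal sketch of the kind of validation the commit above describes, assuming the configuration has already been parsed into a nested dictionary. It is in Python rather than the Go of the runtime's config package; `KNOWN_HYPERVISORS` is an illustrative subset rather than the runtime's real list, and the function name is hypothetical. The error text follows the message quoted in the commit.

```python
KNOWN_HYPERVISORS = frozenset({"qemu", "clh", "fc"})  # illustrative subset only


def validate_hypervisor_sections(config: dict, config_path: str) -> None:
    """Raise ValueError if the config contains an unknown [hypervisor.<name>] section."""
    for name in config.get("hypervisor", {}):
        if name not in KNOWN_HYPERVISORS:
            raise ValueError(
                f"{config_path}: configuration file contains invalid "
                f'hypervisor section: "{name}"'
            )
```

Rejecting the unknown name up front reports the actual mistake, instead of letting defaults for a missing section surface later as an unrelated error such as "VM memory cannot be zero".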
James O. D. Hunt	45d28998d9	Merge pull request #8149 from jodh-intel/runtime-rs-ch-detect-tdx-version runtime-rs: ch: Detect Intel TDX version	2023-10-12 10:09:42 +01:00
QuanweiZhou	f904e64155	Merge pull request #8179 from Apokleos/directvol-urlEncode runtime-rs: use the same base64 as kata-runtime/direct-volume does	2023-10-12 09:04:11 +08:00
GabyCT	bc6eadf4f6	Merge pull request #8197 from GabyCT/topic/enablescability tests: Enable scalability test for stability CI	2023-10-11 16:41:46 -06:00
Archana Shinde	f814b1a0a2	Merge pull request #8073 from amshinde/runtime-rs-vfio-clh runtime-rs: Add support for adding vfio device for cloud-hypervisor	2023-10-11 15:01:55 -07:00
Gabriela Cervantes	ef6388e815	tests: Remove unused function from scalability test This PR removes an unused function from the scalability test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-11 19:44:21 +00:00
Fabiano Fidêncio	fbc8f8f466	scripts: Use install_yq from the `kata-containers` repo As the file is already part of the kata-containers repo, and the tests repo is about to become read-only, we're good to drop the tests references from here and use everything coming from the `kata-containers` repo instead. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-11 12:52:55 +02:00
Fabiano Fidêncio	65b1a2d277	release: tag_repos: Stop tagging / updating the `tests` repo As we've moved all the tests to the `kata-containers` repo, the `tests` repo will become a read-only repo. Fixes: #8200 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-11 11:45:27 +02:00
James O. D. Hunt	87b760f569	runtime-rs: ch: Detect Intel TDX version Improve the `GuestProtection` handling to detect the version of Intel TDX available. The TDX version is now logged by the Cloud Hypervisor driver. Fixes: #8147. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-11 09:38:00 +01:00
alex.lyn	73e81f5e39	runtime-rs: unify base64 encoding for direct-volume Direct-volume needs to use the same base64 character set as kata-runtime/direct-volume does. Fixes: #8175 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-10-11 14:00:13 +08:00
Gabriela Cervantes	c6463cb5ae	tests: Fix path for versions yaml for soak parallel test This PR fixes the path for versions yaml for soak parallel test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-10 22:29:20 +00:00
David Esparza	89c9454fca	metrics: removal of reference in the documentation to the dax test. This PR removes the reference in the documentation to the DAX subtest of the FIO benchmark, because this metric is currently WIP. Fixes: #8159 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-10 15:55:59 -06:00
Gabriela Cervantes	30ff58904e	tests: Enable scalability test for stability CI This PR enables the scalability test for stability CI gha. Fixes #8196 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-10 19:59:57 +00:00
GabyCT	538131ab44	Merge pull request #8154 from GabyCT/topic/addstability tests: Enable soak parallel stability test	2023-10-10 13:53:14 -06:00
Archana Shinde	8d6f7b9096	runtime-rs: Add support for handling vfio device for cloud-hypervisor This change adds support for adding and removing vfio devices for cloud-hypervisor. Fixes: #6691 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-10-10 12:25:44 -07:00
Gabriela Cervantes	e786b2b019	gha: Add install dependencies for stability tests This PR adds the install dependencies for stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-10 16:05:48 +00:00
Chao Wu	936553ae79	Merge pull request #7505 from lisongqian/feat/dragonball_metrics dragonball: vcpu metrics change to be recorded per vcpu	2023-10-10 10:52:40 -05:00
Wainer Moschetta	d311c3dd04	Merge pull request #7621 from wainersm/gha-run-local ci: k8s: adapt gha-run.sh to run locally	2023-10-10 11:19:19 -03:00
David Esparza	93fef543e0	Merge pull request #8127 from dborquez/fix_iperf_check_kata_processes_issue metrics: removes kata components and k8s deployment when test finishes	2023-10-10 07:05:24 -06:00
lisongqian	dbfe6512fc	dragonball: vcpu metrics change to be recorded per vcpu In this commit, the vcpu metrics in Dragonball will be changed to record per-vcpu. Fixes: #7248 Signed-off-by: lisongqian <mail@lisongqian.cn>	2023-10-10 16:22:40 +08:00
lisongqian	fa60fbe023	dragonball: METRICS is refactored to RwLock<DragonballMetrics> In this commit, the METRICS is refactored to RwLock<DragonballMetrics>. Fixes: #7248 Signed-off-by: lisongqian <mail@lisongqian.cn>	2023-10-10 16:22:40 +08:00
Peng Tao	500d1c5cee	kata-ctl: update rustls-webpki/webpki dependency The old ones have security issues. ref: https://github.com/briansmith/webpki/issues/69 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-10 03:56:45 +00:00
Peng Tao	d7660d82a0	runtime: unify gopkg.in/yaml.v3 to v3.0.1 The older versions have Denial of Service issues. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-10 03:56:45 +00:00
Peng Tao	fc9a107e8e	runtime: unify swag and testify dependency So that we don't need to depend on that many versions of them. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-10 03:56:45 +00:00
Peng Tao	79ebb959c5	runtime: update runc dependency to v1.1.9 To pick up security fixes. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-10 03:56:45 +00:00
Peng Tao	7f3e8bd65e	runtime: unify golang.org/x/text to v0.7.0 The older versions contain security issues. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-10 03:56:45 +00:00
Peng Tao	df325ae371	runtime: update golang.org/x/net to v0.7.0 To pick up fix for the following issue: A maliciously crafted HTTP/2 stream could cause excessive CPU consumption in the HPACK decoder, sufficient to cause a denial of service from a small number of small requests. Fixes: #8190 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-10 03:56:39 +00:00
David Esparza	bba34910df	metrics: stops kata components and k8s deployment when test finishes This PR adds a trap so that, whenever the script exits, it deletes the iperf k8s deployment and k8s services, and deletes the kata components. This way, when the script finishes, it verifies that there are indeed no kata components still running. Fixes: #8126 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-09 13:41:43 -06:00
Gabriela Cervantes	84e3d884e4	gha: Add general dependencies to stability tests This PR adds the general dependencies to stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-09 17:02:49 +00:00
Gabriela Cervantes	dec3951ca5	tests: Add soak parallel stability test This PR adds the soak parallel stability test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-09 17:02:49 +00:00
Gabriela Cervantes	0f04d527d9	tests: Enable soak parallel test This PR enables the soak parallel test for stability test. Fixes #8153 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-09 17:02:49 +00:00
Wainer dos Santos Moschetta	e669282c25	ci: k8s: set KUBERNETES default value The KUBERNETES variable is mostly used by kata-deploy to decide whether to apply k3s-specific deployments or not. It is used to select the type of kubernetes to be installed (k3s, k0s, rancher...etc) and it is always set on CI. When running the script locally, we want to set a value by default to avoid `KUBERNETES: unbound variable` errors. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta	c30c3ff185	tests: run k8s-volume on a given node This test can give false positives on a multi-node cluster. Changed it to use the new get_one_kata_node() and the modified exec_host() to run the setup commands on a given node (that has kata installed) and ensure the test pod is scheduled on that same node. Fixes #7619 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta	666993da8d	tests: run k8s-file-volume on a given node This test can give false positives on a multi-node cluster. Changed it to use the new get_one_kata_node() and the modified exec_host() to run the setup commands on a given node (that has kata installed) and ensure the test pod is scheduled on that same node. Fixes #7619 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta	3a00fc9101	tests: exec_host() now gets the node name The exec_host() simply fails on multi-node clusters because `kubectl get node -o name` will return a list of names. Moreover, it will return the names of control nodes, which usually don't have kata installed. Fixes #7619 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	61c9c17bff	tests: add get_one_kata_node() to tests_common.sh The introduced get_one_kata_node() returns the first node that has the kata-runtime=true label, i.e., supposedly a node with kata installed. This is useful for tests that should run on a specific worker node on a multi-node cluster. Fixes #7619 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	68f083c4d0	ci: k8s: set KATA_HYPERVISOR default value Let KATA_HYPERVISOR be qemu by default in gh-run.sh as this variable is required to tweak some configurations of kata-deploy. Fixes #7620 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	6677a61fe4	ci: k8s: configurable deploy kata timeout The deploy-kata() of gha-run.sh will wait for 10 minutes for the kata deploy installation to finish. This allows users of the script to override that value by exporting the KATA_DEPLOY_WAIT_TIMEOUT environment variable. Fixes #7620 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	200e542921	ci: k8s: shellcheck fixes to gha-run.sh Fixed a couple of warnings shellcheck emitted and disabled others: * SC2154 (var is referenced but not assigned) * SC2086 (Double quote to prevent globbing and word splitting) Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	4af78be13a	kata-deploy: re-format kata-[deploy\|cleanup].yaml The .tests/integration/kubernetes/gh-run.sh script runs `yq write` a couple of times to edit the kata-[deploy\|cleanup].yaml, resulting in the file being formatted again. This is annoying because it leaves the git tree dirty. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	d54e6d9cda	ci: k8s: run_tests() for kcli The only difference to the other platforms is that it needs to export KUBECONFIG. Fixes #7620 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	c2ef1f0fb0	ci: k8s: add deploy-kata-kcli() to gh-run.sh The deploy-kata-kcli() behaves like other deploy kata for bare-metal (e.g. sev, tdx...etc) except that KUBECONFIG should be exported. Fixes #7620 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	d2be8eef1a	ci: k8s: add cleanup-kcli() to gha-run.sh The cleanup-kcli() behaves like other clean up for bare-metal (e.g. sev, tdx...etc) except that KUBECONFIG should be exported. Fixes #7620 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	cbb9aa15b6	ci: k8s: set default image for deploy_kata() On CI workflows the variables DOCKER_REGISTRY, DOCKER_REPO and DOCKER_TAG are exported to match the built image. However, when running the script outside of CI context, a developer might just use the latest image which in this case will be `quay.io/kata-containers/kata-deploy-ci:kata-containers-latest`. Fixes #7620 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	89bef7d036	ci: k8s: create k8s clusters with kcli Adapted the gha-run.sh script to create a Kubernetes cluster locally using the kcli tool. Use `./gha-run.sh create-cluster-kcli` to create it, and `./gha-run.sh delete-cluster-kcli` to delete. Fixes #7620 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Fabiano Fidêncio	1280f85343	Merge pull request #8171 from bergwolf/github/fix-up-gha GHA: fix up referenced yaml exceeding 20 limit problem	2023-10-09 09:37:03 +02:00
Peng Tao	954d40cce5	gha: combine coco jobs into a single yaml So that we don't risk exceeding the GHA limit of 20 referenced yaml files that easily. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-08 14:22:01 +00:00
Peng Tao	b60e0a9b57	gha: combine basic amd64 jobs into a single yaml GHA has an undocumented limitation that there can be at most 20 referenced yamls in a single yaml file. We work around it by combining multiple jobs into a single yaml file. Fixes: #8161 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-08 13:55:01 +00:00
Fabiano Fidêncio	108db0a721	Merge pull request #8162 from sprt/sprt/unbreak-ci gha: ci: Revert tracing test PR to unbreak CI	2023-10-08 10:13:46 +02:00
Aurélien Bombo	e9bd852113	gha: ci: Revert tracing test PR to unbreak CI Revert "Merge pull request #8115 from fidencio/topic/ci-add-tracing-tests" This unbreaks CI as seen in https://github.com/kata-containers/kata-containers/actions/runs/6434757133 Fixes: #8161 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-10-06 14:13:17 -07:00
James O. D. Hunt	16fe81f27c	Merge pull request #8124 from jodh-intel/ch-enable-feature runtime-rs: ch: Enable feature	2023-10-06 13:02:08 +01:00
Fabiano Fidêncio	fa6786d1d7	Merge pull request #8117 from fidencio/topic/ci-add-runk-tests gha: ci: Port runk tests over	2023-10-06 11:19:55 +02:00
Fabiano Fidêncio	8fec654716	Merge pull request #8115 from fidencio/topic/ci-add-tracing-tests ci: gha: Port tracing tests over	2023-10-06 10:06:57 +02:00
GabyCT	265f53e594	Merge pull request #8082 from dborquez/enable_fio_on_ctr Enable fio test using containerd client	2023-10-05 17:26:22 -06:00
GabyCT	c8b9ec1cb5	Merge pull request #8108 from GabyCT/topic/ghastability gha: Add stability tests workflow for gha	2023-10-05 17:10:10 -06:00
James O. D. Hunt	b8a46a4b85	runtime-rs: ch: Enable feature Enable the Cloud Hypervisor driver (the `cloud-hypervisor` build feature) for the rust runtime. Fixes: #6264. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-05 17:58:39 +01:00
Gabriela Cervantes	0f2dc8c675	gha: Add containerd stability tests to ci yaml This PR adds containerd stability tests to ci yaml. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-05 15:21:24 +00:00
Fabiano Fidêncio	89f73e658d	Merge pull request #8110 from fidencio/topic/gha-be-more-specific-about-the-arm-runners gha: arm64: Ensure the builder is arm64-builder	2023-10-04 21:20:08 +02:00
Fabiano Fidêncio	da91c9df88	ci: Port runk tests to this repo I'm basically moving the runk tests from the tests repo to this one, and I'm adding the "Signed-off-by:" of every single contributor to the tests. Fixes: #8116 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Chen Yiyang <cyyzero@qq.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-04 20:41:29 +02:00
Fabiano Fidêncio	7f23772763	ci: Add placeholder for runk tests The runk test has been executed as part of the former "ubuntu" jenkins CI. We're porting it to GHA and running it against LTS containerd. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-04 20:40:32 +02:00
Fabiano Fidêncio	9205acc3d2	ci: Move tracing tests here I'm basically moving the tracing tests from the tests repo to this one, and I'm adding the "Signed-off-by:" of every single contributor to the tests. Fixes: #8114 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com> Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com> Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>	2023-10-04 20:02:27 +02:00
Gabriela Cervantes	85d290a048	gha: Add stability gha run script This PR adds the stability gha run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-04 17:45:45 +00:00
Gabriela Cervantes	54f0c8f88e	gha: Add stability tests workflow for gha This PR adds the stability test workflow for gha for the kata CI. Fixes #8107 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-04 16:32:13 +00:00
Fabiano Fidêncio	3bb2923e5d	ci: Add placeholder for tracing tests The tracing tests are currently running as part of the Jenkins CI with the following setups: * Container Engines: containerd * VMMs: QEMU \| Cloud Hypervisor * Snapshotters: overlayfs \| devmapper We'll be restricting those tests to run on the LTS version of containerd, without devmapper. As it's known due to our GHA limitation, this is just a placeholder and the tests will actually be added in the next iterations. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-04 18:02:02 +02:00
Fabiano Fidêncio	2c3bf406dc	ci: Create a function to install docker This will be re-used in other tests as well. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-04 15:01:51 +02:00
Fabiano Fidêncio	c2cce12de5	Merge pull request #8100 from fidencio/topic/kata-deploy-build-agent kata-deploy: Build kata-agent as we build all the other components	2023-10-04 11:56:03 +02:00
Steve Horsman	c430cc3707	Merge pull request #8098 from stevenhorsman/k8s-registry-suite versions: migrate out of k8s.gcr.io	2023-10-04 10:51:39 +01:00
Fabiano Fidêncio	119f03de26	gha: arm64: Ensure the builder is arm64-builder Otherwise we'll use any arm64 machine that's added as a runner, and whenever new machines are added those may end up being only used for running some specific set of the tests. Fixes: #8109 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-04 11:08:11 +02:00
Fabiano Fidêncio	59b9380d1c	Merge pull request #8093 from stevenhorsman/crictl-pod-config-update doc: Update crictl pod-config	2023-10-04 10:49:04 +02:00
David Esparza	8c498ef5ee	metrics: Use jq tool to pretty-print json metrics output This PR enables the use of jq pretty-print feature to improve the formatting of metric results json files. Fixes: #8081 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-03 23:33:19 -06:00
David Esparza	a2159a6361	metrics: Enables FIO test for kata containers FIO benchmark is enabled to measure IO in Kata at different latencies using containerd client, in order to complement the CI metrics testing set. This PR also deprecates the previous Fio bench based on k8s. Fixes: #8080 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-03 23:32:38 -06:00
Fabiano Fidêncio	f337315952	Merge pull request #8106 from fidencio/topic/gha-fix-k0s-related-cis gha: Fix k0s deployment	2023-10-03 21:47:40 +02:00
GabyCT	d1d9af5de2	Merge pull request #8085 from GabyCT/topic/stabilitytests tests: Add stability test for kata CI	2023-10-03 11:28:49 -06:00
Fabiano Fidêncio	70e7ec3e23	gha: Fix k0s deployment The tests are failing when setting up k0s, and that happens because we download a kubectl binary matching the kubernetes version k0s is using, and we do that by: ``` sudo k0s kubectl version --short 2>/dev/null \| ... ``` With kubectl 1.28, which is now the default on k0s, `kubectl version --short` has been removed, leading us to an empty string, which then causes the error in the CI. Fixes: #8105 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 17:21:40 +02:00
Fabiano Fidêncio	560bbffb57	packaging: tools: Remove `set -x` leftover This was used for debugging, and ended up being merged with that. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 15:33:55 +02:00
Fabiano Fidêncio	18fa483d90	packaging: release: Mention newly added images We've added two new containerd builder images recently, one for the components under `src/tools` and another one for the Kata Containers agent. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 15:33:55 +02:00
Fabiano Fidêncio	ca3b888371	packaging: tools: Fix container image env var name This should be TOOLS_CONTAINER_BUILDER instead of VIRTIOFSD_CONTAINER_BUILDER. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 15:33:55 +02:00
Fabiano Fidêncio	5ca66795c7	packaging: Allow passing the TOOLS_CONTAINER_BUILDER This follows what we've been doing for all the components we're building, but was missed as part of #8077. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 15:33:55 +02:00
Fabiano Fidêncio	02acef9575	gha: Build the kata-agent as part of our workflows The kata-agent binary won't be released, just built so it can be used, later on, as part of our tests and as part of the rootfs build. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 15:33:55 +02:00
Fabiano Fidêncio	5208386ab1	packaging: Build the kata-agent Let's add the needed functions to start building the kata-agent, with or without the OPA support. For now this build is not used as part of the rootfs build, but later on it will be (not as part of this series, though). Fixes: #8099 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 15:33:55 +02:00
Fabiano Fidêncio	1727487eef	agent: Allow specifying DESTDIR and AGENT_POLICY via env vars This will help to build the agent binary as part of the kata-deploy localbuild, as we need to pass the DESTDIR to where the agent will be installed, and also whether we're building the agent with policy support enabled or not. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 14:18:45 +02:00
Fabiano Fidêncio	45c1188839	packaging: Add get_agent_image_name() This will be used for building the kata-agent. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 14:17:38 +02:00
Wainer dos Santos Moschetta	0db8fb8f98	versions: migrate out of k8s.gcr.io The k8s.gcr.io registry has been deprecated for a while now and has been redirected to registry.k8s.io. However, on some bare-metal machines in our testing pools that redirection is not working, so let's just replace the registries. Fixes #8098 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com> (cherry picked from commit b2c3bca558c38deff2117d5909d9071c23c05590)	2023-10-03 11:52:59 +01:00
stevenhorsman	a1a0543671	doc: Fix spelling Spell check failed with: ``` [kata-spell-check.sh:275] WARNING: Word 'overcommitment': did you mean one of the following?: over commitment, over-commitment, commitment ``` So update this to pass the static checks Fixes: # Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-10-03 10:17:38 +01:00
Gabriela Cervantes	6339605a14	tests: Add general stability fixes This PR adds general stability fixes. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-02 19:42:46 +00:00
stevenhorsman	59ae244442	doc: Update crictl pod-config - Ensure that our documented crictl pod config file contents have uid and namespace fields for compatibility with crictl 1.24+ This avoids a user potentially hitting the error: ``` getting sandbox status of pod "d3af2db414ce8": metadata.Name, metadata.Namespace or metadata.Uid is not in metadata "&PodSandboxMetadata{Name:nydus-sandbox,Uid:,Namespace:default,Attempt:1,}" getting sandbox status of pod "-A": rpc error: code = NotFound desc = an error occurred when try to find sandbox: not found ``` Fixes: #8092 Signed-off-by: stevenhorsman <steven@uk.ibm.com> (cherry picked from commit `8f8c2215`)	2023-10-02 14:53:46 +01:00
Gabriela Cervantes	fd19f4082f	tests: Add agent stability test This PR adds the agent stability test to stability test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-28 22:37:02 +00:00
Gabriela Cervantes	215577032f	tests: Add cassandra stress in stability tests This PR adds the cassandra stress at the stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-28 22:34:45 +00:00
GabyCT	a890ad3a16	Merge pull request #8066 from GabyCT/topic/urlvra docs: Update url in kata vra document	2023-09-28 14:59:34 -06:00
Zvonko Kaiser	79e33c211c	Merge pull request #7325 from zvonkok/vfio-sandbox-id-debug gpu: Adding CDI support for cold and hot-plug of VFIO devices	2023-09-28 21:31:12 +02:00
Gabriela Cervantes	f2d3ea988d	tests: Add stressng dockerfile for stability tests This PR adds the stressng dockerfile for stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-28 16:35:22 +00:00
Gabriela Cervantes	6493aa309e	tests: Add stressor CPU test for stability tests This PR adds the stressor CPU test for stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-28 16:33:08 +00:00
Gabriela Cervantes	ef68a3a36b	metrics: Add stability test for kata CI This PR adds the stability test for kata containers repository. Fixes #8084 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-28 16:23:36 +00:00
David Esparza	f7ef45b167	Merge pull request #8077 from fidencio/topic/kata-deploy-ship-the-tools kata-deploy: build & ship the rust components from src/tools/	2023-09-28 09:59:19 -06:00
Zvonko Kaiser	7c934dc7da	gpu: Fix cold-plug of VFIO devices We need to do proper sandbox sizing when we're doing cold-plug. Introduce CDI, the de-facto standard for enabling devices in containers. containerd will pass through annotations for accumulated CPU, memory and now CDI devices. With that information sandbox sizing can be derived correctly. Fixes: #7331 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-09-28 09:49:13 +00:00
GabyCT	fcc755fc3b	Merge pull request #8068 from GabyCT/topic/limitlatency metrics: Add latency value limits for kata CI	2023-09-27 13:28:41 -06:00
Greg Kurz	defbb64ac8	Merge pull request #8036 from rye-stripe/bugfix/overhead-metrics runtime: fix reading cgroup stats of sandboxes	2023-09-27 19:39:55 +02:00
Archana Shinde	95455e6fe8	Merge pull request #8058 from likebreath/0925/clh_v35.0 Upgrade to Cloud Hypervisor v35.0	2023-09-27 10:39:32 -07:00
Gabriela Cervantes	8d66ef5185	metrics: Increase qemu jitter value This PR increases qemu jitter value. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-27 17:31:07 +00:00
Gabriela Cervantes	5600e28b54	metrics: Increase jitter value for clh This PR increases jitter value for clh. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-27 17:30:19 +00:00
Fabiano Fidêncio	a6b1f5e21b	ci: Build src/tools components as part of our tests / releases Build those as part of our CI and release workflows. Fixes #5520 #5348 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 18:50:25 +02:00
Fabiano Fidêncio	501a168a81	kata-deploy: Build components from src/tools Let's add targets and actually enable users and ourselves to build those components in the same way we build the rest of the project. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 18:49:02 +02:00
Fabiano Fidêncio	6ef42db5ec	static-build: Add scripts to build content from src/tools As we'd like to ship the content from src/tools, we need to build them in the very same way we build the other components, and the first step is providing scripts that can build those inside a container. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 18:48:56 +02:00
Fabiano Fidêncio	4d08ec29bc	packaging: Add get_tools_image_name() This will be used for building all the (rust) components from src/tools. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 18:48:35 +02:00
Fabiano Fidêncio	98097c96de	packaging: Use git abbreviated hash This will make it easier to build images that rely on the hashes of several directories. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 18:48:30 +02:00
Fabiano Fidêncio	8b25e90027	Merge pull request #8075 from fidencio/topic/ci-add-kata-monitor-tests ci: Port kata-monitor tests from Jenkins to GHA	2023-09-27 15:48:46 +02:00
Fabiano Fidêncio	489caf1ad0	ci: kata-monitor: Move tests over Let's move, adapt, and use the kata-monitor tests from the tests repo. In this PR I'm keeping the SoB of every single contributor who touched those tests in the past. Fixes: #8074 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-27 11:40:31 +02:00
Fabiano Fidêncio	a3fb067f1b	ci: Add placeholder for kata-monitor tests The kata-monitor tests are currently running as part of the Jenkins CI with the following setups: * Container Engines: CRI-O \| containerd * VMMs: QEMU When using containerd, we're testing it with: * Snapshotter: overlayfs \| devmapper We will stop running those tests on devmapper / overlayfs as that would hardly reveal a functionality issue. Also, we're restricting this to run with the LTS version of containerd, when containerd is used. As it's known due to our GHA limitation, this is just a placeholder and the tests will actually be added in the next iterations. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 11:31:17 +02:00
Fabiano Fidêncio	57cb4ce204	ci: Make install_kata aware of container engines This will help us when running tests using CRI-O. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 11:31:17 +02:00
Fabiano Fidêncio	de1eeee334	ci: Create a generic install_crio function This will serve us quite well in the upcoming test additions, which will also have to be executed using CRI-O. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 11:26:13 +02:00
Fabiano Fidêncio	64a2000859	ci: Add install_cni_plugins helper This will come in handy when doing tests with CRI-O, as CRI-O doesn't install the CNI plugins for us. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 11:26:13 +02:00
Fabiano Fidêncio	8132fe15c9	ci: Modify containerd default config Let's ensure we have runc running with `SystemdCgroups = false`, otherwise we'll face failures when running tests depending on runc on Ubuntu 22.04, with LTS containerd. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 11:16:12 +02:00
Chelsea Mafrica	a49bc68374	runtime-rs: Update status for pause and resume Pause and resume task do not currently update the status of the container to paused or running, so fix this. This is specifically for pausing the task and not the VM. Fixes #6434 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-09-26 17:22:47 -07:00
Gabriela Cervantes	8cb7df1bed	metrics: Add checkmetrics for latency test This PR adds the checkmetrics for latency test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-26 19:11:08 +00:00
Gabriela Cervantes	e90440ae24	metrics: Add qemu latency value limit This PR adds the qemu latency value limit for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-26 17:30:09 +00:00
Gabriela Cervantes	a74a8f8a9d	metrics: Add latency value limits for kata CI This PR adds latency value limits for kata CI. Fixes #8067 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-26 17:29:07 +00:00
Gabriela Cervantes	d7def8317a	metrics: Fix general check static warnings This PR fixes general check static warnings. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-26 16:30:59 +00:00
GabyCT	309103169d	Merge pull request #8056 from GabyCT/topic/fixlatencypath metrics: Fix latency yamls path	2023-09-26 10:16:55 -06:00
Gabriela Cervantes	928553d1ba	docs: Update url in kata vra document This PR updates the url in kata vra document. Fixes #8065 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-26 16:13:12 +00:00
GabyCT	5c0afaacf4	Merge pull request #8018 from GabyCT/topic/fixreadme metrics: Fix metrics README	2023-09-26 09:51:47 -06:00
David Esparza	83326f89b3	Merge pull request #8054 from GabyCT/topic/fixcrdoc metrics: Fix C-Ray documentation	2023-09-26 09:50:19 -06:00
James O. D. Hunt	31478b9c33	Merge pull request #7944 from jodh-intel/runtime-rs-ch-enable-tdx runtime-rs: ch: Enable Intel TDX	2023-09-26 14:11:12 +01:00
James O. D. Hunt	b0a3293d53	runtime-rs: ch: Enable Intel TDX Allow Cloud Hypervisor to create a confidential guest (a TD or "Trust Domain") rather than a VM (Virtual Machine) on Intel systems that provide TDX functionality. > Notes: > > - At least currently, when built with the `tdx` feature, Cloud Hypervisor > cannot create a standard VM on a TDX capable system: it can only create > a TD. This implies that on TDX capable systems, the Kata Configuration > option `confidential_guest=` must be set to `true`. If it is not, Kata > will detect this and display the following error: > > ``` > TDX guest protection available and must be used with Cloud Hypervisor (set 'confidential_guest=true') > ``` > > - This change expands the scope of the protection code, changing > Intel TDX specific booleans to more generic "available guest protection" > code that could be "none" or "TDX", or some other form of guest > protection. Fixes: #6448. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-26 10:55:25 +01:00
James O. D. Hunt	523399c329	runtime-rs: ch: Add more consts Introduce a few new constants (for PCI segment count and FS queues) and move the disk queue constants to `convert.rs` to allow them to be used there too. > Note: > > This change gives the `ShareFs` code its own set of values rather > than relying on the disk queue constants. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-26 08:41:32 +01:00
James O. D. Hunt	dea8065811	runtime-rs: ch: Remove unused function Delete the `handle_pending_devices_after_boot()` function which is no longer required. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-26 08:41:32 +01:00
James O. D. Hunt	995f2c015f	runtime-rs: ch: Only handle particular pending device types Modify the Cloud Hypervisor `add_device()` method to add `ShareFs` and `Network` devices to the list of pending devices since only these two device types need to be cached before VM startup. Full details in the comments. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-26 08:41:32 +01:00
James O. D. Hunt	b1b96a5c49	runtime-rs: ch: Remove erroneous "virtio-blk-mmio" check Remove the `VIRTIO_BLK_MMIO` check which appears to have been added erroneously in the first place. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-26 08:41:32 +01:00
Gabriela Cervantes	9ac29b8d38	metrics: Add init_env function to latency test This PR adds the init_env function to latency test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-25 22:06:00 +00:00
Bo Chen	dfd0c9fa9a	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v35.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Fixes: #8057 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-09-25 12:22:37 -07:00
Bo Chen	8f9f087e35	versions: Upgrade to Cloud Hypervisor v35.0 Details of this release can be found in our roadmap project as iteration v35.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #8057 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-09-25 12:22:01 -07:00
Fabiano Fidêncio	a4daa86535	Merge pull request #8028 from fidencio/topic/ci-test-with-crio-part-2 ci: k8s: crio: Follow up patches to have CRI-O also working as part of our CI	2023-09-25 18:40:42 +02:00
Gabriela Cervantes	81c8babca9	metrics: Fix latency yamls path This PR fixes the latency yamls path for the latency test for kata metrics. Fixes #8055 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-25 15:52:24 +00:00
Gabriela Cervantes	4815736820	metrics: Fix C-Ray documentation This PR fixes the C-Ray documentation for kata metrics. Fixes #8052 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-25 15:27:58 +00:00
Fabiano Fidêncio	ef63d67c41	ci: crio: Trail '\r' from exec_host() output We've faced this as part of the CI, only happening with the CRI-O tests: ``` not ok 1 Test readonly volume for pods # (from function `exec_host' in file tests_common.sh, line 51, # in test file k8s-file-volume.bats, line 25) # `exec_host "echo "$file_body" > $tmp_file"' failed with status 127 # [bats-exec-test:38] INFO: k8s configured to use runtimeclass # bash: line 1: $'\r': command not found # # Error from server (NotFound): pods "test-file-volume" not found ``` I must say I didn't dig into figuring out why this is happening, but we may be safe enough to just trail the '\r', as long as all the tests keep passing on containerd. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-25 16:42:18 +02:00
Fabiano Fidêncio	74c12b2927	ci: crio: Enable default capabilities We need the default capabilities to be enabled, especially `SYS_CHROOT`, in order to have tests accessing the host to pass. A huge thanks to Greg Kurz for spotting this and suggesting the fix. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2023-09-25 14:56:15 +02:00
Fabiano Fidêncio	358dc2f569	kata-deploy: Fix CRI-O detection Some of the "k8s distros" allow using CRI-O in a non-official way, and if that's done we cannot simply assume they're on containerd, otherwise kata-deploy will simply not work. In order to avoid such an issue, let's check for `cri-o` as the container engine first, and only proceed with the checks for the "k8s distros" after we rule out that CRI-O is being used. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-25 14:56:15 +02:00
Fabiano Fidêncio	ebaa4fa4c1	ci: crio: Pass `-y` to apt That was something overlooked during my tests. :-/ Fixes: #8005 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-25 14:56:15 +02:00
GabyCT	11cf0e2d28	Merge pull request #8038 from GabyCT/topic/latency metrics: Enable latency test in gha run script	2023-09-22 16:57:53 -06:00
GabyCT	3ef57b335e	Merge pull request #8045 from jepio/fix-docker-ownership local-build: Fix .docker ownership before build-payload	2023-09-22 14:43:38 -06:00
Archana Shinde	9bb9a3e7a4	Merge pull request #7966 from amshinde/runtime-rs-network-clh runtime-rs: Add network support for cloud-hypervisor	2023-09-22 13:08:09 -07:00
Gabriela Cervantes	97e73b2234	metrics: Fix spelling warnings This PR fixes general spelling warnings detected by the spelling check. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-22 15:50:51 +00:00
Gabriela Cervantes	36c8cd6f1f	metrics: Fix metrics README This PR fixes the network metrics section at the README by leaving the current tests that we have in our kata metrics. Fixes #8017 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-22 15:28:58 +00:00
Fabiano Fidêncio	c5a5a0c95e	Merge pull request #8012 from arronwy/strip osbuild: Reduce guest components binary size with strip	2023-09-22 15:45:38 +02:00
Fabiano Fidêncio	9d190f2390	Merge pull request #8042 from GabyCT/topic/pandoc gha: Add pandoc as a dependency for static checks	2023-09-22 15:31:18 +02:00
Jeremi Piotrowski	15425a2b80	local-build: Fix .docker ownership before build-payload The permissions on .docker/buildx/activity/default are regularly broken by us passing docker.sock + $HOME/.docker to a container running as root and then using buildx inside. Fixup ownership before executing docker commands. Fixes: #8027 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-22 13:44:53 +02:00
Jeremi Piotrowski	a5338e885e	Merge pull request #8030 from portersrc/8027-ci-rootfs-image-build-asset-is-failing-oras ci: rootfs-image build-asset is failing	2023-09-22 11:07:50 +02:00
Chao Wu	6f98fbafde	Merge pull request #6706 from guixiongwei/feat/thp feat(runtime-rs): introduce huge page mode to select VM RAM's backend	2023-09-22 15:27:06 +08:00
Gabriela Cervantes	13ca7d9f97	gha: Add pandoc as a dependency for static checks To avoid the failure of not finding pandoc command this PR adds that package as a dependency for static checks. Fixes #8041 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-21 20:14:41 +00:00
Jeremi Piotrowski	28dd5ae91e	Merge pull request #7799 from UiPath/clh-directio-support clh: Direct IO support for block devices	2023-09-21 19:16:08 +02:00
David Esparza	6de9f39895	Merge pull request #8020 from GabyCT/topic/fixhunspell gha: Install hunspell for static checks	2023-09-21 10:58:40 -06:00
Gabriela Cervantes	08bc8e4db4	metrics: Add latency benchmark for gha This PR adds the latency benchmark for gha for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-21 16:14:39 +00:00
Gabriela Cervantes	6776b55d7e	metrics: Enable latency test in gha run script This PR enables the latency test for gha run script for kata metrics. Fixes #8037 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-21 16:11:58 +00:00
Peteris Rudzusiks	94e2ccc2d5	runtime: fix reading cgroup stats of sandboxes The cgroup stats come from resourcecontrol package in the form of pointers to structs. The sandbox Stat() method incorrectly was expecting structs. This caused the cpu and memory stats to always be 0, which in turn caused incorrect pod overhead metrics. Fixes #8035 Signed-off-by: Peteris Rudzusiks <rye@stripe.com>	2023-09-21 17:00:53 +02:00
Alexandru Matei	d507d189bb	fc: Add support for noflush cache option Firecracker supports noflush semantic via Unsafe cache type. There is no support for direct i/o, remove it from config file Fixes: #7823 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2023-09-21 14:48:24 +03:00
Alexandru Matei	2ca781518a	clh: Direct IO support for block devices Clh supports direct i/o for disks. It doesn't offer any support for noflush, so remove the passing of that option to the cloud-hypervisor internal config. Fixes: #7798 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2023-09-21 14:48:24 +03:00
Fabiano Fidêncio	dd27912f31	Merge pull request #8032 from fidencio/topic/ci-make-push-after-build-be-trigger-by-workflow-dispatch ci: Trigger payload-after-push on workflow_dispatch	2023-09-21 10:25:24 +02:00
Fabiano Fidêncio	0c95697cc4	ci: Trigger payload-after-push on workflow_dispatch This will allow us to easily test failures and fixes on that workflow. Fixes: #8031 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-21 09:24:13 +02:00
Chris Porter	28cbc3b51c	ci: rootfs-image build-asset is failing Fixes: #8027 Signed-off-by: Chris Porter <porter@ibm.com>	2023-09-21 00:58:42 -05:00
Fabiano Fidêncio	21f6f9a173	Merge pull request #8016 from fidencio/topic/ci-test-with-crio-part-1 ci: Actually enable the CRI-O tests	2023-09-21 07:42:27 +02:00
Wainer Moschetta	87e64a07ed	Merge pull request #7979 from beraldoleal/gogo-removal protocol: remove gogoprotobuf tests	2023-09-20 22:38:10 -03:00
Gabriela Cervantes	87a8616488	gha: Install hunspell for static checks It seems the static checks are failing due to the missing hunspell package; this PR fixes that. Fixes #8019 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-20 16:58:10 +00:00
Fabiano Fidêncio	8c3c50ca8a	ci: Actually enable the CRI-O tests The test has been added to the repo, but we have to also add it to the list of jobs to be executed. Fixes: #8005 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-20 18:01:25 +02:00
David Esparza	03554c799a	Merge pull request #8006 from fidencio/topic/ci-test-with-crio-part-0 ci: k8s: Also run tests with CRI-O	2023-09-20 07:45:17 -06:00
Fabiano Fidêncio	c6a9e50c37	Merge pull request #8004 from microsoft/danmihai1/quoted-spaces runtime: support kernel params including spaces	2023-09-20 12:10:51 +02:00
Wang, Arron	3a6510ad61	osbuild: Reduce guest components binary size with strip opa_linux_amd64_static 38M => 27M kata-agent 30M => 23M ls -alh opa_linux_amd64_static -rw-rw-r-- 1 arron arron 38M Jul 28 01:59 opa_linux_amd64_static ➜ kata-containers git:(main) ✗ strip opa_linux_amd64_static ➜ kata-containers git:(main) ✗ ls -alh opa_linux_amd64_static -rw-rw-r-- 1 arron arron 27M Sep 20 16:12 opa_linux_amd64_static ls -alh ./usr/bin/kata-agent -rwxr-xr-x. 1 root root 30M Jul 30 23:41 ./usr/bin/kata-agent ls -alh ./usr/bin/kata-agent -rwxr-xr-x. 1 root root 23M Sep 20 16:13 ./usr/bin/kata-agent Fixes: #8011 Signed-off-by: Wang, Arron <arron.wang@intel.com>	2023-09-20 16:23:17 +08:00
Fabiano Fidêncio	07a6e63a6b	ci: k8s: rke2: Use sudo to call systemd Otherwise we'll face the following error: ``` Failed to enable unit: Interactive authentication required. ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-20 08:48:29 +02:00
Fabiano Fidêncio	03b82e8484	ci: k8s: Add a CRI-O test Let's make sure we'll also be testing k8s using CRI-O. For now, we'll only be running the CRI-O test with QEMU. Once it becomes stable we can expand this to other Hypervisors as well. Fixes: #8005 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-20 00:59:09 +02:00
Fabiano Fidêncio	d7105cf7a4	ci: k8s: Add a method to install CRI-O This is based on the official CRI-O documentation[0] and right now we're making this specific to Ubuntu as that's what we have as runners. We may want to expand this in the future, but we're good for now. [0]: https://github.com/cri-o/cri-o/blob/main/install.md#apt-based-operating-systems Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-20 00:59:09 +02:00
Fabiano Fidêncio	54c0a471b1	ci: k8s: k0s: Allow passing parameters to the k0s installer We'll need this in order to setup k0s with a different container engine. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-20 00:59:09 +02:00
Fabiano Fidêncio	31ef64606c	Merge pull request #8007 from fidencio/topic/ci-kata-deploy-fix-garm-runner-name ci: kata-deploy: Fix runner name	2023-09-20 00:58:33 +02:00
Beraldo Leal	730ef51693	deps: updating dependencies Updating dependencies after make check, make test. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-19 16:54:35 -04:00
GabyCT	6111ef6fb6	Merge pull request #7990 from GabyCT/topic/parallelbandwidth metrics: Enable parallel bandwidth iperf limit	2023-09-19 14:52:21 -06:00
Fabiano Fidêncio	3a2c83d69b	ci: kata-deploy: Fix runner name It should be garm-ubuntu-2004-smaller instead of garm-ubuntu-2004-small. Fixes: #7890 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 22:34:37 +02:00
Dan Mihai	82ff2db460	runtime: support kernel params including spaces Support quoted kernel command line parameters that include space characters. Example: dm-mod.create="dm-verity,,,ro,0 736328 verity 1 /dev/vda1 /dev/vda2 4096 4096 92041 0 sha256 f211b9f1921ef726d57a72bf82be23a510076639fa8549ade10f85e214e0ddb4 065c13dfb5b4e0af034685aa5442bddda47b17c182ee44ba55a373835d18a038" Fixes: #8003 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-09-19 20:26:38 +00:00
Beraldo Leal	604a9dd673	protocol: remove gogoprotobuf tests This is part of a bigger effort to drop gogoprotobuf from our code base. IIUC, those options are basically used by *pb_test.go, and since we are dropping gogoprotobuf and those are auto generated tests, let's just remove them. Fixes #7978. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-19 12:55:42 -04:00
Fabiano Fidêncio	5560e72024	Merge pull request #7896 from fidencio/topic/ground-work-for-testing-all-k8s-flavours-we-support ci: kata-deploy: Enable all k8s flavours that we support	2023-09-19 17:44:34 +02:00
Fabiano Fidêncio	f7fa7f602a	ci: Enable kata-deploy tests for all the supported k8s flavours Let's ensure we test kata-deploy on RKE2 and k0s as well. Fixes: #7890 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 13:38:10 +02:00
Fabiano Fidêncio	2c908b598c	ci: kata-deploy: Add the ability to deploy rke2 This will be very useful in the near future, when we start testing kata-deploy with rke2 as well. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 13:38:10 +02:00
Fabiano Fidêncio	eaf6164916	ci: kata-deploy: Add the ability to deploy k0s This will be very useful in the near future, when we start testing kata-deploy with k0s as well. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 13:38:10 +02:00
Fabiano Fidêncio	0015257636	ci: kata-deploy: Add deploy-k8s argument to gha-run.sh We'll be using exactly the same code used for the k8s tests, which are already deploying k3s on GARM. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 13:38:10 +02:00
Fabiano Fidêncio	bf2cb02283	ci: kata-deploy: Expand tests to run on k0s / rke2 We just need to make sure the correct overlay is applied, following what we already have been doing for k3s. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 13:38:10 +02:00
Fabiano Fidêncio	6d5d844e5c	Merge pull request #7983 from sprt/resource-group-naming ci: Create clusters in individual resource groups	2023-09-19 12:54:21 +02:00
Fabiano Fidêncio	b12b9e1886	ci: kata-deploy: Add placeholder for tests on GARM We'll be testing kata-deploy with different kubernetes flavours as part of our GARM tests, and this is a place-holder for this. Once enabled, we'll do nothing, just `return 0`, so we can then properly add the tests after this commit gets merged. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 12:42:02 +02:00
Fabiano Fidêncio	9e1fb8a966	ci: kata-deploy: Export KUBERNETES env var So we have better control over which flavour of kubernetes kata-deploy is expected to be targeting. This was also done as part of `fa62a4c01b`, for the k8s tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 12:37:56 +02:00
Fabiano Fidêncio	09cc0ed438	ci: Move deploy_k8s() to gha-run-k8s-common.sh This will allow us to re-use the function in the kata-deploy tests, which will come soon. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 12:37:56 +02:00
Fabiano Fidêncio	1829f5c049	Merge pull request #7992 from skaegi/virtiofsd-1.8.0 versions: Bump virtiofsd to v1.8.0	2023-09-19 11:52:49 +02:00
Fabiano Fidêncio	486fe14c99	ci: Properly set K8S_TEST_UNION Otherwise only the first test will be executed Signed-off-by: Aurélien Bombo <abombo@microsoft.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 10:23:58 +02:00
Aurélien Bombo	d9ef1352af	ci: Add first letter of the K8S_TEST_HOST_TYPE to resource group name Ideally we'd add the instance_type or the full K8S_TEST_HOST_TYPE but that exceeds the maximum number of characters allowed for the cluster name. With this in mind, let's use the first letter of K8S_TEST_HOST_TYPE instead. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-09-19 10:23:58 +02:00
Aurélien Bombo	68267a3996	ci: Create clusters in individual resource groups This makes it so that each AKS cluster is created in its own individual resource group, rather than using the "kataCI" resource group for all test clusters. This is to accommodate a tool that we recently introduced in our Azure subscription which automatically deletes resource groups after a set amount of time, in order to keep spending under control. The tool will automatically delete any resource group, unless it has a tag SkipAutoDeleteTill = YYYY-MM-DD. When this tag is present, the resource group will be retained until the specified date. Note that I tagged all current resource groups in our subscription with SkipAutoDeleteTill = 2043-01-01 so that we don't lose any existing resources. Fixes: #7982 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-09-19 10:23:55 +02:00
Fabiano Fidêncio	84c0d59d23	Merge pull request #7985 from fidencio/topic/clh-use-static_sandbox_resource_mgmt-as-default-on-arm clh: arm: Use static_sandbox_resource_mgmt=true	2023-09-19 09:25:34 +02:00
Gabriela Cervantes	9aa8d1c917	metrics: Add parallel bandwidth limit for qemu This PR adds the parallel bandwidth limit for qemu for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-18 21:08:54 +00:00
Simon Kaegi	44c7c082d9	versions: Bump virtiofsd to v1.8.0 https://gitlab.com/virtio-fs/virtiofsd/-/releases/v1.8.0 was released two weeks ago. We have fully tested and are using this version. Also bumps toolchain version to match what virtiofsd used. Fixes: #7960 Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>	2023-09-18 15:21:15 -04:00
Fabiano Fidêncio	5f8e210d3b	Merge pull request #7961 from ChengyuZhu6/update_nydus Bump nydus versions and update nydus tests	2023-09-18 21:02:20 +02:00
Fabiano Fidêncio	c3ee913bf6	Merge pull request #7953 from gkurz/extra-monitor-socket runtime/qemu: Rework QMP/HMP support	2023-09-18 19:04:14 +02:00
Gabriela Cervantes	af59d4bf4a	metrics: Enable parallel bandwidth iperf limit This PR enables the parallel bandwidth iperf limit for kata metrics. Fixes #7989 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-18 16:32:11 +00:00
Fabiano Fidêncio	aba36ab188	nydus: Temporarily skip tests on dragonball We're hitting a specific issue after updating, which will require some work on dragonball before it can be re-added here. The issue: ``` ... 3: failed to do rafs mount\\n 4: fail to attach rafs \\\"/var/lib/containerd-nydus/snapshots/2/fs/image/image.boot\\\"\\n 5: add share fs mount\\n 6: Mount rafs at /rafs/197ef3db03c86b91bf3045ff59183ce8b5750941ad1d3484f4a8301a70f5109f/rootfs_lower error: Failed to Mount backend ... Caused by: vmm action error: FsDevice(AttachBackendFailed(\\\"attach/detach a backend filesystem failed:: missing field `version` at line 1 column 489\\\"))\"): unknown" ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	b8a8dfcd15	nydus: Use `kata-${KATA_HYPERVISOR}` instead of `kata` This will ensure we're testing with the correct runtime, instead of using the `default` one. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
ChengyuZhu6	f6df3d6efb	static-build: Fix arch error on nydus build Fix the arch error when downloading the nydus tarball. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Signed-off-by: Steven Horsman <steven@uk.ibm.com>	2023-09-18 17:40:06 +02:00
ChengyuZhu6	2f9c9e2e63	tests: nydus: Update nydus tests To support the v0.12.0 nydus-snapshotter, we need to update the config files and the commandline to start nydus-snapshotter. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	c9a4e7e46d	versions: Bump nydus and nydus-snapshotter to its latest release As we need https://github.com/containerd/nydus-snapshotter/pull/530 in. Fixes #7984 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	b73bde320d	gha: nydus: Populate run() And with this we finally enable the nydus tests to run as part of our GHA CI. Fixes: #6543 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	b3904a1a30	gha: nydus: Populate install_dependencies() Let's have all the dependencies needed for running the nydus tests installed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	d2b3b67f5d	gha: nydus: Actually install kata when `install-kata` is called We've been simply doing nothing whenever `install-kata` was called, and that was the intent when we added the placeholder calls. Now, let's install kata, as expected. :-) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	0ec00ad42e	gha: nydus: Get rid of nydus{,-snapshotter} install from nydus_test.sh As we've added install_nydus() and install_nydus_snapshotter(), which do conform with the pattern we're following on GHA, let's rely on them rather than relying on the bits coming from nydus_test.sh. Later on we'll have install_nydus() and install_nydus_snapshotter() as part of the dependencies install in our `gha-run.sh`. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	568439c77b	tests: nydus: Add timeout to the crictl calls Similarly to what's been done for the cri-containerd tests, as part of `84dd02e0f9`, we need to add the timeout here for the crictl calls. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	5ac3b76eb1	tests: nydus: Add uid / namespace to the nydus container / sandbox Otherwise we may face errors like: ``` getting sandbox status of pod "d3af2db414ce8": metadata.Name, metadata.Namespace or metadata.Uid is not in metadata "&PodSandboxMetadata{Name:nydus-sandbox,Uid:,Namespace:default,Attempt:1,}" getting sandbox status of pod "-A": rpc error: code = NotFound desc = an error occurred when try to find sandbox: not found ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	376574a16c	tests: nydus: Decorate some calls with `sudo` Otherwise we cannot properly start the nydus snapshotter, nor properly kill it after it's been started. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	4290fd4b67	tests: nydus: Adapt "source ..." to GHA The "source ..." we've been doing was not changed since those tests were part of the Jenkins tests, and we need to adapt them, either setting the correct path or entirely removing the ones that are not relevant to us anymore. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	a84efa3e87	tests: nydus: Adapt check to "clh" instead of "cloud-hypervisor" As that's what we've been using as part of the GHA. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	56a14b3950	tests: common: Add install_nydus_snapshotter() This function will be used to download and install the nydus-snapshotter, and it follows the same pattern we already have introduced for downloading and installing other dependencies from GitHub. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	b6563783e2	tests: common: Add install_nydus() This function will be used to download and install nydus, and it follows the same pattern we already have introduced for downloading and installing other dependencies from GitHub. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	72599f1911	clh: arm: Use static_sandbox_resource_mgmt=true Users have noticed that this is needed, as CLH does not yet implement a way to hotplug resources on aarch64. With this patch, when building for x86_64, I can see that this is the resulting config: ``` $ ARCH=amd64 make ... $ cat config/configuration-clh.toml | grep static_sandbox_resource_mgmt static_sandbox_resource_mgmt=false ``` And when building for aarch64: ``` $ ARCH=arm64 make ... $ cat config/configuration-clh.toml | grep static_sandbox_resource_mgmt static_sandbox_resource_mgmt=true ``` Fixes: #7941 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 14:14:10 +02:00
Jeremi Piotrowski	dfa6af54df	Merge pull request #7806 from jongwu/clh_serial clh:arm64: use arm AMBA UART for hypervisor debug	2023-09-18 12:29:07 +02:00
Greg Kurz	1f16b6627b	runtime/qemu: Rework QMP/HMP support PR #6146 added the possibility to control QEMU with an extra HMP socket as an aid for debugging. This is great for development or bug chasing but this raises some concerns in production. The HMP monitor allows tampering with the VM state in a variety of ways. This could be intentionally or mistakenly used to inject subtle bugs in the VM that would be extremely hard if not even impossible to debug. We definitely don't want that to be enabled by default. The feature is currently wired to the `enable_debug` setting in the `[hypervisor.qemu]` section of the configuration file. This setting has historically been used to control "debug output" and it is used as such by some downstream users (e.g. Openshift). Forcing people to have the extra HMP backdoor at the same time is abusive and dangerous. A new `extra_monitor_socket` is added to `[hypervisor.qemu]` to give fine control on whether the HMP socket is wanted or not. This setting is still gated by `enable_debug = true` to make it clear it is for debug only. The default is to not have the HMP socket though. This isn't backward compatible with #6416 but it is for the sake of "better safe than sorry". An extra monitor socket makes the QEMU instance untrusted. A warning is thus logged to the journal when one is requested. While here, also allow the user to choose between HMP and QMP for the extra monitor socket. Motivation is that QMP offers way more options to control or introspect the VM than HMP does. Users can also ask for pretty json formatting well suited for human reading. This will improve the debugging experience. This feature is only made visible in the base and GPU configurations of QEMU for now. Fixes #7952 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-09-18 12:13:01 +02:00
Greg Kurz	cab46c9e23	Merge pull request #7973 from fidencio/topic/ci-use-bigger-machine-sizes-for-the-needed-tests-part-0 ci: Use variable size of VMs depending on the tests running	2023-09-18 12:06:44 +02:00
Fabiano Fidêncio	0e3bfac3b3	Merge pull request #7976 from fidencio/topic/ci-static-checks-rework-part-0 ci: Rework static checks	2023-09-18 11:01:18 +02:00
Peng Tao	6eedd9b0b9	Merge pull request #7738 from Xuanqing-Shi/7732/handle-non-empty-endpoints-in-RemoveEndpoints runtime: incorrect handling of non-empty []Endpoint parameter in Remo…	2023-09-18 10:58:28 +08:00
Fabiano Fidêncio	8b1e9b0c75	ci: static-checks: Clean up static-checks job Now that the static-checks job only takes care of running the static-checks, let's clean it up, remove all the unneeded steps, make sure that we're using the actions in their latest version, and have it running in a cost free runner. At some point I'd like to see those tests done in parallel, in the same way that I've organised the build-checks, but that's something for someone else, at some other time. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 14:23:02 +02:00
Fabiano Fidêncio	2c5ca2eaf8	ci: static-checks: Run tests depending on KVM With this we're removing the dragonball static-checks CI, as the test is running here now. :-) Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 14:22:38 +02:00
Fabiano Fidêncio	509c309ab2	ci: static-checks: Move "sudo make test" to the new test matrix We're moving it out of the previous "static-checks" confusing matrix, and adding it to the matrix that was currently being used for the `make vendor` and `make check` checks. This will allow us to have one job per component, and with that we can easily run those in parallel and on the zero cost runners. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:53:23 +02:00
Fabiano Fidêncio	4e963cedf4	ci: static-checks: Move "make test" to the new test matrix We're moving it out of the previous "static-checks" confusing matrix, and adding it to the matrix that was currently being used for the `make vendor` and `make check` checks. This will allow us to have one job per component, and with that we can easily run those in parallel and on the zero cost runners. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:53:17 +02:00
Fabiano Fidêncio	08f2e5ae0b	runtime-rs: Ensure static-checks-build is a dep of `make test` Otherwise `make test` will simply fail with: ``` error[E0583]: file not found for module `config` ``` Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:53:13 +02:00
Fabiano Fidêncio	2bc3a616ae	kata-ctl: Use `loop` instead of `kvm` module in tests This makes it possible to run the tests in the cost free runners, which are not KVM capable. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:53:08 +02:00
Fabiano Fidêncio	46daddc500	kata-ctl: Ensure GENERATED_CODE is a dep of `make test` Otherwise `make test` will simply fail with: ``` error[E0583]: file not found for module `version` ``` Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:53:01 +02:00
Fabiano Fidêncio	ec826f328f	agent: Ensure GENERATED_CODE is a dep of `make test` Otherwise `make test` will fail with: ``` error[E0583]: file not found for module `version` ``` Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:57 +02:00
Fabiano Fidêncio	1d32410a83	ci: install_libseccomp: Do not depend on the tests repo It makes things way simpler, waaaaay simpler. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:49 +02:00
Fabiano Fidêncio	bf888b9a5e	ci: static-checks: Move "make check" to the new test matrix We're moving it out of the previous "static-checks" confusing matrix, and adding it to the matrix that was currently being used for the `make vendor` checks. This will allow us to have one job per component, and with that we can easily run those in parallel and on the zero cost runners. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:45 +02:00
Fabiano Fidêncio	473ec87806	kata-ctl: Add `kata-types` to the Cargo.lock file Commit message covered everything. :-) Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:40 +02:00
Fabiano Fidêncio	ea19549a99	kata-ctl: Ensure GENERATED_CODE is a dep of `make check` Otherwise `make check` would fail with: ``` Error writing files: failed to resolve mod `version`: /home/runner/work/kata-containers/kata-containers/src/tools/kata-ctl/src/ops/version.rs does not exist make: *** [../../../utils.mk:176: standard_rust_check] Error 1 ``` Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:36 +02:00
Fabiano Fidêncio	e125775863	tests: install_rust: Also install clippy clippy is used as part of our tests, so it's useful to have it installed while we're already installing rust. In case of developers, they also better be using it. :-) Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:31 +02:00
Fabiano Fidêncio	e2c61a152c	ci: static-checks: Move vendor check to its own job Similarly to the static-check jobs, those jobs can be run on the zero cost runners. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:30 +02:00
Fabiano Fidêncio	6794d4c843	tests: Move install_rust.sh from the tests repo We'll use it as part of the refactoring we're doing in the static check tests. I can see a lot of other uses of this, but changing all of them to this one is out of the scope for this PR. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:29 +02:00
Fabiano Fidêncio	e64508c308	tests: install_go: Remove tests repo dependency We can rely on the functions that are now part of the common.bash. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:28 +02:00
Fabiano Fidêncio	11dff731b7	tests: Move functions from kata_arch script here We can use this a lot as part of our CI, but right now I'm just moving those here with the intent to use later on in this series. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:28 +02:00
Fabiano Fidêncio	75c974c802	ci: static-checks: Move kernel config check to its own job It doesn't make sense to run this for all the bits of the matrix, nor is it demanding enough to require running this in one of our Azure sponsored runners. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:25 +02:00
Archana Shinde	9c233bb9e0	test: Add test to verify try_from for clh Netconfig Add tests to verify conversion from runtime NetworkConfig to clh specific config. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-09-16 00:24:14 -07:00
Fabiano Fidêncio	c69a1e33bd	ci: Use variable size of VMs depending on the tests running Let me start with a fair warning that this commit is hard to split into different parts that could be easily tested (or not tested, just ignored) without breaking pieces. Now, about the commit itself, as we're on the run to reduce costs related to our sponsorship on Azure, we can split the k8s tests we run in 2 simple groups: * Tests that can be run in the smaller Azure instance (D2s_v5) * Tests that require the normal Azure instance (D4s_v5) With this in mind, we're now passing to the tests which type of host we're using, which allows us to select to run either one of the two types of tests, or even both in case of running the tests on a baremetal system. Fixes: #7972 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 09:13:54 +02:00
Archana Shinde	9049d311df	runtime-rs: Add network support for cloud-hypervisor This PR adds support for adding a network device before starting the cloud-hypervisor VM. Support for adding and removing network devices is not really added to the resource manager, so supporting this for cloud-hypervisor is not scoped in this PR. This also changes "pending_devices" for clh implementation from an Option of vector to simply a vector. This simplifies the structure a bit as we can simply iterate over the pending devices instead of having to check for a "Some" value as this is not really required. Fixes: #6333 Signed-off-by: Shuaiyi Zhang <zhang_syi@qq.com> Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-09-15 23:25:20 -07:00
Greg Kurz	79c494eb4e	Merge pull request #7969 from fidencio/topic/ci-cache-using-oras-part-3 ci: cache: Check the sha256sum of the components & fix ovmf-sev cache usage	2023-09-15 16:30:22 +02:00
Fabiano Fidêncio	eecd5bf2aa	ci: cache: Fix ovmf-sev cache The cached tarball is relying on the component name, thus it's important to set it correctly, otherwise we'll end up always building it. With this patch applied: ``` ≡ ⨯ make ovmf-sev-tarball make ovmf-sev-tarball-build make[1]: Entering directory '/home/ffidenci/src/upstream/kata-containers/kata-containers' /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build//kata-deploy-binaries-in-docker.sh --build=ovmf-sev sha256:67cc94e393dc1d5bfc2b77a77e83c9b1c0833d0fbbebaa9e9e36f938bb841fcc Build kata version 3.2.0-rc0: ovmf-sev INFO: DESTDIR /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/ovmf-sev/destdir Downloading a76f5522493f ovmf-sev-builder-image-version Downloading 7e98c854bd94 kata-static-ovmf-sev.tar.xz Downloading 559311973ff8 ovmf-sev-version Downloaded a76f5522493f ovmf-sev-builder-image-version Downloading 353b655c2297 ovmf-sev-sha256sum Downloaded 559311973ff8 ovmf-sev-version Downloaded 353b655c2297 ovmf-sev-sha256sum Downloaded 7e98c854bd94 kata-static-ovmf-sev.tar.xz Pulled [registry] ghcr.io/kata-containers/cached-artefacts/ovmf-sev:latest-main-x86_64 Digest: sha256:933236c2c79e53be3ca7acc0b966d0ddac9c0335edcb1e8cad8b9bb3aaf508ce kata-static-ovmf-sev.tar.xz: OK INFO: Using cached tarball of ovmf-sev drwxr-xr-x runner/runner 0 2023-09-15 10:34 ./ drwxr-xr-x runner/runner 0 2023-09-15 10:34 ./opt/ drwxr-xr-x runner/runner 0 2023-09-15 10:34 ./opt/kata/ drwxr-xr-x runner/runner 0 2023-09-15 10:34 ./opt/kata/share/ drwxr-xr-x runner/runner 0 2023-09-15 10:34 ./opt/kata/share/ovmf/ -rwxr-xr-x runner/runner 4194304 2023-09-15 10:34 ./opt/kata/share/ovmf/AMDSEV.fd ~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build ~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/ovmf-sev/builddir 
~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/ovmf-sev/builddir make[1]: Leaving directory '/home/ffidenci/src/upstream/kata-containers/kata-containers' ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 12:39:22 +02:00
Fabiano Fidêncio	86c41074b4	ci: cache: Check the sha256sum of the component We've removed this in the part 2 of this effort, as we were not caching the sha256sum of the component. Now that this part has been merged, let's get back to checking it. Fixes: #7834 -- part 3 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 12:34:30 +02:00
Fabiano Fidêncio	f5e52d02d3	Merge pull request #7964 from fidencio/topic/ci-cache-using-oras-part-2 ci: cache: Use the artefacts stored in ghcr.io/kata-containers/cached-artefacts/${component}	2023-09-15 12:29:28 +02:00
Fabiano Fidêncio	2fe0b494da	Merge pull request #7959 from fidencio/topic/ci-run-on-smaller-garm-instances ci: Run some of the GARM tests in smaller instances	2023-09-15 11:30:13 +02:00
Fabiano Fidêncio	460988c5f7	ci: cache: Remove the script used to cache artefacts on Jenkins That's not needed anymore, as we've switched to using ORAS and an OCI registry to cache the artefacts. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 10:27:55 +02:00
Fabiano Fidêncio	4533a7a416	ci: cache: Also store the ${component} sha256sum This is something that was done by our Jenkins jobs, but that I ended up missing when writing `d0c257b3a7`. Now, let's also add the sha256sum to the cached artefact, and in a coming up PR (after this one is merged) we will also start checking for that. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 10:25:26 +02:00
Fabiano Fidêncio	eccc76df63	ci: cache: Use the cached artefacts from ORAS In the previous series related to the artefacts we build, we've switched from storing the artefacts on Jenkins, to storing those in the ghcr.io/kata-containers/cached-artefacts/${artefact_name}. Now, let's take advantage of that and actually use the artefacts coming from that "package" (as GitHub calls it). NOTE: One thing that I've noticed that we're missing, is storing and checking the sha256sum of the artefact. The storing part will be done in a different commit, and checking the sha256sum will be done in a different PR, as we need to ensure those were pushed to the registry before actually taking the bullet to check for them. Fixes: #7834 -- part 2 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 10:13:47 +02:00
Jeremi Piotrowski	6f30d00ae7	Merge pull request #7956 from fidencio/topic/ci-reduce-the-machine-size-used ci: Reduce the size of the AKS VMs	2023-09-15 08:49:08 +02:00
Steve Horsman	1b8f3fa9ae	Merge pull request #7957 from fidencio/topic/ci-cache-using-oras-part-1 ci: cache: Allow pushing our artefacts to an OCI registry	2023-09-15 07:45:24 +01:00
Jianyong Wu	7f5e77bcb8	kernel: enable Arm pl011 support Enable pl011 (ttyAMA0) support in kernel for aarch64. Fixes: #5080 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-09-15 01:45:16 +00:00
Jianyong Wu	241c355e07	clh:arm64: use arm AMBA uart for hypervisor debug cloud hypervisor on arm64 only supports arm AMBA UART (pl011) as tty. So, the console should be set to "ttyAMA0" instead of "ttyS0" when enabling hypervisor debug mode. Fixes: #5080 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-09-15 01:44:23 +00:00
Fabiano Fidêncio	094b6b2cf8	ci: k8s: Temporarily disable tests that require a bigger VM instance The list of tests which require a bigger VM instance is: * k8s-number-cpus.bats -- failing on all CIs * k8s-parallel.bats -- only failing on the cbl-mariner CI * k8s-scale-nginx.bats -- only failing on the cbl-mariner CI We'll keep those disabled while we re-work the logic to only run those in a bigger (and more expensive) VM instance. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 01:33:19 +02:00
GabyCT	6fe5cd3bd5	Merge pull request #7937 from GabyCT/topic/iperfbandwidth metrics: Add iperf value for cpu utilization	2023-09-14 16:47:19 -06:00
Fabiano Fidêncio	d0c257b3a7	ci: cache: Push cached artefacts to ghcr.io Let's push the artefacts to ghcr.io and stop relying on jenkins for that. Fixes: #7834 -- part 1 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:39:57 +02:00
Fabiano Fidêncio	108f1b60dd	kata-deploy: Generate latest_{artefact,image_builder} files Right now this is not used, but it'll be used when we start caching the artefacts using ORAS. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:39:57 +02:00
Fabiano Fidêncio	be2eb7b378	ci: cache: Install ORAS in the kata-deploy binaries builder container ORAS is the tool which will help us to deal with our artefacts being pushed to and pulled from a container registry. As both the push to and the pull from will be done inside the kata-deploy binaries builder container, we need it installed there. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:39:57 +02:00
Fabiano Fidêncio	fb24fb0dc1	ci: k8s: devmapper: Use a smaller / cheaper VM instance We don't need to run on a D4s_v5, as those tests are not CPU / memory intensive. With this in mind, let's use a smaller version of the instance, the D2s_v5 one. Fixes: #7958 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:27:05 +02:00
Fabiano Fidêncio	1daf02f5d4	ci: nydus: Use a smaller / cheaper VM instance We don't need to run on a D4s_v5, as those tests are not CPU / memory intensive. With this in mind, let's use a smaller version of the instance, the D2s_v5 one. Fixes: #7958 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:25:41 +02:00
Fabiano Fidêncio	e60d81f554	ci: nerdctl: Use a smaller / cheaper VM instance We don't need to run on a D4s_v5, as those tests are not CPU / memory intensive. With this in mind, let's use a smaller version of the instance, the D2s_v5 one. Fixes: #7958 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:25:41 +02:00
Fabiano Fidêncio	4db416997c	ci: docker: Use a smaller / cheaper VM instance We don't need to run on a D4s_v5, as those tests are not CPU / memory intensive. With this in mind, let's use a smaller version of the instance, the D2s_v5 one. Fixes: #7958 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:25:41 +02:00
Fabiano Fidêncio	32841827b8	ci: cri-containerd: Use a smaller / cheaper VM instance We don't need to run on a D4s_v5, as those tests are not CPU / memory intensive. With this in mind, let's use a smaller version of the instance, the D2s_v5 one. Fixes: #7958 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:25:35 +02:00
Fabiano Fidêncio	92fff129fd	ci: k8s: Don't set cpu limit request for k8s-inotify test Without setting the cpu limit / request to 1, we can make this test run in a smaller VM instance without any issue. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 22:03:16 +02:00
Fabiano Fidêncio	faf98c0623	ci: Reduce the size of the AKS VMs We do not need a very powerful machine for our tests, as we're not building anything there. The instance we switched to (Standard_D2s_v5) still has nested virt available, as shown here[0], but has half of the amount of vCPUs / Memory, which should be fine only for running the tests, costing us basically half of the price[1]. [0]: https://learn.microsoft.com/en-us/azure/virtual-machines/dv5-dsv5-series [1]: https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/#pricing Fixes: #7955 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 22:03:16 +02:00
Fabiano Fidêncio	adc18ecdb1	ci: cache: For consistency, read all used env vars Instead of having some of them only being considered if explicitly passed to the script. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 20:24:48 +02:00
Fabiano Fidêncio	c7a851efd7	ci: cache: Pass the exposed env vars to the kata-deploy binaries in docker As the environment variables are now being passed down from the GitHub Actions, let's make sure they're exposed to the container used to build the kata-deploy binaries, and during the build process we'll be able to use those to log in and push the artefacts to the OCI registry, using ORAS. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 20:24:48 +02:00
Fabiano Fidêncio	2e8b41f39c	Merge pull request #7954 from fidencio/topic/ci-cache-using-oras-part-0 ci: cache: Export env vars needed to use ORAS	2023-09-14 20:23:55 +02:00
Fabiano Fidêncio	6bd15a85d5	ci: cache: Export env vars needed to use ORAS We do the build of our artefacts inside a container image, and we need to expose some env vars to the container so ORAS can be used there to push the artefacts we want to cache to ghcr.io. The env vars we're exposing are: * ARTEFACT_REGISTRY: The registry where we're going to save the artefacts. * ARTEFACT_REGISTRY_USERNAME: The username to log in to the registry, as ORAS does not use the same json file used by docker. * ARTEFACT_REGISTRY_PASSWORD: The password to log in to the registry, as ORAS does not use the same json file used by docker. * TARGET_BRANCH: The target branch, which will be part of the tag of the artefact, as we may end up caching the artefacts for both main and stable branches. Fixes: #7834 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 19:36:33 +02:00
Gabriela Cervantes	cd4fd1292a	metrics: Add iperf cpu utilization limit for qemu This PR adds the iperf cpu utilization limit for qemu for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-14 17:17:47 +00:00
Gabriela Cervantes	df5cd10ea0	metrics: Add iperf value for cpu utilization This PR adds the iperf value for cpu utilization for kata metrics. Fixes #7936 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-14 16:06:49 +00:00
Jeremi Piotrowski	b54dd8cdf4	Merge pull request #7704 from jepio/vfio-part-1 gha: vfio: Import test script	2023-09-14 16:45:31 +02:00
Jeremi Piotrowski	a96050a7ad	tests: Apply timeout to 'ctr t kill' This task has been observed to hang at times. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	9d93036783	tests/vfio: Bump VM image to Fedora 38 We need a very recent L2 guest kernel to fix all the bugs that occur in nested virtualization. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	faee59b520	tests/vfio: Accept single device in vfio group for CLH cloud hypervisor does not emulate pcie switches or pci bridges, so we need to accept a lonely device. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	df3dc1105c	tests/vfio: Get rid of sync's It is fine to start a VM with the disk image without syncing it as we now run the test in an ephemeral Azure instance. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	7211c3dccc	gha: vfio: Set test timeout to 15m Sometimes the test gets stuck running commands in the container - need to investigate why later. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	1b02f89e4f	packaging: kernel: Enable VIRTIO_IOMMU on x86_64 Cloud Hypervisor exposes a VIRTIO_IOMMU device to the VM when IOMMU support is enabled. We need to add it to the whitelist because dragonball uses kernel v5.10 which restricted VIRTIO_IOMMU to ARM64 only. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	3a1db7a86b	runtime: clh: Support enabling iommu by enabling IOMMU on the default PCI segment. For hotplug to work we need a virtualized iommu and clh exposes one if there is some device or PCI segment that requests it. I would have preferred to add a separate PCI segment for hotplugging vfio devices but unfortunately kata assumes there is only one segment all over the place. See create_pci_root_bus_path(), split_vfio_pci_option() and grep for '0000'. Enabling the IOMMU on the default PCI segment requires enabling IOMMU on every device that is attached to it, which is why it is sprinkled all over the place. CLH does not support IOMMU for VirtioFs, so I've added a non IOMMU segment for that device. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	9f1a42c6cc	tests/vfio: Give commands 30s to execute This is to catch the case of the guest getting stuck. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	b46b0ecf8b	tests/vfio: Configure a value for 'hot_plug_vfio' for both vmms This shouldn't be hiding behind only a qemu check, we need this for clh as well. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	bfc93927fb	runtime: Remove redundant check in checkPCIeConfig There is no way for this branch to be hit, as port is only set when it is different than config.NoPort. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	7c4e73b609	runtime: Add test cases for checkPCIeConfig These test cases show which options are valid for CLH/Qemu, and test that we correctly catch unsupported combinations. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	fc51e4b9eb	runtime: Check config for supported CLH (cold|hot)_plug_vfio values The only supported options are hot_plug_vfio=root-port or no-port. cold_plug_vfio is not supported yet. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	509771e6f5	runtime: clh: Add hot_plug_vfio entry to config hot_plug_vfio needs to be set to root-port, otherwise attaching vfio devices to CLH VMs fails. Either cold_plug_vfio or hot_plug_vfio is required, and we have not implemented support for cold_plug_vfio in CLH yet. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	5f6475a28a	tests/vfio: Gather debug info and disable tdp_mmu tdp_mmu had some issues up until around Linux v6.3 that make it work particularly badly when running nested on Hyper-V. Reload the module at the start of the test and disable the tdp_mmu param. Gather debug info at the end of the test to make it easier to figure out what went wrong. This uses github actions group syntax so that each section can be collapsed. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	8fffdc81c5	tests/vfio: Capture journal from vm For debugging (though this doesn't get exposed yet). Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	df815087e7	tests/vfio: Change to get the test working in GHA - reduce memory and cpu usage to fit in a D4s_v5 - source correct lib - mount workspace from 9p - disable cpu mitigations for speed - drop unused commands and variables - install containerd - install kata from built artifacts Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	a92ddeea15	tests/vfio: Move dependency installation to gha-run.sh To match the flow of other github actions workflows. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	5a551a85b1	gha: vfio: Import jobs scripts from tests repo This imports the vfio test scripts github.com/kata-containers/tests. The test case doesn't work yet but doing the changes in a separate commit will make it easier to track the changes. The only change in this commit is renaming vfio_jenkins_job_build.sh -> vfio_fedora_vm_wrapper.sh Fixes: #6555 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Fabiano Fidêncio	a1e3fa7ac4	Merge pull request #7905 from microsoft/danmihai1/mariner-annotations tests: fix kernel and initrd annotations	2023-09-14 10:37:42 +02:00
GabyCT	1d331124ad	Merge pull request #7925 from GabyCT/topic/bandwidthlimit metrics: Add iperf bandwidth value for kata metrics	2023-09-13 17:43:55 -06:00
Gabriela Cervantes	49e2fa189c	metrics: Increase jitter value for qemu This PR increases the jitter value for qemu for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-13 22:36:09 +00:00
Gabriela Cervantes	49234433a7	metrics: Increase value limit for jitter in clh This PR increases the value limit for jitter in clh. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-13 21:27:08 +00:00
David Esparza	0a24d3f718	Merge pull request #7923 from GabyCT/topic/addcassandradoc metrics: Add Cassandra Metrics documentation	2023-09-13 10:17:00 -06:00
GabyCT	c565053bac	Merge pull request #7895 from GabyCT/topic/removewarning metrics: Remove warning from metrics documentation	2023-09-13 10:16:38 -06:00
Fabiano Fidêncio	8b9df1d32e	Merge pull request #7929 from fidencio/topic/use-tcp-port-ping-on-docker-nerdctl-tests ci: docker: nerdctl: Switch to tcp port 80 ping	2023-09-13 15:46:31 +02:00
Peng Tao	55ca7e8aec	Merge pull request #7907 from Xuanqing-Shi/7876/network-devices-naming-conflict runtime: Naming conflict of network devices	2023-09-13 19:29:41 +08:00
Fabiano Fidêncio	813bfdec01	ci: docker: nerdctl: Use io.containerd.kata-${KATA_HYPERVISOR}.io This will ensure that we're calling the correct binary for the hypervisor. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-13 13:10:14 +02:00
Fabiano Fidêncio	46bc0b1c01	ci: nerdctl: Create the containerd config Otherwise we'll fail to configure kata-containers in the `install-kata` step. This is mostly needed because the nerdctl-full tarball doesn't provide a containerd configuration, just the binary, as containerd does not actually require a configuration file to run with the default config. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-13 13:00:57 +02:00
Fabiano Fidêncio	13968aa7f6	ci: nerdctl: Switch to tcp port 80 ping TIL that the Azure VMs we use are created without an explicit outbound connectivity defined. This leads us to issues using `ping ...` as part of our tests, and when consulting Jeremi Piotrowski about the issue he pointed me to two interesting links: * https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access * https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity For your own sanity, do not read the comments, after all this is the internet. :-) Anyways, the suggestion is to use nping instead, which is provided by the nmap package, so we can explicitly switch to using the tcp port 80 for the ping. With this in mind, I'm switching the image we use for the test and using one that provides nping as a possible entry point, and from now on (this part of) the tests should work. Fixes: #7910 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-13 13:00:57 +02:00
Fabiano Fidêncio	e0c811678b	ci: docker: Switch to tcp port 80 ping TIL that the Azure VMs we use are created without an explicit outbound connectivity defined. This leads us to issues using `ping ...` as part of our tests, and when consulting Jeremi Piotrowski about the issue he pointed me to two interesting links: * https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access * https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity For your own sanity, do not read the comments, after all this is the internet. :-) Anyways, the suggestion is to use nping instead, which is provided by the nmap package, so we can explicitly switch to using the tcp port 80 for the ping. With this in mind, I'm switching the image we use for the test and using one that provides nping as a possible entry point, and from now on (this part of) the tests should work. Fixes: #7910 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-13 13:00:57 +02:00
shixuanqing	1636abbe1c	runtime: issue with non-empty []Endpoint in RemoveEndpoints In RemoveEndpoints(), when the endpoints parameter isn't empty, using idx may result in wrong endpoint removals. To improve this, directly passing the endpoint parameter helps locate the correct elements within n.eps. Fixes: #7732 Signed-off-by: shixuanqing <1356292400@qq.com> Update src/runtime/virtcontainers/network_linux.go Co-authored-by: Xuewei Niu <justxuewei@apache.org>	2023-09-13 09:47:18 +00:00
Peng Tao	9766f9090c	Merge pull request #7719 from beraldoleal/nullable Remove gogoproto.nullable extension	2023-09-13 15:11:56 +08:00
David Esparza	c2b2a00ad9	Merge pull request #7899 from GabyCT/topic/startdocker metrics: Ensure docker is running in init_env	2023-09-12 23:01:26 -06:00
Gabriela Cervantes	0aa073967d	metrics: Add iperf bandwidth value for qemu This PR adds the iperf bandwidth value for qemu for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-12 20:57:14 +00:00
Dan Mihai	c0ad914766	tests: fix kernel and initrd annotations Fix kernel and initrd annotations in the k8s tests on Mariner. These annotations must be applied to the spec.template for Deployment, Job and ReplicationController resources. Fixes: #7764 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-09-12 20:15:25 +00:00
Gabriela Cervantes	615c1cbf19	metrics: Add iperf bandwidth value for kata metrics This PR adds the iperf bandwidth value for kata metrics. Fixes #7924 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-12 19:30:24 +00:00
Gabriela Cervantes	d53eb73eec	metrics: Ensure docker is running in init_env This PR ensures that docker is running as part of the init_env function in kata metrics, to avoid failures such as docker not running, which make the kata metrics CI fail. Fixes #7898 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-12 19:13:09 +00:00
GabyCT	c0d502493e	Merge pull request #7921 from dborquez/metrics_disable_fio_test metrics: this PR skips the FIO test temporarily to fix issues	2023-09-12 12:08:48 -06:00
Gabriela Cervantes	ad08321b83	metrics: Add Cassandra Metrics documentation This PR adds the Cassandra Metrics documentation for kata metrics. Fixes #7922 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-12 16:30:35 +00:00
David Esparza	a58ea66592	metrics: this PR skips the FIO test temporarily to fix issues FIO test is showing ongoing issues when running in k8s. Working on running FIO on the ctr client which has been shown to be stable. Fixes: #7920 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-09-12 10:23:57 -06:00
Fabiano Fidêncio	2d8447fc6b	Merge pull request #7916 from fidencio/topic/add-functional-nerdctl-tests ci: Add a very basic nerdctl sanity test	2023-09-12 17:47:08 +02:00
James O. D. Hunt	7feb8de9dc	Merge pull request #7887 from jodh-intel/hypervisor-remove-debug-kernel-options runtime-rs: hypervisor: Remove debug kernel options	2023-09-12 16:31:48 +01:00
Fabiano Fidêncio	f536ef5ce1	ci: docker: Also run the smoke test with runc This will help us to make sure that the failure is actually related to Kata Containers. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-12 16:54:02 +02:00
Fabiano Fidêncio	c83f167c59	ci: docker: Run the tests after the kata-static is created There's no reason to wait till the payload is created to run the tests, as we rely on the tarball, not on the kata-deploy payload. That was a mistake on my side, and that's already fixed for the nerdctl tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-12 16:53:47 +02:00
Fabiano Fidêncio	12d833d07d	ci: Add a very basic nerdctl sanity test Let's add a very basic sanity test to check that we can spawn a container using nerdctl + Kata Containers. This will ensure that, at least, we don't regress to the point where this feature doesn't work at all. In the future, we should also test all the VMMs with devmapper, but that's for a follow-up PR after this test is working as expected. Fixes: #7911 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-12 16:52:55 +02:00
Greg Kurz	be71a0ab4e	Merge pull request #7811 from stevenhorsman/bump-rust-to-1.72 versions: Bump rust version	2023-09-12 15:30:35 +02:00
Fabiano Fidêncio	b020912629	Merge pull request #7913 from fidencio/topic/add-functional-docker-tests ci: Add a very basic docker sanity test	2023-09-12 15:28:49 +02:00
Fabiano Fidêncio	348b8644d6	ci: Add a very basic docker sanity test Let's add a very basic sanity test to check that we can spawn a container using docker + Kata Containers. This will ensure that, at least, we don't regress to the point where this feature doesn't work at all. For now we're running this test against Cloud Hypervisor and QEMU only, due to an already reported issue with dragonball: https://github.com/kata-containers/kata-containers/issues/7912 In the future, we should also test all the VMMs with devmapper, but that's for a follow-up PR after this test is working as expected. Fixes: #7910 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-12 15:15:26 +02:00
stevenhorsman	a75fd5eb81	runk: Fix rust unnecessary mut error - Fix `error: variable does not need to be mutable` in rust 1.72 Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	a31c145172	kata-ctl: useless-vec warning - Fix clippy::useless-vec warning Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	c8419fc3bb	kata-ctl: Resolve non-minimal-cfg warning - In rust 1.72, clippy warned clippy::non-minimal-cfg as the cfg has only one condition, so doesn't need to be wrapped in the any combinator. Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	3eaf68d954	agent-ctl: Allow clippy lint - Allow `clippy::redundant-closure-call` which has issues with the guard function passed into the `run_if_auto_values` macro Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	1d8b78959d	runtime-rs: Fix useless-vec warning Fix clippy::useless-vec warning Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	99f3d69e94	runtime-rs: Remove mut Fix `error: variable does not need to be mutable` Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	16fbc27b09	dragonball: Allow ambiguous-glob-reexports The bindgen generated code is triggering lots of ambiguous-glob-reexports warnings in rust 1.70+ Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	bbf1919516	dragonball: Resolve non-minimal-cfg warning - In rust 1.72, clippy warned clippy::non-minimal-cfg as the cfg has only one condition, so doesn't need to be wrapped in the all combinators. Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	75cfdd5d59	agent: config: Allow clippy lint - Allow `clippy::redundant-closure-call` in `from_cmdline` which has issues with the guard function passed into the `parse_cmdline_param` macro Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	f3a0fd5907	agent: config: Fix useless-vec warning Fix clippy::useless-vec warning Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	9e423bd3d6	libs: Fix clippy unnecessary hashes error - Fix error: unnecessary hashes around raw string literal Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	444395050a	versions: Bump rust version Bump rust to 1.72.0 to test what extra warnings/issues we get Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
Yipeng Yin	a16b0962b5	chore(cargo): update cargo lock Update cargo lock for runtime-rs, agent and kata-ctl. Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>	2023-09-12 15:27:38 +08:00
Chao Wu	c800d0739f	Merge pull request #7889 from UiPath/fix-dragonball-build dragonball: fix for non-deterministic builds	2023-09-12 14:06:18 +08:00
shixuanqing	ca4b6b051d	runtime: Naming conflict of network devices When creating a new endpoint, we check existing endpoint names and automatically adjust the naming of the new endpoint to ensure uniqueness. Fixes: #7876 Signed-off-by: shixuanqing <1356292400@qq.com>	2023-09-12 04:29:51 +00:00
Guixiong Wei	202049f35e	feat(runtime-rs): introduce huge page type to select VM RAM's backend This commit allows us to specify the huge page backend when enabling huge page. Currently, we support two backends: thp and hugetlbfs, the default is hugetlbfs. To ensure backward compatibility, we introduce another configuration item "hugepage_type" to select the memory backend, which is available only when "enable_hugepages" is true. Besides, we add an annotation "io.katacontainers.config.hypervisor.hugepage_type" to configure huge page type per pod. Fixes: #6703 Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com> Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>	2023-09-12 11:28:27 +08:00
Zhongtao Hu	e1f54f96d0	Merge pull request #7766 from Apokleos/wrap-vsock-virtiofs runtime-rs: bring hybrid vsock devices in manager.	2023-09-12 09:27:34 +08:00
GabyCT	af29eeb8b1	Merge pull request #7901 from fidencio/topic/ci-target-branch-fixes-follow-up-3 ci: use github.ref_name instead of $GITHUB_REF_NAME	2023-09-11 15:31:29 -06:00
Fabiano Fidêncio	f811b064ca	ci: use github.ref_name instead of $GITHUB_REF_NAME As, regardless of what's mentioned in the documentation, it seems that $GITHUB_REF_NAME is passed down as a literal string. Fixes: #7414 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-11 22:14:55 +02:00
Fabiano Fidêncio	dc0b350e49	Merge pull request #7900 from fidencio/topic/ci-target-branch-fixes-follow-up-2 ci: Add more target-branch related fixes	2023-09-11 21:26:26 +02:00
Fabiano Fidêncio	6d795c089e	ci: Add more target-branch related fixes The ones for the payload-after-push.yaml and ci-nightly.yaml are not that important right now, but they're needed for when we start running those on stable branches as well. The other ones were missed during `bd24afcf73`. Fixes: #7414 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-11 20:42:57 +02:00
Fabiano Fidêncio	07d0ad0ad7	Merge pull request #7897 from fidencio/topic/ci-devmapper-do-the-rebase-as-well ci: Fix target-branch usage	2023-09-11 20:30:53 +02:00
Fabiano Fidêncio	d7f991d139	Merge pull request #7151 from Yuan-Zhuo/fix-systemd-cgroup agent: optimize the code of systemd cgroup manager	2023-09-11 20:15:51 +02:00
Fabiano Fidêncio	8509c31870	ci: Fix target-branch usage We missed those as part of `bd24afcf73`. Fixes: #7414 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-11 20:10:27 +02:00
Gabriela Cervantes	060499dcae	metrics: Remove warning from metrics documentation Now that the metrics migration from the tests to kata containers has been completed, this PR removes the warning from the main metrics documentation. Fixes #7894 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-11 16:41:48 +00:00
GabyCT	b384757ac7	Merge pull request #7874 from fidencio/topic/manually-rebase-branches-atop-of-the-target-one gha: Manually rebase PR atop of the target branch before testing	2023-09-11 10:35:01 -06:00
Fabiano Fidêncio	46e73cf7a2	Merge pull request #7884 from fidencio/topic/update-kernel-to-the-latest-lts-plus-bring-in-erofs-patches Update kernel to the latest LTS release (v6.1.52) and bring in erofs patches needed for the CC work	2023-09-11 13:58:43 +02:00
James O. D. Hunt	c0f697fcc5	runtime: Allow kernel_params annotation To support the removal of the `initcall_debug` and `earlyprintk=` options from the default guest kernel cmdline, add `kernel_params` to the list of enabled annotations to allow those kernel options (or others) to be set using `kata-deploy` for either runtime. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-11 12:12:12 +01:00
Alexandru Matei	b03e49794e	dragonball: fix for non-deterministic builds Fixes: #7888 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2023-09-11 14:07:10 +03:00
Fabiano Fidêncio	93bad13769	Merge pull request #7875 from fidencio/topic/kata-deploy-fix-arm64-image-build kata-deploy: Fix aarch64 image build	2023-09-11 11:36:52 +02:00
James O. D. Hunt	976d10150c	runtime-rs: hypervisor: Remove debug kernel options Removed the following kernel command line options: - `earlyprintk=ttyS0` - `initcall_debug` Both these options are only useful when debugging a guest kernel failure which is not a common occurrence. Further, the `earlyprintk=` option can have a large negative performance impact (it can increase the VM boot time significantly). If the user wishes to use either of these options, they can add them to the `kernel_params=` setting in the Kata configuration file's hypervisor stanza. Fixes: #7886. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-11 09:43:39 +01:00
Fabiano Fidêncio	fde34610cd	kernel: Add erofs patches needed for CC related work All the patches have already been merged upstream and they've just been cherry-picked to this branch. Fixes: #7885 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-11 10:39:37 +02:00
Fabiano Fidêncio	dc6a4588a2	versions: Bump kernel to the latest LTS release (6.1.52) We're bumping here in order to make our lives easier backporting EROFS patches needed for the CC related work. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-11 10:32:16 +02:00
James O. D. Hunt	52f6449b70	kata-manager: Remove initcall_debug kernel option Removed the addition of the `initcall_debug` kernel option when agent debugging enabled. This option has nothing to do with the agent. If the user wishes to use this option, they can add it to the `kernel_params=` setting in the Kata configuration file's hypervisor stanza. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-11 09:31:44 +01:00
Fabiano Fidêncio	6cd5d83a37	Merge pull request #7865 from gkurz/fix-more-virtiofs-args runtime: Fix more virtiofs args	2023-09-09 21:30:16 +02:00
Fabiano Fidêncio	8b4a0b368f	kata-deploy: Remove curl after it's used There's no need to keep curl there after the kubectl binary has already been downloaded. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-09 10:52:05 +02:00
Fabiano Fidêncio	139c7f03ab	kata-deploy: Fix aarch64 image build Similarly to what's been done for x86_64 -> amd64, we need to do an aarch64 -> arm64 change in order to be able to download the kubectl binary. Fixes: #7861 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-09 10:51:52 +02:00
Fabiano Fidêncio	94f5a69346	Merge pull request #7862 from fidencio/topic/kata-deploy-use-alpine-as-base-image kata-deploy: Switch to an alpine image	2023-09-09 09:02:13 +02:00
Yuan-Zhuo	470d065415	agent: optimize the code of systemd cgroup manager 1. Directly support CgroupManager::freeze through systemd API. 2. Avoid always passing unit_name by storing it into DBusClient. 3. Realize CgroupManager::destroy more accurately by killing systemd unit rather than stop it. 4. Ignore no such unit error when destroying systemd unit. 5. Update zbus version and corresponding interface file. Acknowledgement: error handling for no such systemd unit error refers to Fixes: #7080, #7142, #7143, #7166 Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com> Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>	2023-09-09 13:56:43 +08:00
GabyCT	fa818bfad1	Merge pull request #7867 from GabyCT/topic/optimizedimage metrics: Use TensorFlow optimized image	2023-09-08 11:34:21 -06:00
Fabiano Fidêncio	bd24afcf73	gha: Manually rebase PR atop of the target branch before testing We're changing what's been done as part of `ac939c458c`, as we've noticed issues using `github.event.pull_request.merge_commit_sha`. Basically, whenever a force-push would happen, the reference of merge_commit_sha wouldn't be updated, leading us to test PRs with the old code. :-/ In order to get the rebase properly working, we need to ensure we pull the hash of the commit as part of checkout action, and ensure fetch-depth is set to 0. Fixes: #7414 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 18:56:31 +02:00
GabyCT	dc7414f5c1	Merge pull request #7870 from dborquez/metrics_fio_fix_clean_env_order metrics: fix FIO test initialization	2023-09-08 10:28:10 -06:00
Greg Kurz	72c510d057	runtime/virtiofsd: Drop all references to "--cache=none" This syntax belongs to the legacy C virtiofsd implementation that we don't support anymore since kata-containers 3.1.3 because of other API breaking changes. People have been warned to switch from "none" to "never" since kata-containers 2.5.2. Let's officially do that. The compat code that would convert "none" to "never" isn't needed anymore. Just drop it. Fixes #7864 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-09-08 17:57:30 +02:00
Beraldo Leal	ead724bec1	protocol: removing gogo.nullable feature gogo.nullable is the main gogo.protobuf feature used here. Since we are trying to remove gogo.protobuf, the first reasonable step seems to be to remove this feature. This is a core update, and it will change how the structs are defined. I could spot only a few places using those structs, based on make check/build. Fixes #7723. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-08 11:49:01 -04:00
Beraldo Leal	d8e4bb9859	protocol: remove unused PROTO_FILE env There is no reference to PROTO_FILE and this is not working. Also we are not inside a Makefile, so makes sense to adapt the usage to reflect the script instead of a make command. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-08 11:49:01 -04:00
Beraldo Leal	5e1106a770	protocol: remove unused import_path import_path is used as the default package when no input files specify go_package. However, all the files we are currently building already have a go_package definition, making this behavior both redundant and error-prone. Additionally, one of our files (types.pb.go) resides outside the grpc directory, indicating that it's indeed ignored but also inconsistent. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-08 11:49:01 -04:00
Beraldo Leal	87accaaecb	protocol: use workdir during build Currently, the script searches for .proto files within $GOPATH/. Consequently, modifications to a definition file in the current working directory won't influence the output .pb.go if the directory is outside of $GOPATH. For developers, it's more intuitive to alter the local codebase than the version stored in $GOPATH. With this modification, the generated .pb.go files will be relative to the current working directory, removing the need to clone this project under $GOPATH/src/github.com/kata-containers. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-08 11:49:01 -04:00
Beraldo Leal	711a7ed965	protocol: remove mapping definitions The definitions are already specified in the .proto files using the go_package option. Centralizing them in one location reduces the potential for errors and simplifies the script. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-08 11:49:01 -04:00
Beraldo Leal	8db84c1bd2	protocol: force GOPATH to be set Currently, if GOPATH is not set, errors will raise since protoc is using GOPATH to find packages. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-08 11:49:01 -04:00
Beraldo Leal	68156d77ac	protocol: breaking lines to improve readability Just a small change to improve the readability of modules before the actual changes. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-08 11:49:01 -04:00
Fabiano Fidêncio	670a8e9c73	kata-deploy: Switch to an alpine image This will make our image smaller, and still ensure its multi-arch support. Fixes: #7861 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 17:39:51 +02:00
Fabiano Fidêncio	0b26a5d053	Merge pull request #7871 from fidencio/topic/ci-add-k8s-devmapper-tests-follow-up-3 ci: k8s: Add clean-up-garm argument for gha-run.sh	2023-09-08 17:27:57 +02:00
Fabiano Fidêncio	9d74b7ccc9	k8s: ci: Skip "Pod quota" test with firecracker The test is failing, and an issue has been opened to track it. For now, let's skip it. Issue: https://github.com/kata-containers/kata-containers/issues/7873 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 15:51:46 +02:00
Fabiano Fidêncio	f6cd3930c5	ci: k8s: Remove useless skip statement from tests There's absolutely no need to have the skip check as part of the test itself when it's already done as part of the setup function. We're only touching the files here that were touched in the previous commit. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 14:25:29 +02:00
Fabiano Fidêncio	3cc20b47a6	ci: k8s: Also check for "fc" (for firecracker) Let's keep both checks for now, but in the future we'll be able to remove the check for "firecracker", as the hypervisor name used as part of the GitHub Actions has to match what's used as part of the kata-deploy stuff, which is `fc` (as in `kata-fc` for the runtime class) instead of `firecracker`. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 14:25:24 +02:00
Fabiano Fidêncio	b5bad3cb0f	ci: k8s: Add clean-up-garm argument for gha-run.sh The tests are failing to finish as the argument is invalid. Fixes: #6542 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 14:04:50 +02:00
Fabiano Fidêncio	05e2e7636e	Merge pull request #7868 from fidencio/topic/ci-add-k8s-devmapper-tests-follow-up-2 ci: k8s: Second round of fix-ups with the devmapper CI	2023-09-08 11:02:20 +02:00
Fabiano Fidêncio	aaec5a09f3	ci: k8s: devmapper tests should be using ubuntu 20.04 That's what we've been using as part of Jenkins, so let's ensure things will work as they did before, and only after that consider upgrading the base OS used for the tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 10:09:04 +02:00
Fabiano Fidêncio	27fa7d828d	ci: k8s: Add a kata-deploy-garm target We've been using the `kata-deploy-tdx` target as that also uses k3s as base, but it's better to just have a specific garm target. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 10:09:04 +02:00
Fabiano Fidêncio	fa62a4c01b	ci: k8s: Export KUBERNETES env var So we have a better control on which flavour of kubernetes kata-deploy is expected to be targetting. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 10:09:04 +02:00
Fabiano Fidêncio	8c9380a798	ci: k8s: Install bats on GARM runners GARM runners do not come with the whole set of tools we need, or are used to when it comes to the GHA runners, so we need to manually install bats on those. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 10:09:04 +02:00
Fabiano Fidêncio	3de23034f8	ci: k8s: Wait some time after restarting k3s Let's put a 1 minute sleep, just to make sure everything is back up again. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-07 23:46:58 +02:00
David Esparza	adfea55b8f	metrics: fix FIO test initialization This PR changes the order in which the FIO test first cleans the environment and then checks if the environment is indeed clean. Fixes: #7869 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-09-07 15:41:59 -06:00
Fabiano Fidêncio	2df183fd99	ci: k8s: Append, instead of overwrite, the devmapper config As we were using `tee` without the `-a` (or `--append`) option, the containerd config would be overwritten, leading to a NotReady state of the Node. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-07 23:12:55 +02:00
Fabiano Fidêncio	369a8af8f7	ci: k8s: Decrease k3s sleep from 4 to 2 minutes It should be plenty, and worked well in local tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-07 23:12:55 +02:00
Fabiano Fidêncio	ada65b988a	ci: k8s: Use vanilla kubectl with k3s Let's download the vanilla kubectl binary into `/usr/bin/`, as we need to avoid hitting issues like: ```sh error: open /etc/rancher/k3s/k3s.yaml.lock: permission denied ``` The issue basically happens because k3s links `/usr/local/bin/kubectl` to `/usr/local/bin/k3s`, and that does extra stuff that vanilla `kubectl` doesn't do. Also, in order to properly use the k3s.yaml config with the vanilla kubectl, we're copying it to ~/.kube/config. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-07 23:12:55 +02:00
Fabiano Fidêncio	ad45ab5d33	ci: k8s: Ensure k3s is deployed with --write-kubeconfig-mode=644 Otherwise the /etc/rancher/k3s/k3s.yaml is not readable by users other than root. As --write-kubeconfig-mode is being passed, and that's an option that has to be passed to the `server`, -s is also added to the command line. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-07 23:12:55 +02:00
Fabiano Fidêncio	028a97e0d5	ci: k8s: Use the proper command for sleep `wait` waits for a job to complete, not a number of seconds. Not sure how I got that wrong in the first place, but it is what it is. Fixes: #6542 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-07 23:12:55 +02:00
David Esparza	34f580901f	Merge pull request #7824 from dborquez/fix_memory_usage_initialization metrics: re-enable memory-usage initialization step	2023-09-07 14:24:27 -06:00
Gabriela Cervantes	3a427795ea	metrics: Use TensorFlow optimized image This PR replaces the ubuntu image for one which has TensorFlow optimized for kata metrics. Fixes #7866 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-07 15:38:51 +00:00
Chao Wu	cd8c217ee1	Merge pull request #6879 from openanolis/chao/update_upstream_upcall_feature Dragonball: optimize the placement of dbs-upcall features	2023-09-07 18:07:53 +08:00
Fabiano Fidêncio	dfa1cce916	Merge pull request #7860 from fidencio/topic/ci-add-k8s-devmapper-tests-follow-up-1 ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml	2023-09-07 11:48:30 +02:00
Fabiano Fidêncio	8d99972a8a	ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml integrations -> integration integrtion -> integration Fixes: #6542 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-07 11:31:30 +02:00
Fabiano Fidêncio	0483d3d16d	Merge pull request #7841 from fidencio/topic/ci-add-k8s-devmapper-tests ci: k8s: Add k8s devmapper tests (part 0)	2023-09-07 10:53:09 +02:00
Jeremi Piotrowski	f6cc01d77c	Merge pull request #7833 from jepio/kata-static-fix-ownership kata-deploy: Create kata-static.tar with correct ownership	2023-09-07 10:16:23 +02:00
Peng Tao	435e890cd9	Merge pull request #7703 from bergwolf/github/nerdctl-fc runtime: run prestart hooks before starting VM for FC	2023-09-07 10:55:31 +08:00
Chao Wu	deed1b927d	Dragonball: optimize the placement of dbs-upcall features Currently, the dbs-upcall features have two problems that need to be fixed: There are redundant dbs-upcall features that need to be removed. Some places should be controlled by dbs-upcall but are not. This commit will fix those two problems. fixes: #6878 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-09-07 10:27:29 +08:00
Fabiano Fidêncio	0e8bd50cbb	ci: k8s: Add k8s devmapper tests (part 0) Let's enable the devmapper kubernetes tests to match exactly what's been tested as part of the Jenkins CI. Fixes: #6542 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-06 23:08:38 +02:00
Fabiano Fidêncio	b28b54df04	ci: k8s: Add a function to configure devmapper for containerd This function right now is completely based on what's part of the tests repo[0], and that's the reason I'm keeping the `Signed-off-by` of all the contributors to that file. This is not perfect, though, as it changes the default snapshotter to devmapper, instead of only doing so for the Kata Containers specific runtime handlers. OTOH, this is exactly what we've always been doing as part of the tests. We'll improve it, soon enough, when we get to also add a way for kata-deploy to set up different snapshotters for different handlers. But, for now, this is as good (or as bad) as it's always been. It's important to note that the devmapper setup doesn't take into consideration a BM machine, and this is not suitable for that. We're really only targetting GHA runners which will be thrown away after the run is over. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Shiming Zhang <wzshiming@foxmail.com> Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-06 23:08:17 +02:00
Fabiano Fidêncio	54f7117212	ci: k8s: Add a function to deploy k3s One can use different kubernetes flavours for getting a kubernetes cluster up and running. As part of our CI, though, I really would like to avoid contributors spending time maintaining and updating kubernetes dependencies, as done with the tests repo, and which has been proven to be really good on getting things rotten. With this in mind, I'm taking the bullet and using "k3s" as the way to deploy kubernetes for the devmapper related tests, and that's the reason I'm adding a function to do so, and this will be used later on as part of this series. It's important to note that the k3s setup doesn't take into consideration a BM machine, and this is not suitable for that. We're really only targetting GHA runners which will be thrown away after the run is over. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-06 23:07:41 +02:00
David Esparza	cf258090aa	Merge pull request #7843 from GabyCT/topic/ffiolimit metrics: Add write 95 percentile FIO value	2023-09-06 14:52:00 -06:00
Fabiano Fidêncio	c5e1e7ddc3	Merge pull request #7854 from fidencio/topic/runtime-allow-virtio_fs_extra_args-annotation runtime: Allow virtio_fs_extra_args annotation	2023-09-06 19:20:40 +02:00
Greg Kurz	81536f21af	runtime/qemu: Pass "--xattr" to virtiofsd instead of "-o xattr" The "-o" syntax belongs to the legacy C virtiofsd. It is deprecated with the rust implementation. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-09-06 17:50:35 +02:00
Fabiano Fidêncio	b1dd09a4d3	runtime: Allow virtio_fs_extra_args annotation Some use cases may just require passing extra arguments to virtiofsd, and having this disabled by default makes it impossible to set when using kata-deploy, as changes in the configuration file would be overwritten by the daemon-set. With this in mind, let's allow users to pass whatever they need (and here I'm specifically looking at `--xattr`) as a virtio_fs_extra_arg. Fixes: #7853 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-06 17:11:16 +02:00
Hyounggyu Choi	d27fe18167	Merge pull request #7849 from BbolroC/hot-fix-dockerbuild packaging: do not install docker-compose-plugin for s390x|ppc64le	2023-09-06 13:13:25 +02:00
Hyounggyu Choi	2efda20c77	packaging: do not install docker-compose-plugin for s390x|ppc64le This PR is to skip installing docker-compose-plugin while building a `build-kata-deploy` image for s390x|ppc64le. It is a temporary solution to fix current CI failures for s390x regarding `hash sum mismatch`. Fixes: #7848 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-09-06 11:12:03 +02:00
Zhongtao Hu	aa85e0b3ec	Merge pull request #7714 from justxuewei/volumes-cleanup runtime-rs: Fix volumes and rootfs cleanup issues	2023-09-06 10:13:55 +08:00
Gabriela Cervantes	438fbf9669	metrics: Add write 95 percentile for FIO for qemu This PR adds the write 95 percentile for FIO for qemu for checkmetrics for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-05 22:50:31 +00:00
Gabriela Cervantes	024b4d2ffe	metrics: Add write 95 percentile FIO value This PR adds the write 95 percentile FIO value for checkmetrics for kata metrics. Fixes #7842 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-05 21:00:05 +00:00
GabyCT	3e3a91fd2c	Merge pull request #7577 from GabyCT/topic/enableiperfm metrics: Enable iperf benchmark on gha for kata metrics	2023-09-05 14:53:47 -06:00
Gabriela Cervantes	e98e5cdea2	metrics: Add checkmetrics to gha run script This PR adds the checkmetrics to gha run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-05 17:05:03 +00:00
Gabriela Cervantes	c1edfe5511	metrics: Add checkmetrics value for qemu for iperf This PR adds the checkmetrics value for qemu for iperf benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-05 16:04:52 +00:00
Gabriela Cervantes	6a79ecedf9	metrics: Add jitter value for clh This PR adds jitter value for clh for iperf metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-05 16:04:52 +00:00
Gabriela Cervantes	f609a9a754	metrics: Add test selector to iperf metrics This PR adds test selector to iperf metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-05 16:04:52 +00:00
Gabriela Cervantes	5b8db30422	metrics: Enable iperf benchmark on gha for kata metrics This PR enables the iperf benchmark to run on the gha for kata metrics. Fixes #7575 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-05 16:04:52 +00:00
Jeremi Piotrowski	cf46b056fd	Merge pull request #7839 from openanolis/chao/switch_to_azure CI: switch static-checks-dragonball CI machines to Azure	2023-09-05 10:59:02 +02:00
Chao Wu	60f733d301	CI: switch static-checks-dragonball CI machines to Azure Previously, static-checks-dragonball is using machines from Alibaba Cloud to run all the CI jobs. Currently, we are going through an internal process to apply for the new machines for Dragonball CI. Before the internal process is over, we will temporarily use Azure VM to run static-checks-dragonball jobs. fixes: #7838 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-09-05 15:19:07 +08:00
alex.lyn	7870b33a2d	runtime-rs: bring hybridVsock devices in manager. Currently, virtio_vsock are still outside of the device manager. This causes some management issues, such as the inability to unify PCI address management. Just do some work for hybrid vsock. Fixes: #7655 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-09-05 08:46:56 +08:00
Jeremi Piotrowski	18c94ebbe3	kata-deploy: Create kata-static.tar with correct ownership Pass --owner and --group to the tar invocation to prevent the GitHub runner user from leaking into release artifacts. Fixes: #7832 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-04 17:24:00 +02:00
Fabiano Fidêncio	b663ec21ac	Merge pull request #7803 from GabyCT/topic/readmereportdoc metrics: Add README for kata metrics report	2023-09-03 21:57:13 +02:00
Fabiano Fidêncio	e490b0bc76	Merge pull request #7808 from ManaSugi/fix/remove-manual-chcon osbuilder: Remove chcon operation for guest SELinux	2023-09-03 21:55:02 +02:00
Fabiano Fidêncio	27dab249a0	Merge pull request #7800 from jodh-intel/kata-sys-util-update-tdx-protection-checks kata-sys-util: protection: Update TDX checks	2023-09-02 14:47:51 +02:00
Jiang Liu	d5729e818c	Merge pull request #7819 from jiangliu/storage-cleanup Improve the way to clean up storage devices for sandbox	2023-09-02 17:02:51 +08:00
Jiang Liu	57e7bf14a6	agent: refine StorageDeviceGeneric::cleanup() Refine StorageDeviceGeneric::cleanup() to improve safety. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-02 14:22:21 +08:00
Jiang Liu	53edb19374	agent: implement StorageDeviceGeneric::cleanup() Refactor cleanup_sandbox_storage as StorageDeviceGeneric::cleanup(). Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-02 14:00:26 +08:00
Jiang Liu	0c63453e28	types: make StorageDevice::cleanup() return possible error code Make StorageDevice::cleanup() return possible error code. Fixes: #7818 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-02 13:27:06 +08:00
Jiang Liu	3a3d77b3b5	agent: move StorageDeviceGeneric from kata-types into agent Move StorageDeviceGeneric from kata-types into agent, so we can refactor code later. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-02 13:12:17 +08:00
Jiang Liu	d848126b61	Merge pull request #7821 from jiangliu/storage-leak agent: avoid possible leakage of storage device	2023-09-02 12:40:40 +08:00
Fabiano Fidêncio	4f92e6df90	Merge pull request #7683 from microsoft/danmihai1/policy-tests tests: add policy to existing tests	2023-09-01 23:52:15 +02:00
David Esparza	b151cfd140	metrics: re-enable memory-usage initialization step This PR re-enables the initialization step disabled on `538c965c2b`. Fixes: #7804 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-09-01 14:29:34 -06:00
Fabiano Fidêncio	f3e1a6a94f	osbuilder: alpine: Change mirror As we're hitting a lot of: ``` ERROR: https://dl-5.alpinelinux.org/alpine/v3.18/main: operation timed out ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-01 16:01:42 +00:00
Fabiano Fidêncio	ac612aef5e	osbuilder: alpine: Match the version on versions.yaml We've switched to 3.18 as part of `82cd14ba39`. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-01 16:01:33 +00:00
Jiang Liu	9cd706d1c9	agent: avoid possible leakage of storage device When a storage device is used by more than one container, the second and subsequent instances will leak the storage device reference count, and thus leak the storage device. The reason is: add_storages() increases the reference count of the existing storage device, but forgets to add the device to the `mount_list` array, thus leaking the reference count. Fixes: #7820 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-01 22:52:42 +08:00
Dan Mihai	bf21411e90	tests: add policy to k8s tests Use AGENT_POLICY=yes when building the Guest images, and add a permissive test policy to the k8s tests for: - CBL-Mariner - SEV - SNP - TDX Also, add an example of policy rejecting ExecProcessRequest. Fixes: #7667 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-09-01 14:28:08 +00:00
Dan Mihai	d0e0610679	runtime: config: use the SEV initrd for SNP Thanks Unmesh Deodhar! Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-09-01 14:28:08 +00:00
Fabiano Fidêncio	67fed26f18	runtime: Use TDX image within the qemu-tdx config Let's make sure we use the TDX image as part of the QEMU TDX configuration, which will help us to have the policies tested here. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-01 14:28:08 +00:00
Fabiano Fidêncio	f65ffb23da	Merge pull request #7814 from fidencio/topic/gha-rebase-prs-atop-of-main-for-the-tests gha: Rebase PR atop of the target branch before testing	2023-09-01 16:26:32 +02:00
Fabiano Fidêncio	ef70aeb6b8	Merge pull request #7817 from fidencio/topic/update-alpine-to-its-latest-release versions: Update alpine to its 3.18 version	2023-09-01 14:51:58 +02:00
Fabiano Fidêncio	ac939c458c	gha: Rebase atop of the target branch There are two scenarios we care about here: `pull_request` and `pull_request_target` events triggering a job. `pull_request` event: When using the checkout action, it'll already provide a "rebased atop of main" repo for us, nothing else is needed, and that's basically what we already have as part of the jobs in our CI. `pull_request_target` event: This one is a little bit tricky, as the checkout action, unless passing a specific repo, gives us the PR checked out rebased atop of the HEAD of the PR branch. Jeremi Piotrowski nicely pointed out that we could use github.event.pull_request.merge_commit_sha instead, which is the result of the PR's branch with the official repo target branch. Now, the only case where the contributor's rebase would still be needed is when the action itself has been changed. Fixes: #7414 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-01 11:23:31 +02:00
Jeremi Piotrowski	bde06758b1	Merge pull request #7761 from jepio/iocopy-fix-race runtime: Fix data race in ioCopy	2023-09-01 09:30:54 +02:00
Fabiano Fidêncio	82cd14ba39	versions: Update alpine to its 3.18 version 3.15 will reach end of life in 2 months from now. Fixes: #7816 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-31 23:02:54 +02:00
GabyCT	d75c7b5f9c	Merge pull request #7813 from GabyCT/topic/genreport metrics: Add grabdata script for metrics report	2023-08-31 13:33:38 -06:00
Gabriela Cervantes	6668825752	metrics: Add grabdata script for metrics report This PR adds the grabdata script so it can be used for the metrics report for kata metrics. Fixes #7812 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-31 16:17:29 +00:00
James O. D. Hunt	c290eaed8c	kata-sys-util: protection: Update TDX checks Update the protection checking code to detect newer versions of Intel TDX (whose userland interface has now stabilised). > Note: that we don't need to retain the existing behaviour since: > > - We haven't yet landed the TDX feature (#6448). > - Systems wishing to use TDX will need to use the latest available > system components (such as firmware and host kernel). Also added an explicit TDX unit test. Fixes: #7384. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-08-31 16:15:15 +01:00
Fabiano Fidêncio	d7a996c686	gha: Update to checkout@v3 action At this point we should always be using the latest checkout action. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-31 16:02:31 +02:00
Jeremi Piotrowski	d7612440b8	Merge pull request #7789 from beraldoleal/tests/amd Fixes tests on AMD machines	2023-08-31 11:23:51 +02:00
Jeremi Piotrowski	c2ba29c15b	runtime: Fix data race in ioCopy IoCopy is a tricky function (I don't claim to fully understand its contract), but here is what I see: The goroutine that runs it spawns 3 goroutines - one for each stream to handle (stdin/stdout/stderr). The goroutine then waits for the stream goroutines to exit. The idea is that when the process exits and is closed, the stdout goroutine will be unblocked and close stdin - this should unblock the stdin goroutine. The stderr goroutine will exit at the same time as the stdout goroutine. The iocopy routine then closes all tty.io streams. The problem is that the stdout goroutine decrements the WaitGroup before closing the stdin stream, which causes the iocopy goroutine to race to close the streams. Move the wg.Done() of the stdout routine past the close so that this race becomes impossible. I can't guarantee that this doesn't affect some unspecified behavior. Fixes: #5031 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-08-31 10:17:38 +02:00
Manabu Sugimoto	211de08d9e	osbuilder: Remove chcon operation for guest SELinux Remove the `chcon` operation which adds `container_runtime_exec_t` label to the `kata-agent` binary because the container-selinux package including the `39f83cc74d` commit has been released officially. Ref. https://centos.pkgs.org/9-stream/centos-appstream-x86_64/container-selinux-2.221.0-1.el9.noarch.rpm.html The container-selinux package is installed in a guest rootfs when we create it with `SELinux = yes`, and `restorecon` sets `container_runtime_exec_t` to the `kata-agent`. Fixes: #7807 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-31 16:44:32 +09:00
GabyCT	b467f2ef68	Merge pull request #7772 from GabyCT/topic/fiolimit metrics: Enable FIO limits for kata metrics	2023-08-30 14:49:04 -06:00
Gabriela Cervantes	9f21fa9b39	metrics: Add report generator link to general documentation This PR adds the report generator link to general documentation. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-30 16:55:14 +00:00
Gabriela Cervantes	c0ed5ea0ad	metrics: Add README for kata metrics report This PR adds the README for kata metrics report. Fixes #7802 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-30 16:36:08 +00:00
Fabiano Fidêncio	aa2b51a831	Merge pull request #7783 from GabyCT/topic/makereport metrics: Add metrics report script	2023-08-30 17:11:39 +02:00
Gabriela Cervantes	a7b59a5bf9	metrics: Add limit for 90 percentile for qemu value This PR adds the limit for 90 percentile for qemu value for FIO kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-30 13:53:38 +00:00
Gabriela Cervantes	99db6568e9	metrics: Add limit for write 90 percentile value for clh This PR adds the limit for write 90 percentile value for clh for FIO metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-30 13:53:38 +00:00
Gabriela Cervantes	6e06392c55	metrics: Enable FIO limits for kata metrics This PR enables the FIO limits for kata metrics. Fixes #7771 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-30 13:53:38 +00:00
David Esparza	924d06a7f5	Merge pull request #7787 from GabyCT/topic/fixmemoryinsidelimit metrics: Fix memory inside limits for kata metrics	2023-08-30 07:45:17 -06:00
Peng Tao	2e4c874726	runtime/vc: runPrestartHooks should ignore GetHypervisorPid failure If we are running FC hypervisor, it is not started when prestart hooks are executed. So we should just ignore such error and just go ahead and run the hooks. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-08-30 03:06:11 +00:00
Peng Tao	21204caf20	runtime: fail early when starting docker container with FC FC does not support network device hotplug. Let's add a check to fail early when starting containers created by docker. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-08-30 02:52:01 +00:00
Peng Tao	32fd013716	runtime: run prestart hooks before starting VM for FC Add a new hypervisor capability to tell if it supports device hotplug. If not, we should run prestart hooks before starting new VMs as nerdctl is using the prestart hooks to set up netns. To make nerdctl + FC work, we need to run the prestart hooks before starting new VMs. Fixes: #6384 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-08-30 02:52:01 +00:00
Beraldo Leal	00e7ffd988	tests: check vmx only on Intel machines When running on AMD machines, those tests will fail because there is no vmx flag. Following other tests that check for cpuType, let's adapt them to restrict vmx only on Intel machines. Fixes #7788. Related #5066 Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-08-29 20:04:31 -04:00
Gabriela Cervantes	c8dd3c0737	metrics: Fix memory footprint qemu limit This PR fixes the memory footprint qemu limit for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 22:51:21 +00:00
Gabriela Cervantes	8877ec62fb	metrics: Fix memory inside limits for kata metrics This PR fixes the memory inside limit for clh for kata metrics due to the recent changes that we had in the script which impacted the performance measurement. Fixes #7786 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 21:38:18 +00:00
Beraldo Leal	80146f2078	tests: Fixes cpuType check on AMD machines cpuType is not initialized yet, so it gets 0 (Intel) by default, failing on AMD machines. Fixes #7785 Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-08-29 17:04:07 -04:00
Gabriela Cervantes	7e364716dd	metrics: Add test setup details to metrics report This PR adds test setup details to metrics report. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 17:56:53 +00:00
Gabriela Cervantes	17dc1b9760	metrics: Add boot lifecycle times to metrics report This PR adds the boot lifecycle times to metrics report. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 17:55:44 +00:00
Gabriela Cervantes	3b0d6538f2	metrics: Add memory inside container to metrics report This PR adds memory inside container to metrics report. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 17:53:17 +00:00
Gabriela Cervantes	79fbb9d243	metrics: Add scaling system footprint in metrics report This PR adds scaling system footprint in metrics report. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 17:51:27 +00:00
Gabriela Cervantes	8e6d4e6f3d	metrics: Add metrics reportgen This PR adds metrics reportgen for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 17:45:36 +00:00
Gabriela Cervantes	139ffd4f75	metrics: Add report file titles This PR adds report file titles for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 17:43:06 +00:00
GabyCT	8f2dae7b53	Merge pull request #7775 from dborquez/fix_memory_usage_parsing_results metrics: fix parsing issue on memory-usage test	2023-08-29 11:26:13 -06:00
Gabriela Cervantes	878d1a2e7d	metrics: Generate PNGs alongside the PDF report This PR generates the PNGs for the kata metrics PDF report. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 16:50:32 +00:00
Gabriela Cervantes	fce2487971	metrics: Add metrics report R files This PR adds the metrics report R files. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 16:45:22 +00:00
Gabriela Cervantes	08812074d1	metrics: Add report dockerfile This PR adds the report dockerfile for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 16:28:32 +00:00
Gabriela Cervantes	69781fc027	metrics: Add metrics report script This PR adds metrics report script for kata metrics. Fixes #7782 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 16:25:14 +00:00
Chao Wu	e4fb20c74a	Merge pull request #7585 from lifupan/main dragonball: vsock add fifo/pipe stream support for passed fd hybridSt…	2023-08-29 23:39:21 +08:00
Fabiano Fidêncio	50e51bcafe	Merge pull request #7185 from UnmeshDeodhar/add-cc-sev-test tests: Add confidential test	2023-08-29 15:32:25 +02:00
Fabiano Fidêncio	e286e842c1	tests: Expand confidential test to support TDX Let's expand the confidential test to also support TDX. The main difference on the test, though, is that we're not grepping for a string in the `dmesg` output, but rather relying on `cpuid` to detect a TDX guest. Fixes: #7184 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-29 14:10:47 +02:00
Unmesh Deodhar	e31f099be1	tests: Expand confidential test to support SNP Let's expand the confidential test to also support SNP. Fixes: #7184 Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>	2023-08-29 14:10:47 +02:00
Unmesh Deodhar	c3b9d4945e	tests: Add confidential test for SEV Add a test case for the launch of unencrypted confidential container, verifying that we are running inside a TEE. Right now the test only works with SEV, but it'll be expanded in the coming commits, as part of this very same series. Fixes: #7184 Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-29 14:10:34 +02:00
David Esparza	538c965c2b	metrics: fix parsing issue on memory-usage test This PR fixes an issue in the results parsing stage, by collecting just the n-results from the n-running containers, discarding irrelevant data. Fixes: #7774 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-08-28 23:39:46 -06:00
Fabiano Fidêncio	708b0a3052	Merge pull request #7768 from fidencio/topic/update-tdx-to-the-6.2-kernel-based-stack tdx: Update the components needed for using the 6.2 kernel stack	2023-08-28 19:27:15 +02:00
Fabiano Fidêncio	3818bf3311	local-build: Remove $HOME/.docker/buildx/activity/default The file can be removed between builds without causing any issue, and leaving it around has been causing us some headache due to: ``` ERROR: open /home/runner/.docker/buildx/activity/default: permission denied ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-28 13:41:36 +02:00
Fabiano Fidêncio	d1b54ede29	qemu: tdx: Workaround SMP issue with TDX 1.5 `...,sockets=1,cores=numvcpus,threads=1,...` must be used. Fixes: #7770 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-28 13:41:36 +02:00
Archana Shinde	1e34220c41	qemu: tdx: Adapt to the TDX 1.5 stack QEMU for TDX 1.5 makes use of private memory map/unmap. Make changes to govmm to support this. Support for private backing fd for memory is added as a knob to the qemu config. Userspace's map/unmap operations are done by fallocate() calls on the backing store fd. Reference: https://lore.kernel.org/linux-mm/20220519153713.819591-1-chao.p.peng@linux.intel.com/ Fixes: #7770 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-28 13:41:36 +02:00
Fabiano Fidêncio	8115a0522d	versions: tdx: Update Kernel to 6.2 + TDX This is the version that's been used and tested inside Intel, and it matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15. Fixes: #7770 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-28 13:11:34 +02:00
Fabiano Fidêncio	ec18180f34	versions: tdx: Update TDVF to the "edk2-stable202302" This is the version that's been used and tested inside Intel, and it matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15. Fixes: #7770 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-28 13:11:34 +02:00
Fabiano Fidêncio	9803b24286	versions: tdx: Update QEMU to v7.2 + TDX v1.10 This is the version that's been used and tested inside Intel, and it matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15. Fixes: #7770 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-28 13:11:27 +02:00
Fabiano Fidêncio	02a08c956b	Merge pull request #7754 from microsoft/danmihai1/pod-quota-deployment tests: delete k8s deployment at the test's end	2023-08-27 17:52:00 +02:00
Fabiano Fidêncio	98037ced52	Merge pull request #7755 from microsoft/danmihai1/unique-test-name tests: use unique test name	2023-08-27 17:27:40 +02:00
Zhongtao Hu	f0440a9cfe	Merge pull request #7742 from frezcirno/fix-log-forwarder-loop runtime-rs: check peer close in log_forwarder	2023-08-26 10:44:09 +08:00
Fabiano Fidêncio	16a610d788	Merge pull request #7758 from fidencio/topic/gha-avoid-fail-fast-till-everything-is-ultra-stable gha: Avoid "fail-fast" in tests that are known to be flaky	2023-08-25 16:49:26 +02:00
Jiang Liu	91db888d83	Merge pull request #7602 from jiangliu/agent-storage Refine storage device management for kata-agent	2023-08-25 22:20:18 +08:00
Zixuan Tan	dffc16e5b3	runtime-rs: check peer close in log_forwarder The log_forwarder task does not check if the peer has closed, causing a meaningless loop during the period of “kata vm exit”, when the peer closed, and “ShutdownContainer RPC received” that aborts the log forwarder. This patch fixes the problem. Fixes: #7741 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2023-08-25 19:00:07 +08:00
Jiang Liu	aaa5ab1264	agent: simplify storage device by removing StorageDeviceObject Simplify storage device implementation by removing StorageDeviceObject. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-25 17:23:16 +08:00
Fabiano Fidêncio	fb49d5d7ce	gha: Avoid "fail-fast" in tests that are known to be flaky Otherwise we'll have to re-run all the tests due to a flaky behaviour in one of the parts. Fixes: #7757 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-25 10:00:17 +02:00
Dan Mihai	183f51d6f6	tests: use unique test name k8s-pid-ns.bats was already using the test name from k8s-kill-all-process-in-container.bats - probably a copy/paste bug. Fixes: #7753 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-08-25 03:41:06 +00:00
Dan Mihai	6a974679f2	tests: delete k8s deployment at the test's end At the end of k8s-kill-all-process-in-container.bats, delete the deployment it created. Fixes: #7752 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-08-25 03:34:37 +00:00
David Esparza	686eb3878b	Merge pull request #7751 from GabyCT/topic/unusednhwc metrics: Remove unused variable in tensorflow nhwc script	2023-08-24 18:34:06 -06:00
Fabiano Fidêncio	f1d8e1f513	Merge pull request #7747 from fidencio/topic/kata-deploy-dont-try-to-remove-opt-kata kata-deploy: Don't try to remove /opt/kata	2023-08-24 18:56:52 +02:00
Gabriela Cervantes	32a778b6da	metrics: Remove unused variable in tensorflow nhwc script This PR removes unused variable in tensorflow nhwc script. Fixes #7750 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-24 15:54:27 +00:00
David Esparza	875a85ee14	Merge pull request #7736 from GabyCT/topic/tensorflowfp32 metrics: Add TensorFlow ResNet50 FP32 benchmark	2023-08-24 08:56:24 -06:00
Fabiano Fidêncio	d8f3ce6497	kata-deploy: Don't try to remove /opt/kata The directory is a host path mount and cannot be removed from within the container. What we actually want to remove is whatever is inside that directory. This may raise errors like: ``` rm: cannot remove '/opt/kata/': Device or resource busy ``` Fixes: #7746 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-24 13:57:36 +02:00
Jeremi Piotrowski	71c90b994a	Merge pull request #7745 from jepio/vfio-part-0 gha: vfio: Run on Ubuntu 23.04 runner	2023-08-24 12:15:19 +02:00
Greg Kurz	9991772b26	Merge pull request #7718 from littlejawa/fix_filemode_when_zero kata-agent: use default filemode for block device when it is set to 0	2023-08-24 11:40:28 +02:00
Jeremi Piotrowski	936e8091a7	gha: vfio: Run on Ubuntu 23.04 runner The vfio test requires nested-nested virtualization: L0 Azure host -> L1 Ubuntu VM -> L2 Fedora VM -> L3 Kata This hits a kernel bug on v5.15 but works quite nicely on the v6.2 kernel included in Ubuntu 23.04. We can switch back to Ubuntu 22.04 when they roll out v6.2. Fixes: #6555 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-08-24 10:10:02 +02:00
Jiang Liu	0e7248264d	agent: move storage device related code into dedicated files Move storage device related code into dedicated files. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 13:48:51 +08:00
Xuewei Niu	268e846558	runtime-rs: Fix volumes and rootfs cleanup issues There are several processes for container exit: - Non-detach mode: `Wait` request is sent by containerd, then `wait_process()` will be called eventually. - Detach mode: `Wait` request is not sent, the `wait_process()` won’t be called. - Killed by ctr: For example, a container runs `tail -f /dev/null`, and is killed by `sudo ctr t kill -a -s SIGTERM <CID>`. Kill request is sent, then `kill_process()` will be called. User executes `sudo ctr c rm <CID>`, `Delete` request is sent, then `delete_process()` will be called. - Exited on its own: For example, a container runs `sleep 1s`. The container’s state goes to `Stopped` after 1 second. User executes the delete command as below. Where do we do container cleanup things? - `wait_process()`: No, because it won’t be called in detach mode. - `delete_process()`: No, because it depends on when the user executes the delete command. - `run_io_wait()`: Yes. A container is considered exited once its IO has ended. And this is always called once a container is launched. Fixes: #7713 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-08-24 13:23:47 +08:00
Jiang Liu	8f49ee33b2	agent: refine storage related code a bit Refine storage related code by: - remove the STORAGE_HANDLER_LIST - define type alias - move code near to its caller Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 13:09:10 +08:00
Jiang Liu	60ca12ccb0	agent: switch to new storage subsystem Switch to new storage subsystem to create a StorageDevice for each storage object. Fixes: #7614 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 13:09:09 +08:00
Jiang Liu	fcbda0b419	kata-types: introduce StorageDevice and StorageHandlerManager Introduce StorageDevice and StorageHandlerManager, which will be used to refine storage device management for kata-agent. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 13:08:55 +08:00
Jiang Liu	b03b1f6134	agent: simplify the way to manage storage object Simplify the way to manage storage objects, and introduce StorageStateCommon structures for coming extensions. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 12:58:24 +08:00
Jiang Liu	8392c71bf2	sys-util: support more mount flags in parse_mount_options() Support more mount flags in parse_mount_options(). Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 12:17:39 +08:00
Jiang Liu	c00d8f3d48	agent: use create_mount_destination() from kata-sys-util Use create_mount_destination() from kata-sys-util crate to reduce redundant code. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 12:17:38 +08:00
Jiang Liu	5e867f0538	types: add more mount related constants Add more mount related constants. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 12:17:36 +08:00
Jiang Liu	880e6c9a76	agent: use function from kata-sys-util to reduce code Use function get_linux_mount_info() from kata-sys-util crate to share common code. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 12:17:34 +08:00
QuanweiZhou	a6921dd837	Merge pull request #7698 from jiangliu/virtual-volume kata-types: introduce KataVirtualVolume to support nydus, direct volume and image pull	2023-08-24 11:50:39 +08:00
Fabiano Fidêncio	7705c5962e	Merge pull request #7728 from ManaSugi/fix/typo-test-toml libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml	2023-08-23 23:55:41 +02:00
GabyCT	c1712e1930	Merge pull request #7737 from jepio/fix-local-build local-build: Remove GID before creating group	2023-08-23 12:26:39 -06:00
Jeremi Piotrowski	3b881fbc0e	local-build: Remove GID before creating group docker install now creates a group with gid 999 which happens to match what we need to get docker-in-docker to work. Remove the group first as we don't need it. Fixes: #7726 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-08-23 18:58:38 +02:00
David Esparza	ebce5d25a9	Merge pull request #7734 from fidencio/topic/kata-deploy-fix-removal kata-deploy: Avoid failing on content removal	2023-08-23 10:29:57 -06:00
Gabriela Cervantes	959ca49447	metrics: Add TensorFlow ResNet50 fp32 Dockerfile This PR adds the TensorFlow ResNet50 fp32 Dockerfile for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-23 16:24:58 +00:00
Gabriela Cervantes	4b7d72c4a8	metrics: Add TensorFlow ResNet50 FP32 benchmark This PR adds TensorFlow ResNet50 FP32 benchmark for kata metrics. Fixes #7735 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-23 16:21:09 +00:00
Fabiano Fidêncio	e7e4cc2182	Merge pull request #7716 from bergwolf/github/image-initrd-assets runtime: fix image and initrd assets handling	2023-08-23 18:02:15 +02:00
Fabiano Fidêncio	5cba38c175	kata-deploy: Avoid failing on content removal We can simply use `rm -f` all over the place and avoid the container returning any error. Fixes: #7733 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-23 16:49:26 +02:00
Peng Tao	18d42da21e	runtime/fc: fix image/initrd annotation handling Right now if we configure an image annotation and have a config file setting initrd, the initrd config would override the image annotation. Make sure annotations are preferred over config options in image and initrd path handling. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-08-23 03:47:28 +00:00
Peng Tao	9fda7059a5	runtime/clh: fix image/initrd annotation handling We should make sure annotations are preferred over config options in image and initrd path handling. Fixes: #7705 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-08-23 03:47:28 +00:00
Peng Tao	1a0092d631	runtime/qemu: fix image/initrd annotation handling Right now if we configure an image annotation and have a config file setting initrd, the initrd config would override the image annotation. Add a helper function ImageOrInitrdAssetPath to make sure annotations are preferred over config options in image and initrd path handling. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-08-23 03:47:27 +00:00
Manabu Sugimoto	22d8f335d6	libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml Change `pdisable_guest_seccomp` to `disable_guest_seccomp` Fixes: #7727 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-23 12:08:18 +09:00
GabyCT	b8990c0490	Merge pull request #7722 from GabyCT/topic/adddiskreadme metrics: Add disk link to README	2023-08-22 12:29:54 -06:00
GabyCT	514d3d42b8	Merge pull request #7712 from GabyCT/topic/fixfiopath metrics: Fix FIO path	2023-08-22 12:28:28 -06:00
Gabriela Cervantes	8afd158cef	metrics: Add disk link to README This PR adds disk link to README documentation for kata metrics. Fixes #7721 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-22 16:20:31 +00:00
Julien Ropé	40914b25d4	kata-agent: use default filemode for block device when it is set to 0 When the FileMode field for the device is unset (0), use a default value instead to allow the use of the device from the container. This behaviour is seen from cri-o typically. Note: this is what runc is doing, which is why regular containers don't have an issue. This change makes sure kata behaves the same as runc. Fixes: #7717 Signed-off-by: Julien Ropé <jrope@redhat.com>	2023-08-22 16:08:14 +02:00
Fabiano Fidêncio	8032797418	Merge pull request #7708 from microsoft/danmihai1/kata-deploy-log gha: capture additional kata-deploy output	2023-08-21 23:43:51 +02:00
David Esparza	d2c130ea69	Merge pull request #7710 from GabyCT/topic/fixpytorch1 metrics: Use function from metrics common in pytorch script	2023-08-21 15:31:24 -06:00
Gabriela Cervantes	eee2ee6eeb	metrics: Fix FIO path This PR fixes the FIO path for the FIO files. Fixes #7711 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-21 21:06:04 +00:00
David Esparza	9347051592	Merge pull request #7666 from dborquez/metrics_improve_fio_test metrics: Enable kata runtime in K8s for FIO test.	2023-08-21 13:51:57 -06:00
Gabriela Cervantes	39bc3488f5	metrics: Use function from metrics common in pytorch script This PR uses a common function in the pytorch script. Fixes #7709 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-21 16:12:35 +00:00
Dan Mihai	400eb88743	gha: capture additional kata-deploy output 10 lines can be insufficient for diagnostics. Fixes: #7707 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-08-21 15:58:57 +00:00
GabyCT	700759232f	Merge pull request #7690 from GabyCT/topic/fixpytorch metrics: Fix README for pytorch	2023-08-21 09:50:14 -06:00
Jiang Liu	6e038e66e4	Merge pull request #7680 from GabyCT/topic/removetime metrics: Remove unused variable in tensorflow mobilenet script	2023-08-21 23:39:07 +08:00
Jiang Liu	4aee3eade0	kata-types: implement serde methods for KataVirtualVolume Implement serialization/deserialization methods for KataVirtualVolume. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-21 16:46:56 +08:00
Jiang Liu	b875e39323	kata-types: validate KataVirtualVolume object Implement method validate() for KataVirtualVolume to validate message format. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-21 16:42:07 +08:00
Jiang Liu	fa2fdc1057	kata-types: implement two conversion helpers for KataVirtualVolume Enable conversions from NydusExtraOptions/DirectVolumeMountInfo to KataVirtualVolume. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-21 16:35:26 +08:00
Jiang Liu	6326af20e3	kata-types: introduce KataVirtualVolume Introduce structure KataVirtualVolume to encapsulate information for extra mount options and direct volumes, so we could build a common infrastructure to handle these cases. Fixes: #7699 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-21 16:19:47 +08:00
Gabriela Cervantes	c8b43f8b3e	metrics: Fix README for pytorch This PR fixes the pytorch reference in the README file. Fixes #7689 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-18 20:14:49 +00:00
Aurélien	fa34d61805	Merge pull request #7664 from microsoft/danmihai1/agent-init-policy rootfs: agent: Policy support with AGENT_INIT=yes	2023-08-18 10:51:55 -07:00
Fabiano Fidêncio	7e66d1f6b5	Merge pull request #7649 from fidencio/topic/k8s-tests-remove-kata-deploy-tests gha: k8s: kata-deploy: Move kata-deploy specific tests from integration/kubernetes to functional/kata-deploy	2023-08-18 07:47:26 +02:00
David Esparza	fb571f8be9	metrics: Enable kata runtime in K8s for FIO test. This PR configures the corresponding kata runtime in K8s based on the tested hypervisor. This PR also enables FIO metrics test in the kata metrics-ci. Fixes: #7665 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-08-17 17:11:27 -06:00
Dan Mihai	cb056f8cb3	rootfs: agent: Policy support with AGENT_INIT=yes When building with AGENT_POLICY=yes and AGENT_INIT=yes: 1. Include OPA and the Policy settings in rootfs. 2. Start OPA from the kata agent. Before these changes, building with both AGENT_POLICY=yes and AGENT_INIT=yes was unsupported. Starting OPA from systemd (when AGENT_INIT=no) was already supported. Fixes: #7615 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-08-17 22:37:58 +00:00
GabyCT	c358056a3f	Merge pull request #7685 from GabyCT/topic/changename metrics: Fix check results for tensorflow benchmark	2023-08-17 15:39:43 -06:00
Gabriela Cervantes	85c02828e1	metrics: Update tensorflow name in gha run script This PR updates the tensorflow name in the gha run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-17 20:17:48 +00:00
Gabriela Cervantes	e8a5119343	metrics: Fix check results for tensorflow benchmark This PR fixes the check results for tensorflow benchmark now that we change the name of the test. Fixes #7684 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-17 19:52:45 +00:00
Fabiano Fidêncio	2d896ad12f	gha: kata-deploy: Do the runtime class cleanup as part of the cleanup Instead of doing this as part of the test itself, let's ensure it's done before running the tests and during the tests cleanup. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-17 18:54:46 +02:00
Fabiano Fidêncio	4ffc2c86f3	gha: kata-deploy: Add the first kata-deploy test This test, at least for now, only checks whether the runtimeclasses have been properly created. This is just a migration from a test we had as part of the k8s suite. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-17 18:54:46 +02:00
GabyCT	4ba684e6e4	Merge pull request #7653 from GabyCT/topic/tensorflowfp32 metrics: Add Tensorflow ResNet50 int8 benchmark	2023-08-17 10:44:25 -06:00
Gabriela Cervantes	8616c050ae	metrics: Remove unused variable in tensorflow mobilenet script This PR removes unused variable in tensorflow mobilenet script. Fixes #7679 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-17 16:04:18 +00:00
Fabiano Fidêncio	285e616b5e	tests: common: Ensure test_type is used as part of the cluster's name By doing this we can make sure there won't be any clash on the cluster name created for either the k8s or the kata-deploy tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-17 14:22:16 +02:00
Fabiano Fidêncio	790bd3548d	tests: common: Don't fail if yq is not part of the cache This may happen on external runners. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-17 14:22:14 +02:00
Fabiano Fidêncio	ce6adecd0a	gha: kata-deploy: Add run-kata-deploy-tests.sh This will have the same function as run-k8s-tests.sh has, but for kata-deploy. Right now it doesn't have any tests, and the command to actually run the tests is commented out, but right now this is just a placeholder that will be populated sooner than later. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-17 09:49:03 +02:00
Fabiano Fidêncio	cfc29c11a3	gha: k8s: Stop running kata-deploy tests as part of the k8s suite In a follow-up series, we'll add a whole suite for the kata-deploy tests. With this in mind, let's already get rid of this one and avoid more kata-deploy tests to land here. Fixes: #7642 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-17 09:48:54 +02:00
Fabiano Fidêncio	e470a650e0	Merge pull request #7654 from sprt/ci-fixes kata-deploy: Properly create default runtime class	2023-08-17 09:43:34 +02:00
Wedson Almeida Filho	962378606e	Merge pull request #7627 from wedsonaf/error-conv agent: simplify error handling	2023-08-16 21:02:38 -03:00
Aurélien Bombo	f4dd152863	tests: k8s: Call ensure_yq() in setup.sh It wasn't the `common.bash` import in `run_kubernetes_tests.sh` causing the yq error so let's try this instead. Reference: https://github.com/kata-containers/kata-containers/actions/runs/5674941359/job/15379797568#step:10:341 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-08-16 14:13:56 -07:00
GabyCT	3d0cfc88c9	Merge pull request #7662 from GabyCT/topic/fixhelptensorflow metrics: Fix MobileNet help me description	2023-08-16 14:13:39 -06:00
Aurélien Bombo	339569b69c	kata-deploy: Properly create default runtime class The default `kata` runtime class would get created with the `kata` handler instead of `kata-$KATA_HYPERVISOR`. This made Kata use the wrong hypervisor and broke CI. Fixes: #7663 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-08-16 11:04:44 -07:00
Gabriela Cervantes	2a491e9b1f	metrics: Fix MobileNet help me description This PR fixes MobileNet help me description in the tensorflow script. Fixes #7661 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-16 15:25:39 +00:00
Fabiano Fidêncio	606e419fac	Merge pull request #7660 from fidencio/topic/add-kata-deploy-tests-as-part-of-the-ci gha: ci: Start running kata-deploy tests	2023-08-16 16:44:08 +02:00
Fabiano Fidêncio	d19a75e80c	gha: ci: Start running kata-deploy tests Let's add the tests as part of the ci.yaml, so they can be triggered as part of each PR. For this PR those tests won't be triggered, courtesy of the `pull_request_target` event we rely on. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-16 16:08:05 +02:00
Fabiano Fidêncio	4adcf2192e	Merge pull request #7651 from ManaSugi/runk/containerd-test runk: Modify kill command's error message for containerd tests	2023-08-16 15:37:48 +02:00
Zhongtao Hu	5c8a61a4c8	Merge pull request #7558 from openanolis/fix/driver_option runtime-rs: add driver option	2023-08-16 13:56:29 +08:00
Zhongtao Hu	d90f7ac689	runtime-rs: add unit test for block driver add unit test for block driver Fixes:#7539 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-08-16 11:45:27 +08:00
Zhongtao Hu	e44919f0da	runtime-rs: add load_test_config for unit test add load_test_config for unit test Fixes:#7539 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-08-16 11:32:56 +08:00
Zhongtao Hu	7f48a69379	runtime-rs: add driver option add driver option when handling linux devices Fixes: #7539 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-08-16 11:32:49 +08:00
Gabriela Cervantes	bade6a5c3b	docs: Fix TensorFlow word across the document This PR fixes the TensorFlow word across the document to have uniformity across all the document. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-15 20:13:05 +00:00
Fabiano Fidêncio	0bc48eab60	Merge pull request #7640 from fidencio/topic/gha-cri-containerd-enable-tests gha: cri-containerd: Enable tests	2023-08-15 21:18:28 +02:00
Gabriela Cervantes	1a1b207760	docs: Add Tensorflow Resnet50 documentation This PR adds the Tensorflow Resnet50 documentation. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-15 17:46:44 +00:00
Gabriela Cervantes	24baededc0	metrics: Add Dockerfile for ResNet50 int8 This PR adds the dockerfile for ResNet50 int8 benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-15 17:38:26 +00:00
Gabriela Cervantes	6d971ba8df	metrics: Add Tensorflow ResNet50 int8 benchmark This PR adds the Tensorflow ResNet50 int8 script for kata metrics. Fixes #7652 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-15 17:30:22 +00:00
Manabu Sugimoto	25d151bd1b	runk: Modify kill command's error message for containerd tests The error message when the kill command is executed with the container's state == Stopped should be "container not running" because the containerd tests expect that OCI runtimes return the error message and compare it. If the error message is different from the expected one, the tests fail. Fixes: #7650 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-16 00:39:50 +09:00
GabyCT	0bbabeaaf8	Merge pull request #7644 from GabyCT/topic/renametensorflow metrics: Rename tensorflow scripts	2023-08-15 09:23:24 -06:00
Fabiano Fidêncio	46d25d908d	Merge pull request #7643 from fidencio/topic/add-functional-kata-deploy-tests gha: tests: Add kata-deploy functional tests -- Part 1	2023-08-15 15:23:48 +02:00
Fabiano Fidêncio	b3592ab25c	gha: cri-containerd: Enable tests As the cri-containerd tests have been fully migrated to GHA, let's make sure we get them running. Fixes: #6543 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-15 14:32:42 +02:00
Fabiano Fidêncio	84dd02e0f9	gha: cri-containerd: Add timeout to the crictl calls on testContainerStop As part of the runners, we're hitting a timeout that I cannot reproduce, at all, when allocating the same instance and running the tests manually. The default timeout to connect to the server is 2s when using `crictl`. Let's increase this to 20s. It's fairly important to mention that in the first tests I used a timeout of 10s, and that helped but we still hit issues every now and then. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-15 14:31:54 +02:00
Fabiano Fidêncio	b29782984a	gha: cri-containerd: Show pod before deleting it It'll help us to debug failures with the pod stop / pod delete. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-15 14:31:54 +02:00
Fabiano Fidêncio	ae0930824a	gha: cri-containerd: Print kata logs in case of error We need this to fully understand what are the issues we're facing. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-15 14:31:54 +02:00
Fabiano Fidêncio	6c8b2ffa60	gha: cri-containerd: Group containerd logs This improves readability in case of failures by a lot. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-15 14:31:54 +02:00
Fabiano Fidêncio	9e898701f5	gha: cri-containerd: Ensure RUNTIME takes KATA_HYPERVISOR into account Short commit log says it all. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-15 14:31:54 +02:00
Wedson Almeida Filho	76dac8f22c	agent: simplify error handling We extend the `Result` and `Option` types with associated types that allow converting a `Result<T, E>` and `Option<T>` into `ttrpc::Result<T>`. This allows the elimination of many `match` statements in favor of calling the map function plus the `?` operator. This transformation simplifies the code. Fixes: #7624 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-15 06:55:27 -03:00
Fabiano Fidêncio	e107d1d94e	Merge pull request #7574 from microsoft/danmihai1/policy agent: runtime: add Agent Policy feature	2023-08-15 11:29:13 +02:00
Bin Liu	ea81eb6c2e	Merge pull request #7169 from chethanah/runk/support-no-pid-ns runk: Support without pid ns	2023-08-15 13:00:40 +08:00
Gabriela Cervantes	18a7fd8e4e	metrics: Rename tensorflow scripts This PR renames the tensorflow scripts to include the data format that is being used as we will have multiple tests with different data and model formats for tensorflow so this will help us to distinguish them. Fixes #7645 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-14 20:40:35 +00:00
GabyCT	a740c80251	Merge pull request #7626 from GabyCT/topic/cassandrak metrics: Add Cassandra Kubernetes benchmark for kata metrics	2023-08-14 14:22:52 -06:00
GabyCT	4e5e39e8b3	Merge pull request #7618 from GabyCT/topic/addfunctionscommon metrics: Add common functions to the common script	2023-08-14 14:22:30 -06:00
GabyCT	a19d471c01	Merge pull request #7629 from dborquez/metrics_improve_stopping_kata_components metrics: fix the loop used to stop kata components	2023-08-14 14:22:06 -06:00
Fabiano Fidêncio	e55fa93db9	tests: kata-deploy: Add placeholder for kata-deploy-tests-on-tdx This will not be tested as part of the PR, thanks to the `pull_request_target` event, but we want it to be added so we can build on top of that in an upcoming series. Fixes: #7642 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-14 21:38:00 +02:00
Fabiano Fidêncio	d9ee17aaec	tests: kata-deploy: Add placeholder for kata-deploy-tests-on-aks This will not be tested as part of the PR, thanks to the `pull_request_target` event, but we want it to be added so we can build on top of that in an upcoming series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-14 21:37:52 +02:00
Chelsea Mafrica	22465d22f0	Merge pull request #7638 from ManaSugi/fix/virtcontainers-doc docs: Remove installation step in virtcontainers doc	2023-08-14 10:21:57 -07:00
Dan Mihai	ab829d1038	agent: runtime: add the Agent Policy feature Fixes: #7573 To enable this feature, build your rootfs using AGENT_POLICY=yes. The default is AGENT_POLICY=no. Building rootfs using AGENT_POLICY=yes has the following effects: 1. The kata-opa service gets included in the Guest image. 2. The agent gets built using AGENT_POLICY=yes. After this patch, the shim calls SetPolicy if and only if a Policy annotation is attached to the sandbox/pod. When creating a sandbox/pod that doesn't have an attached Policy annotation: 1. If the agent was built using AGENT_POLICY=yes, the new sandbox uses the default agent settings, that might include a default Policy too. 2. If the agent was built using AGENT_POLICY=no, the new sandbox is executed the same way as before this patch. Any SetPolicy calls from the shim to the agent fail if the agent was built using AGENT_POLICY=no. If the agent was built using AGENT_POLICY=yes: 1. The agent reads the contents of a default policy file during sandbox start-up. 2. The agent then connects to the OPA service on localhost and sends the default policy to OPA. 3. If the shim calls SetPolicy: a. The agent checks if SetPolicy is allowed by the current policy (the current policy is typically the default policy mentioned above). b. If SetPolicy is allowed, the agent deletes the current policy from OPA and replaces it with the new policy it received from the shim. A typical new policy from the shim doesn't allow any future SetPolicy calls. 4. For every agent rpc API call, the agent asks OPA if that call should be allowed. OPA allows or not a call based on the current policy, the name of the agent API, and the API call's inputs. The agent rejects any calls that are rejected by OPA. When building using AGENT_POLICY_DEBUG=yes, additional Policy logging gets enabled in the agent. In particular, information about the inputs for agent rpc API calls is logged in /tmp/policy.txt, on the Guest VM. 
These inputs can be useful for investigating API calls that might have been rejected by the Policy. Examples: 1. Load a failing policy file test1.rego on a different machine: opa run --server --addr 127.0.0.1:8181 test1.rego 2. Collect the API inputs from Guest's /tmp/policy.txt and test on the machine where the failing policy has been loaded: curl -X POST http://localhost:8181/v1/data/agent_policy/CreateContainerRequest \ --data-binary @test1-inputs.json Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-08-14 17:07:35 +00:00
Fabiano Fidêncio	831e73ff91	tests: kata-deploy: Add functional/kata-deploy/gha-run.sh placeholder Right now this file does nothing, as it's not even called by any GHA. However, it'll be populated later on as part of a different series, where we'll have kata-deploy specific tests running here. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-14 17:46:10 +02:00
Fabiano Fidêncio	af1b46bbf2	tests: Add gha-run-k8s-common.sh Let's split a good portion of `tests/integration/kubernetes/gha-run.sh` out, and put it in a place where it can be used by the soon-to-come kata-deploy specific tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-14 17:45:58 +02:00
Jeremi Piotrowski	a57e7ffe14	Merge pull request #7211 from stevenhorsman/propogate-secrets Propagate secrets, config maps etc into guest if sharedFS not available	2023-08-14 11:24:47 +02:00
Manabu Sugimoto	416445e7eb	docs: Remove installation step in virtcontainers doc Remove the installation step in the virtcontainers doc because the virtcontainers install/uninstall targets have been removed by `86723b51ae` and they are not used anymore. Fixes: #7637 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-14 15:15:24 +09:00
Fabiano Fidêncio	b975c27793	Merge pull request #7547 from stevefan1999-personal/patch-k0s kata-deploy: Preliminary k0s support	2023-08-12 14:28:13 +02:00
Fabiano Fidêncio	6ed57d1e9a	Merge pull request #7447 from fidencio/topic/gha-move-static-jenkins-to-azure-instances gha: static-checks: Move to the Azure instances	2023-08-12 13:31:54 +02:00
Steve Fan	72cbcf040b	kata-deploy: Add k0s support Add k0s support to kata-deploy, in the very same way kata-containers already supports k3s, and rke2. k0s support requires v1.27.1, which is noted as part of the kata-deploy documentation, as it's the way to use dynamic configuration on containerd CRI runtimes. This support will only be part of the `main` branch, as it's not a bug fix that can be backported to the `stable-3.2` branch, and this is also noted as part of the documentation. Fixes: #7548 Signed-off-by: Steve Fan <29133953+stevefan1999-personal@users.noreply.github.com>	2023-08-11 21:17:23 +02:00
David Esparza	767434d50a	metrics: fix the loop used to stop kata components #7629 This PR fixed the loop that stops the kata-shim and the hypervisors used in metrics checks. Fixes: #7628 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-08-11 12:32:41 -06:00
Gabriela Cervantes	5d0f0d43c7	metrics: Add cassandra statefulset yaml This PR adds cassandra statefulset yaml for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-11 17:22:39 +00:00
Gabriela Cervantes	c1dcc1396f	metrics: Add cassandra service yaml This PR adds the cassandra service yaml for the benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-11 17:22:36 +00:00
Gabriela Cervantes	2297a0d1c5	metrics: Add block loop pvc yaml for cassandra This PR adds block loop pvc yaml for cassandra test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-11 17:22:33 +00:00
Gabriela Cervantes	e3d511946f	metrics: Add block loop pv yaml for cassandra test This PR adds the block loop pv yaml for cassandra test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-11 17:22:29 +00:00
Gabriela Cervantes	9890271594	metrics: Add block loop pvc for cassandra test This PR adds the block loop pvc for cassandra test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-11 17:22:19 +00:00
Gabriela Cervantes	349b89969a	metrics: Add Cassandra Kubernetes benchmark for kata metrics This PR adds Cassandra Kubernetes benchmark for kata metrics tests. Fixes #7625 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-11 17:21:48 +00:00
Fabiano Fidêncio	c52d090522	gha: static-checks: Move to the Azure instances The GHA runners are not exactly powerful, which makes the static-checks take way too long (almost an hour). Let's give a try and move those to the same size of Azure instances used as part of our CI, and probably have this time reduced. Fixes: #7446 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-11 18:47:47 +02:00
stevenhorsman	8815ed0665	runtime: Remove config warnings Remove configuration file shared_fs = none warnings now that there is a solution to updating configMaps, secrets etc Fixes: #7210 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-08-11 16:31:08 +01:00
Yohei Ueda	afe1a6ac5a	agent: support copying of directories and symlinks This patch allows copying of directories and symlinks when static file copying is used between host and guest. This change is necessary to support recursive file copying between shim and agent. Signed-off-by: Yohei Ueda <yohei@jp.ibm.com> (cherry picked from commit `de232b8030`)	2023-08-11 16:31:08 +01:00
Pradipta Banerjee	ab13ef87ee	runtime: propagate configmap/secrets etc changes for remote-hyp For remote hypervisor, the configmap, secrets, downward-api or project-volumes are copied from host to guest. This patch watches for changes to the host files and copies the changes to the guest. Note that configmap updates take significantly longer than updates via downward-api. This is similar across runc and Kata runtimes. Fixes: #7210 Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com> Signed-off-by: Julien Ropé <jrope@redhat.com> (cherry picked from commit `3081cd5f8e`) (cherry picked from commit 68ec673bc4d9cd853eee51b21a0e91fcec149aad)	2023-08-11 16:31:08 +01:00
Yohei Ueda	c074ec4df1	runtime: Copy shared files recursively This patch enables recursive file copying when filesystem sharing is not used. Signed-off-by: Yohei Ueda <yohei@jp.ibm.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> (cherry picked from commit `5422a056f2`) (cherry picked from commit 16055ce040bbd724be2916bc518d89b69c9e0ca5) Fixes: #7210	2023-08-11 16:16:52 +01:00
Peng Tao	a39fd6c066	Merge pull request #7611 from ManaSugi/fix/fc-version versions: Update firecracker version to 1.4.0	2023-08-11 16:43:37 +08:00
Chao Wu	7031b5db07	Merge pull request #7535 from ManaSugi/fix/allow-redundant-clone agent: Allow clippy::redundant_clone in the unit tests	2023-08-11 14:17:56 +08:00
Gabriela Cervantes	fdcd52ff78	metrics: Add check containers are running in tensorflow mobilenet This PR adds check containers are running in tensorflow mobilenet that is being defined in common script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 20:17:20 +00:00
Gabriela Cervantes	36337ee146	metrics: Add check containers are up in tensorflow script This PR adds the check containers are up function from common in tensorflow script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 20:15:18 +00:00
Gabriela Cervantes	f700f9b0ba	metrics: Remove unused variable in tensorflow script This PR removes an unused variable in tensorflow script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 20:13:37 +00:00
Gabriela Cervantes	833cf7a684	metrics: Add check containers are running function This PR adds the check containers are running function to the common metrics script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 20:12:22 +00:00
Gabriela Cervantes	918c783084	metrics: Add check containers are up in tensorflow mobilenet script This PR adds the check containers are up in the common script in the tensorflow mobilenet script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 20:06:40 +00:00
Gabriela Cervantes	9d57a1fab4	metrics: Use check containers are up in tensorflow script This PR uses the check containers are up from the common script in the tensorflow script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 17:42:09 +00:00
Gabriela Cervantes	1c84680d8c	metrics: Add check containers are up in common script This PR adds check containers are up in common script for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 17:39:24 +00:00
Gabriela Cervantes	d3e57cf454	metrics: Use collect_results function in tensorflow mobilenet test This PR uses the collect results function defined in common for the tensorflow mobilenet test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 17:34:30 +00:00
Gabriela Cervantes	286de046af	metrics: Remove collect results function definition This PR removes the collect results function from tensorflow script as it is going to be referenced in the common metrics script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 17:31:23 +00:00
Gabriela Cervantes	9879709aae	metrics: Add common functions to the common script This PR adds the collect results function to the common metrics script. Fixes #7617 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 17:27:11 +00:00
Fabiano Fidêncio	a89c9cd620	Merge pull request #7557 from wedsonaf/no-new-vecs agent: avoid creating new `Vec` instances when easily avoidable	2023-08-10 18:43:46 +02:00
Manabu Sugimoto	4746fa3daa	docs: Specify supported Firecracker version using `versions.yaml` Specify the supported version of Firecracker using our `versions.yaml` to improve the maintainability of the documentation. Fixes: #7610 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-10 16:49:45 +09:00
Manabu Sugimoto	cc922be5ec	versions: Update firecracker version to 1.4.0 This patch upgrades Firecracker version from v1.1.0 to v1.4.0. * Generate swagger models for v1.4.0 (from `firecracker.yaml`) - The version of go-swagger used is v0.30.0 * The firecracker v1.4.0 includes the following changes. - Added * Added support for custom CPU templates allowing users to adjust vCPU features exposed to the guest via CPUID, MSRs and ARM registers. * Introduced V1N1 static CPU template for ARM to represent Neoverse V1 CPU as Neoverse N1. * Added support for the virtio-rng entropy device. The device is optional. A single device can be enabled per VM using the /entropy endpoint. * Added a cpu-template-helper tool for assisting with creating and managing custom CPU templates. - Changed * Set FDP_EXCPTN_ONLY bit (CPUID.7h.0:EBX[6]) and ZERO_FCS_FDS bit (CPUID.7h.0:EBX[13]) in Intel's CPUID normalization process. - Fixed * Fixed feature flags in T2S CPU template on Intel Ice Lake. * Fixed CPUID leaf 0xb to be exposed to guests running on AMD host. * Fixed a performance regression in the jailer logic for closing open file descriptors. * A race condition that has been identified between the API thread and the VMM thread due to a misconfiguration of the api_event_fd. * Fixed CPUID leaf 0x1 to disable perfmon and debug feature on x86 host. * Fixed passing through cache information from host in CPUID leaf 0x80000006. * Fixed the T2S CPU template to set the RRSBA bit of the IA32_ARCH_CAPABILITIES MSR to 1 in accordance with an Intel microcode update. * Fixed the T2CL CPU template to pass through the RSBA and RRSBA bits of the IA32_ARCH_CAPABILITIES MSR from the host in accordance with an Intel microcode update. * Fixed passing through cache information from host in CPUID leaf 0x80000005. * Fixed the T2A CPU template to disable SVM (nested virtualization). 
* Fixed the T2A CPU template to set EferLmsleUnsupported bit (CPUID.80000008h:EBX[20]), which indicates that EFER[LMSLE] is not supported. Fixes: #7610 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-10 16:48:13 +09:00
Fupan Li	39e67b06e9	dragonball: vsock add fifo/pipe stream support for passed fd hybridStream Since the fd passed through the unix socket could be any stream fd, such as a pipe/fifo fd or any other socket fd, we should deal with it as a normal hybrid stream instead of a unix stream. Fixes: #7584 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2023-08-10 11:07:10 +08:00
David Esparza	7bf994827d	Merge pull request #7609 from dborquez/tensorflow_check_completion metrics: compute tensorflow statistics	2023-08-09 18:47:47 -06:00
David Esparza	dcdb3b067f	Merge pull request #7606 from GabyCT/topic/nginx metrics: Add network nginx benchmark	2023-08-09 16:14:13 -06:00
David Esparza	2defdcc598	Merge pull request #7579 from dborquez/simplify_gha_metrics_workflow metrics: install kata once and run multiple checks	2023-08-09 14:45:09 -06:00
David Esparza	473b0d3a31	metrics: compute tensorflow statistics This PR computes average results for TF bench. Additionally, it improves the data parsing from all running containers. Fixes: #7603 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-08-09 14:42:30 -06:00
Fabiano Fidêncio	0a8208c670	Merge pull request #7608 from fidencio/topic/create-image-to-be-used-by-the-confidential-tests-follow-up-3 ci: unencrypted-image: Fix build context	2023-08-09 21:00:46 +02:00
Fabiano Fidêncio	03d1fa67b1	ci: unencrypted-image: Fix build context The build context should be the folder where the Dockerfile is present, otherwise the files copied into the image won't be found. Fixes: #7595 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 20:32:36 +02:00
Fabiano Fidêncio	eb463b38ec	ci: unencrypted-image: Don't fail to build on s390x Let's make sure that we don't fail in case we're building non x86_64. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 20:32:36 +02:00
Fabiano Fidêncio	ebc86091d1	Merge pull request #7607 from fidencio/topic/create-image-to-be-used-by-the-confidential-tests-follow-up-2 ci: create-confidential-image: Add dependent actions	2023-08-09 19:53:49 +02:00
Fabiano Fidêncio	a2d731ad26	ci: create-confidential-image: Add dependent actions Following the example on https://github.com/docker/build-push-action, it's clear that the actions to "Set up QEMU" and "Set up Docker Buildx" are missing. Let's add them, and also take the advantage to bump the build-push-action to its v4, which, by the way, had a typo on its name (build-and-push-action does NOT exist, build-push-action does). Fixes: #7595 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 18:36:51 +02:00
Gabriela Cervantes	d1a6296221	metrics: Add nginx documentation to network README This PR adds nginx documentation to network README for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-09 16:17:46 +00:00
Gabriela Cervantes	498f7c0549	metrics: Add nginx kubernetes yaml This PR adds the nginx kubernetes yaml. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-09 16:14:04 +00:00
Gabriela Cervantes	f8a5255cf7	metrics: Add network nginx benchmark This PR adds the network nginx benchmark for kata metrics. Fixes #7605 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-09 16:12:21 +00:00
Fabiano Fidêncio	86f705d98b	Merge pull request #7604 from fidencio/topic/create-image-to-be-used-by-the-confidential-tests-follow-up-1 Follow up fixes for https://github.com/kata-containers/kata-containers/pull/7596	2023-08-09 18:05:46 +02:00
Fabiano Fidêncio	43fe5d1b90	ci: k8s: tees: Ensure PR_NUMBER is exported Right now this is not being used, but it will be, as the images generated for the confidential tests have it as part of their tag. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 17:45:42 +02:00
Fabiano Fidêncio	54f6a78500	ci: {{ pr-number }} should be {{ inputs.pr-number }} One of the joys to rely on the `pull_request_target` is to only be able to catch those after those are merged. Fixes: #7595 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 17:41:07 +02:00
Fabiano Fidêncio	5cdf981a2b	Merge pull request #7596 from fidencio/topic/create-image-to-be-used-by-the-confidential-tests tests: Create image that will be used in the unencrypted confidential tests	2023-08-09 17:06:07 +02:00
Fabiano Fidêncio	c932369f42	Merge pull request #7492 from fidencio/topic/adapt-tests-to-the-new-kata-deploy-env-vars kata-deploy: Ensure we cover SHIMS / DEFAULT_SHIM as part of our tests	2023-08-09 12:55:03 +02:00
Fabiano Fidêncio	034d7aab87	tests: k8s: Ensure the runtime classes are properly created With these 2 simple checks we can ensure that we do not regress on the behaviour of allowing the runtime classes / default runtime class to be created by the kata-deploy payload. Fixes: #7491 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 11:46:04 +02:00
Fabiano Fidêncio	fac8ccf5cd	ci: Add build-and-publish-tee-confidential-unencrypted-image This will be done before running TEE tests, and it's a hard dependency for them. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 11:36:10 +02:00
Fabiano Fidêncio	ab5f603ffa	ci: k8s: Add the image used for unencrypted confidential tests Let's add here the image we'll be using for unencrypted confidential tests. Later on, we'll make sure to build and use this image as part of our CI. The image can easily be built as a multi-arch image, and has `cpuid` installed in case of `x86_64` build, so it can be used to detect whether we're running on a TEE guest without having to rely on `dmesg | grep ...`. Fixes: #7595 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 11:33:18 +02:00
Fabiano Fidêncio	36d53dd2af	Merge pull request #7598 from UnmeshDeodhar/upgrade-bats-version tests: upgrade bats version	2023-08-09 11:18:56 +02:00
Fabiano Fidêncio	1e8fe131bd	k8s: tests: Take advantage of `SHIMS` and `DEFAULT_SHIM` env vars We don't have to do any sed to replace the runtimeclass being used from the moment we start taking advantage of the `DEFAULT_SHIM` environment variable exposed in the previous commits. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 11:15:34 +02:00
Wedson Almeida Filho	729b2dd611	agent: avoid creating new `Vec` instances when easily avoidable There are many places where the code currently creates new `Vec` instances when it's not really needed. The result is a perf hit because it allocates memory, copies all elements, then frees the memory; in some cases, copying elements also involves extra allocations (e.g., when elements are strings, or structs containing strings). This patch addresses a number of these cases. Fixes: #7203 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-09 02:38:36 -03:00
Jiang Liu	311671abb5	Merge pull request #7552 from jiangliu/agent-r1 Fix minor bugs and improve coding style of agent rpc/sandbox/mount	2023-08-09 13:19:02 +08:00
Unmesh Deodhar	aeaec9dae9	tests: upgrade bats version Instead of using the package manager to install bats, build it from source. This gives us an updated version of bats which supports functions such as setup_file and teardown_file. We can use these functions in our current tests. Fixes: #7597 Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>	2023-08-08 18:16:39 -05:00
David Esparza	e664969862	metrics: install kata once and run multiple checks This PR changes the metrics workflow in order to just install kata once, and run the checks for multiple hypervisor variations. In this way we save time avoiding installing kata for each hypervisor to be tested. Fixes: #7578 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-08-08 10:25:13 -06:00
Jiang Liu	baabfa9f1f	agent: refine implementation of mount related code Refine implementation of mount by: - log message with `path.display()` instead of `{:?}` - add prefix "_" to unused variables - pass by reference instead of by value to avoid creating redundant array - exactly matching prefix "fsgid=" instead of "fsgid" - avoid redundant clone() operations Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-08 18:03:03 +08:00
Jiang Liu	98ba211a34	agent: fix a bug in update_ephemeral_mounts() There's a bug in function update_ephemeral_mounts() which only handles the first storage object and ignores all other storage objects. Fixes: #7551 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-08 18:03:02 +08:00
Jiang Liu	5333618d70	agent: make add_storage() take &[Storage] instead of Vec<Storage> Simplify add_storage() by taking &[Storage] instead of Vec<Storage>. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-08 18:03:01 +08:00
Jiang Liu	37f34781d1	agent: simplify function online_cpu_memory() Simplify function online_cpu_memory() by only calling update_cpuset_path() for containers with cpuset configured. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-08 18:03:00 +08:00
Jiang Liu	d3c5422379	agent: refine style of code related to sandbox Refine style of code related to sandbox by: - remove unnecessary comments asking the caller to take the lock, as we already take `&mut self`. - change "count < 1 " to "count == 0", as `count` is of type u32. - make remove_sandbox_storage() take `&mut self` instead of `&self`. - group related functions together - avoid searching the map twice in function find_process() - avoid unwrap() in function run_oom_event_monitor() - avoid unwrap() in online_resources() Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-08 18:02:59 +08:00
Jiang Liu	71a9f67781	agent: avoid unwrap() in function do_remove_container() Avoid unwrap() in function do_remove_container(), and also make the implementation symmetric for both timeout and non-timeout cases. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-08 18:02:58 +08:00
Jiang Liu	84badd89d7	agent: avoid clone objects when possible Optimize agent rpc implementation by: - avoid cloning objects when possible - avoid unwrap() when possible - explicitly drop objects to ensure order Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-08 18:02:56 +08:00
Chao Wu	b098960442	Merge pull request #7581 from justxuewei/bump-versions deps: Bump dependent crate versions	2023-08-08 15:16:57 +08:00
Chao Wu	24bf637835	Merge pull request #7500 from pmores/fix-queue-num-in-dragonball-share-fs fix number of queues handling in dragonball share fs device	2023-08-08 12:07:25 +08:00
Xuewei Niu	b23c5ed155	deps: Bump dependent crate versions This pull request is mainly for updating vm-memory and vmm-sys-util. The affected crates include: - vm-memory: from 0.9.0 to 0.10.0 - vmm-sys-util: from 0.10.0 to 0.11.0 - virtio-queue: from 0.6.0 to 0.7.0 - fuse-backend-rs: from 0.10.4 to 0.10.5 - linux-loader: from 0.6.0 to 0.8.0 - nydus-api: from 0.3.0 to 0.3.1 - nydus-rafs: from 0.3.1 to 0.3.2 - nydus-storage: from 0.6.3 to 0.6.4 Fixes: #0000 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-08-08 11:54:09 +08:00
Fupan Li	5a20d8dcaf	Merge pull request #7383 from justxuewei/dan runtime-rs: Introduce directly attachable network	2023-08-08 09:54:28 +08:00
Chelsea Mafrica	553fd79ea9	Merge pull request #7572 from GabyCT/topic/resnet50fp32 metrics: General improvements to mobilenet tensorflow test	2023-08-07 13:33:28 -07:00
GabyCT	194120b679	Merge pull request #7540 from GabyCT/topic/enableiperf gha: Add iperf network metrics	2023-08-07 13:40:02 -06:00
Gabriela Cervantes	863283716d	metrics: General improvements to mobilenet tensorflow test This PR renames the mobilenet tensorflow test to have a more specific tensorflow name mainly because tensorflow has different configurations and we will add more tensorflow tests so we want to distinguish each tensorflow test. Fixes #7571 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-07 16:50:00 +00:00
Gabriela Cervantes	3c319d8d4c	metrics: Add iperf to gha run script This PR adds iperf to gha run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-07 16:20:00 +00:00
Gabriela Cervantes	5b5caf8908	gha: Add iperf network metrics This PR adds the iperf network metrics to the github actions for kata metrics. Fixes #7535 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-07 16:20:00 +00:00
Chelsea Mafrica	4559caf619	Merge pull request #7467 from ManaSugi/doc/use-k8-control-plane docs: Use control-plane term instead of master	2023-08-06 23:40:51 -07:00
Fabiano Fidêncio	b365bef570	Merge pull request #7191 from wedsonaf/avoid-clones agent: avoid unnecessary calls to `Arc::clone`	2023-08-06 15:34:07 +02:00
GabyCT	7144acb2a5	Merge pull request #7527 from GabyCT/topic/latency metrics: Add network latency test	2023-08-04 15:54:07 -06:00
Gabriela Cervantes	66db5b5350	metrics: Add latency test to network README This PR adds latency test to network README for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-04 20:27:27 +00:00
Wedson Almeida Filho	c36572418f	agent: avoid unnecessary calls to `Arc::clone` These calls cause two extra atomic instructions each time they're used, one to increment and another one to decrement the refcount. Since we don't need them because the referred value is guaranteed to outlive the function, remove the calls. Fixes: #7190 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-03 20:53:05 -03:00
Fabiano Fidêncio	8c03deac3a	Merge pull request #7106 from wedsonaf/image-pulling Image pulling on the host	2023-08-04 01:08:42 +02:00
Wedson Almeida Filho	4fbe0a3a53	runtime: bind-mount mounted block device into container When the mounted block device isn't a layer, we want to mount it into containers, but since it's already mounted with the correct fs (e.g., tar, ext4, etc.) in the pod, we just bind-mount it into the container. Fixes: #7536 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-03 17:58:39 -03:00
Wedson Almeida Filho	7e1b1949d4	runtime: add support for kata overlays When at least one `io.katacontainers.fs-opt.layer` option is added to the rootfs, it gets inserted into the VM as a layer, and the file system is mounted as an overlay of all layers using the overlayfs driver. Additionally, if the `io.katacontainers.fs-opt.block_device=file` option is present in a layer, it is mounted as a block device backed by a file on the host. Fixes: #7536 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-03 17:58:39 -03:00
Wedson Almeida Filho	6c867d9e86	agent: add io.katacontainers.fs-opt.overlay-rw option This causes the overlay-fs driver to add the `upperdir` and `workdir` options to an overlay-fs mount so that the mount becomes writable using a discardable directory under the container id. Fixes: #7536 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-03 17:58:39 -03:00
Wedson Almeida Filho	6163c35657	agent: skip mount options that start with "io.katacontainers." This is so that file systems don't fail when we pass kata-specific options from the snapshotter to kata. Fixes: #7536 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-03 17:58:39 -03:00
Fabiano Fidêncio	fa35afa982	Merge pull request #7542 from wedsonaf/ci-fix Use version 0.10.4 of `fuse-backend-rs`	2023-08-03 22:50:11 +02:00
Wedson Almeida Filho	b2ff97aa01	dragonball: use version 0.10.4 of `fuse-backend-rs` Version 0.10.5, which was just released, breaks `nydus-storage`. This is a workaround to fix the CI which is blocking other PRs. Fixes: #7541 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-03 14:15:17 -03:00
Fabiano Fidêncio	ebdae7cfdf	Merge pull request #7520 from jepio/host-systemctl kata-deploy: Use host's systemctl	2023-08-03 13:53:28 +02:00
Manabu Sugimoto	845eeb4d7b	agent: Allow clippy::redundant_clone in the unit tests Allow `clippy::redundant_clone` in the agent's unit tests because rustc>=1.70 reports false positives. These `clone()` calls are required because the code that follows still refers to the variable, but clippy flags them by mistake because of its conservative and limited analysis. Ref. https://rust-lang.github.io/rust-clippy/master/index.html#/redundant_clone Fixes: #7534 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-03 19:07:40 +09:00
Fabiano Fidêncio	e2755a47b8	Merge pull request #7524 from fidencio/revert-kata-deploy-changes-after-3.2.0-rc0-release release: Revert kata-deploy changes after 3.2.0-rc0 release	2023-08-03 11:28:43 +02:00
Fabiano Fidêncio	1163fc9de2	release: Revert kata-deploy changes after 3.2.0-rc0 release As 3.2.0-rc0 has been released, let's switch the kata-deploy / kata-cleanup tags back to "latest", and re-add the kata-deploy-stable and the kata-cleanup-stable files. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-03 10:08:20 +02:00
Xuewei Niu	3958a39d07	runtime-rs: Introduce directly attachable network Kata containers, as VM-based containers, are allowed to run in the host netns. That is, the network can be isolated at L2. Network performance benefits from this architecture, which eliminates as many hops as possible. We call it a Directly Attachable Network (DAN for short). The network devices are placed in the host netns by the CNI plugins. The configs are saved at {dan_conf}/{sandbox_id}.json in JSON format, including device name, type, and network info. At this early stage, DAN only supports host tap devices. More devices, like DPDK, will be supported in later versions. The file format looks like this: ```json { "netns": "/path/to/netns", "devices": [{ "name": "eth0", "guest_mac": "xx:xx:xx:xx:xx", "device": { "type": "vhost-user", "path": "/tmp/test", "queue_num": 1, "queue_size": 1 }, "network_info": { "interface": { "ip_addresses": ["192.168.0.1/24"], "mtu": 1500, "ntype": "tuntap", "flags": 0 }, "routes": [{ "dest": "172.18.0.0/16", "source": "172.18.0.1", "gateway": "172.18.31.1", "scope": 0, "flags": 0 }], "neighbors": [{ "ip_address": "192.168.0.3/16", "device": "", "state": 0, "flags": 0, "hardware_addr": "xx:xx:xx:xx:xx" }] } }] } ``` Fixes: #1922 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-08-03 15:33:34 +08:00
David Esparza	7d1c48c881	Merge pull request #7530 from dborquez/fix_check_running_processes metrics: stop kata components before start a metric test.	2023-08-02 23:51:27 -06:00
Zhongtao Hu	e719423262	Merge pull request #7127 from cmaf/runtime-rs-ch-blk-2 runtime-rs: Add block device handling for cloud hypervisor	2023-08-03 09:46:32 +08:00
David Esparza	1e15369e59	metrics: Improve naming testing containers in launch times test This commit provides a new way to name the containers used in the launch-times-test in this form: 'kata_launch_times_RANDOM_NUMBER', where RANDOM_NUMBER is in the 0-1000 range. Fixes: #7529 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-08-02 17:04:55 -06:00
David Esparza	5dbe88330f	metrics: Clean kata components before start a metric test. This PR kills all kata components before starting a new metric test. Fixes: #7528 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-08-02 17:04:51 -06:00
Fabiano Fidêncio	d424f3c595	Merge pull request #7523 from fidencio/3.2.0-rc0-branch-bump # Kata Containers 3.2.0-rc0	2023-08-02 20:04:37 +02:00
Zvonko Kaiser	cf8899f260	Merge pull request #7494 from zvonkok/vfio-mode vfio: Fix vfio device ordering	2023-08-02 19:45:22 +02:00
Gabriela Cervantes	3b45060b61	metrics: Add latency server yaml This PR adds latency server yaml for kubernetes test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-02 16:52:17 +00:00
Gabriela Cervantes	9bb8451df5	metrics: Add latency client yaml This PR adds latency client yaml for the kubernetes test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-02 16:50:51 +00:00
Gabriela Cervantes	64fdb98704	metrics: Add network latency test This PR adds network latency test for kata metrics. Fixes #7526 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-02 16:46:48 +00:00
Chelsea Mafrica	a81ad3b587	runtime-rs: Add block device handling in cloud hypervisor Add functions for adding a block device to a container for CH. Fixes #6690 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-08-02 09:18:48 -07:00
David Esparza	542012c8be	Merge pull request #7503 from GabyCT/topic/ghafio metrics: Add FIO test to gha for kata metrics CI	2023-08-02 10:05:09 -06:00
David Esparza	5979f3790b	Merge pull request #7516 from GabyCT/topic/addiperf metrics: Add iperf3 network test	2023-08-02 10:04:51 -06:00
Fabiano Fidêncio	006ecce49a	release: Kata Containers 3.2.0-rc0 - ci-on-push: Make the CI also run for the stable-* branches - ci: k8s: Do not fail when gathering info on AKS nodes - kata-deploy: enable cross build for non-x86 - runtime-rs: add support for gather metrics in runtime-rs - kata-ctl: add monitor subcommand for runtime-rs - release: release-note.sh: Fix typos and reference to images - metrics: Add sysbench performance test - Simplify implementation of runtime-rs/service `6ad16d497` release: Adapt kata-deploy for 3.2.0-rc0 `025596b28` ci-on-push: Make the CI also run for the stable-* branches `7ffc0c122` static-build: enable cross build for qemu `35d6d86ab` static-build: enable cross-build for image build `2205fb9d0` static-build: enable cross build for virtiofsd `11631c681` static-build: enable cross build for shim-v2 `7923de899` static-build: cross build kernel `e2c31fce2` kata-deploy: enable cross build for kata deploy script `2fc5f0e2e` kata-depoly: prepare env for cross build in lib.sh `f5e9985af` release: release-note.sh: Fix typos and reference to images `f910c66d6` ci: k8s: Do not fail when gathering info on AKS nodes `632818176` metrics: Add k8s sysbench documentation `b3901c46d` runtime-rs: ignore errors during clean up sandbox resources `5a1b5d367` metrics: Add sysbench pod yaml `ad413d164` metrics: Add sysbench dockerfile `151256011` metrics: Add sysbench performance test `62e328ca5` runtime-rs: refine implementation of TaskService `458e1bc71` runtime-rs: make send_message() as an method of ServiceManager `1cc1c81c9` runtime-rs: fix possibe bug in ServiceManager::run() `1a5f90dc3` runtime-rs: simplify implementation of service crate `731e7c763` kata-ctl: add monitor subcommand for runtime-rs The previous kata-monitor in golang could not communicate with runtime-rs to gather metrics due to different sandbox addresses. This PR adds the subcommand monitor in kata-ctl to gather metrics from runtime-rs and monitor itself. 
`d74639d8c` kata-ctl: provide the global TIMEOUT for creating MgmtClient `02cc4fe9d` runtime-rs: add support for gather metrics in runtime-rs Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-02 16:59:41 +02:00
Fabiano Fidêncio	6ad16d4977	release: Adapt kata-deploy for 3.2.0-rc0 kata-deploy files must be adapted to a new release. The cases where it happens are when the release goes from -> to: * main -> stable: * kata-deploy-stable / kata-cleanup-stable: are removed * stable -> stable: * kata-deploy / kata-cleanup: bump the release to the new one. There are no changes when doing an alpha release, as the files on the "main" branch always point to the "latest" and "stable" tags. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-02 16:59:41 +02:00
Fabiano Fidêncio	4e812009f5	Merge pull request #7519 from fidencio/topic/gha-ci-run-on-stable-branches ci-on-push: Make the CI also run for the stable-* branches	2023-08-02 16:13:06 +02:00
Jeremi Piotrowski	3230dec950	kata-deploy: Use host's systemctl when interacting with systemd. We have occasionally faced issues with compatibility between the systemctl version used inside the kata-deploy container and the systemd version on the host. Instead of using a containerized systemctl with bind mounted sockets, nsenter the host and run systemctl from there. This provides less coupling between the kata-deploy container and the host. Fixes: #7511 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-08-02 15:32:01 +02:00
Fabiano Fidêncio	29855ed0c6	Merge pull request #7510 from fidencio/topic/ci-k8s-aks-do-not-fail-gathering-info ci: k8s: Do not fail when gathering info on AKS nodes	2023-08-02 09:44:19 +02:00
Fabiano Fidêncio	025596b289	ci-on-push: Make the CI also run for the stable-* branches As we only support one stable branch, it'll be used as part of the stable-3.2 and onwards. Fixes: #7518 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-02 09:26:24 +02:00
Fabiano Fidêncio	e1a69c0c92	Merge pull request #6586 from jongwu/cross_build kata-deploy: enable cross build for non-x86	2023-08-02 09:11:56 +02:00
Fupan Li	1a6b27bf6a	Merge pull request #5797 from Yuan-Zhuo/add-metrics-for-runtime-rs runtime-rs: add support for gather metrics in runtime-rs	2023-08-02 13:40:22 +08:00
Fupan Li	a536d4a7bf	Merge pull request #6672 from Yuan-Zhuo/add-monitor-in-kata-ctl kata-ctl: add monitor subcommand for runtime-rs	2023-08-02 13:39:02 +08:00
Gabriela Cervantes	ad6e53c399	metrics: Modify boot time values This PR modifies boot time values limit. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-01 23:34:15 +00:00
Jianyong Wu	7ffc0c1225	static-build: enable cross build for qemu Relying on the multiarch feature of Ubuntu, we can set up a cross build environment easily and achieve build performance as good as a native build. Fixes: #6557 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-08-01 23:28:52 +02:00
Jianyong Wu	35d6d86ab5	static-build: enable cross-build for image build It takes too long to cross build the agent with docker buildx, so we cross build the rootfs in a container with a cross compile toolchain for gcc and for rust with musl libc. This gives us a build as fast as a native one. Cross building the rootfs initrd is disabled, as no cross compile toolchain for rust with musl libc is found for alpine, and building with docker buildx takes too long. Fixes: #6557 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-08-01 23:28:52 +02:00
Gabriela Cervantes	f764248095	gha: Add FIO test to run metrics yaml This PR adds FIO test to run metrics yaml. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-01 20:29:16 +00:00
Jianyong Wu	2205fb9d05	static-build: enable cross build for virtiofsd Use messense/rust-musl-cross, which offers a musl libc cross build environment, to cross compile virtiofsd. Fixes: #6557 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-08-01 22:10:46 +02:00
Jianyong Wu	11631c681a	static-build: enable cross build for shim-v2 shim-v2 has go and rust code. For the rust code, we use messense/rust-musl-cross to speed up the build, as it doesn't depend on qemu emulation. The go code is built with docker buildx, as it doesn't support cross build now. Fixes: #6557 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-08-01 22:10:46 +02:00
Jianyong Wu	7923de8999	static-build: cross build kernel Prepare cross build environment based on current Dockerfile. Fixes: #6557 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-08-01 22:10:46 +02:00
Jianyong Wu	e2c31fce23	kata-deploy: enable cross build for kata deploy script kata-deploy-binaries-in-docker.sh is the entry point to build kata components. Set some environment variables to facilitate the subsequent cross build work. Fixes: #6557 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-08-01 22:10:46 +02:00
Jianyong Wu	2fc5f0e2e0	kata-depoly: prepare env for cross build in lib.sh We leverage three env variables: TARGET_ARCH is the build target tuple; ARCH has nearly the same meaning as TARGET_ARCH but has been widely used in kata; CROSS_BUILD indicates whether to cross compile. Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-08-01 22:10:46 +02:00
Fabiano Fidêncio	c0171ea0a7	Merge pull request #7508 from fidencio/topic/fix-release-notes-typos-and-references release: release-note.sh: Fix typos and reference to images	2023-08-01 22:05:32 +02:00
Gabriela Cervantes	58f9a57c20	metrics: Add network reference to general README metrics This PR adds network reference to the general metrics README. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-01 16:54:00 +00:00
Gabriela Cervantes	07694ef3ae	metrics: Add Kata Containers network metrics README This PR adds the Kata Containers network metrics README. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-01 16:49:09 +00:00
Gabriela Cervantes	d8439dba89	metrics: Add iperf3 deployment yaml This PR adds the iperf3 deployment yaml. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-01 16:45:01 +00:00
Gabriela Cervantes	bda83cee5d	metrics: Add iperf3 daemonset for k8s This PR adds the iperf3 daemonset for k8s. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-01 16:42:15 +00:00
Gabriela Cervantes	badff23c71	metrics: Add iperf3 service yaml for k8s This PR adds the iperf3 service yaml for k8s. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-01 16:37:19 +00:00
Gabriela Cervantes	27c02367f9	metrics: Add iperf3 network test This PR adds the iperf3 benchmark test for kata metrics. Fixes #7515 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-01 16:30:46 +00:00
GabyCT	a0a524efc2	Merge pull request #7486 from kata-containers/topic/addsysbench metrics: Add sysbench performance test	2023-08-01 10:17:48 -06:00
Fabiano Fidêncio	f5e9985afe	release: release-note.sh: Fix typos and reference to images diferent -> different And also let's make sure we escape the backticks around the kata-deploy environment variables, otherwise bash will try to interpret those. Fixes: #7497 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-01 12:42:03 +02:00
Fabiano Fidêncio	f910c66d6f	ci: k8s: Do not fail when gathering info on AKS nodes Otherwise the VM deletion may not happen, leaving several machines behind. Fixes: #7509 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-01 12:36:33 +02:00
Manabu Sugimoto	1b21a46246	docs: Use control-plane term instead of master Replace `master` with `control-plane` in the context of K8s because `master` is a legacy term and is no longer used. Ref. https://github.com/kubernetes/enhancements/tree/master/keps/sig-cluster-lifecycle/kubeadm/2067-rename-master-label-taint Fixes: #7466 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-01 17:41:40 +09:00
Chao Wu	1a94aad44f	Merge pull request #7480 from jiangliu/rt-service Simplify implementation of runtime-rs/service	2023-08-01 16:05:33 +08:00
Chao Wu	2d13e2d71c	Merge pull request #7504 from fidencio/topic/gha-release-fix-upload-versions-yaml release: Fix upload-versions-yaml	2023-08-01 13:58:07 +08:00
GabyCT	b77d69aeee	Merge pull request #7396 from GabyCT/topic/addghatensorflow metrics: Enable Tensorflow metrics for kata CI	2023-07-31 17:13:24 -06:00
Fabiano Fidêncio	743291c6c4	release: Fix upload-versions-yaml This requires the GITHUB_UPLOAD_TOKEN. While we're here, let's also fix the name of the action and remove the "-tarball" suffix, as it's not really a tarball. Fixes: #7497 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-31 23:57:33 +02:00
Fabiano Fidêncio	a71d35c764	Merge pull request #7499 from fidencio/topic/gha-release-ensure-stage-is-defined-for-amr64-s300x gha: release: `stage` must be defined for arm64 / s390x yamls	2023-07-31 22:55:54 +02:00
Gabriela Cervantes	6328181762	metrics: Add k8s sysbench documentation This PR adds k8s sysbench documentation at general density documentation. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-31 20:28:37 +00:00
Chelsea Mafrica	f74b7aba18	Merge pull request #7488 from cmaf/docs-k8s-links docs: Update links for pods and kubelet	2023-07-31 12:44:24 -07:00
Gabriela Cervantes	8933d54428	metrics: Add FIO to gha run script This PR adds FIO to gha run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-31 17:51:11 +00:00
Gabriela Cervantes	8a584589ff	metrics: Add DAX FIO README This PR adds DAX FIO README information. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-31 17:42:44 +00:00
Gabriela Cervantes	21f5b65233	metrics: Add FIO information in storage general README This PR adds FIO information in storage general README. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-31 17:33:39 +00:00
Gabriela Cervantes	69f05cf9e6	metrics: Add FIO general README This PR adds FIO general README information. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-31 17:30:05 +00:00
Gabriela Cervantes	87d41b3dfa	metrics: Add FIO test to gha for kata metrics CI This PR adds FIO test to gha for kata metrics CI. Fixes #7502 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-31 16:50:16 +00:00
Pavel Mores	28e5e9c86e	runtime-rs: fix number of queues handling in dragonball share fs device Looks like a copy/paste error... Fixes #7501 Signed-off-by: Pavel Mores <pmores@redhat.com>	2023-07-31 17:25:47 +02:00
Fabiano Fidêncio	ff8d7e7e41	Merge pull request #7496 from fidencio/topic/topic/kata-deploy-take-nfd-into-consideration-pre-work k8s: Rely on the USING_NFD environment variable passed by the jobs	2023-07-31 14:56:15 +02:00
Fabiano Fidêncio	1b111a9aab	gha: release: `stage` must be defined for arm64 / s390x yamls `stage` has been added, but only hooked up to the amd64 logic, leaving arm64 and s390x behind. Let's fix this right now, and make sure no error occurs when passing this down to the yaml files. Fixes: #7497 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-31 14:41:35 +02:00
Fabiano Fidêncio	684a6e1a55	Revert "gha: release: `stage` must be a string" This reverts commit `7c857d38c1`. I've misunderstood the error given by github action, let's fix this in the next commit. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-31 14:37:52 +02:00
Fabiano Fidêncio	99711f107f	Merge pull request #7498 from fidencio/topic/gha-release-stage-must-be-a-string gha: release: `stage` must be a string	2023-07-31 14:32:47 +02:00
Fabiano Fidêncio	7c857d38c1	gha: release: `stage` must be a string Otherwise we'll face the following error as part of our GHA: ``` The workflow is not valid. kata-containers/kata-containers/.github/workflows/release-$foo.yaml (Line: 13, Col: 14): Invalid input, stage is not defined in the referenced workflow. ``` Fixes: #7497 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-31 13:39:13 +02:00
Fabiano Fidêncio	28e171bf73	Merge pull request #7490 from fidencio/3.2.0-alpha4-branch-bump # Kata Containers 3.2.0-alpha4	2023-07-31 13:34:15 +02:00
Fabiano Fidêncio	91e1e612c3	k8s: Rely on the USING_NFD environment variable passed by the jobs Let's make sure we can rely on the tests passing down whether they want to be tested using Node Feature Discovery or not. Right now, only the TDX job has this option set to "true", all the other jobs have this option set to "false". We can and have to merge this one before merging the NFD related patches as: 1) It causes no harm in exporting this environment variable, but not having it used 2) It will allow us to test the NFD after this one is merged, as changes in the yaml file, in the case of the pull_request_target event, are not taken into consideration before they're merged Fixes: #7495 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-31 13:30:18 +02:00
Zvonko Kaiser	cddcde1d40	vfio: Fix vfio device ordering If modeVFIO is enabled, we first need to attach the VFIO control group device /dev/vfio/vfio and then the actual device(s). Sort the devices so that device #1 is the VFIO control group device, followed by the actual device(s) /dev/vfio/<group> Fixes: #7493 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-07-31 11:26:27 +00:00
Fabiano Fidêncio	7edc7172c0	release: Kata Containers 3.2.0-alpha4 - tests: Add `k8s-volume` and `k8s-file-volume` tests to GHA CI - metrics: Update boot time for kata metrics - metrics: Add FIO report files for kata metrics - kata-deploy: Allow runtimeclasses to be created by the daemonset - runtime-rs: change block index to 0 - agent: fix typo in constant - metrics: Add FIO benchmark for metrics tests - gha: dragonball: Run only on the dragonball labeled machine - tests: Fix `k8s-job` test - agent,libs: Remove unused 'mut' keywords - runtime-rs: remove unneeded 'mut' keywords - tests: QoL improvements for running tests locally - agent: exclude symlinks from recursive ownership change - cache: kernel: Fix kernel caching - runk: Add Docker guide to README - metrics: General improvements to json.bash script - kata-deploy: Allow shim creation based on what's passed to the daemonset - gha: ci: Add skeleton of vfio job - s390x: Fixing device.Bus assignment - release: Mention the container images used to build the project - kata-deploy-binaries: kernel_cache: Take module_dir into account - ci: nydus: Fix typo in "source" - gha: ci: Add no-op nydus tests to our CI - Dragonball: migrate dragonball-sandbox crates to Kata - ci: gha: Add cri-containerd tests (but still do not enable them) - packaging/tools: Add kata-debug and use it as part of our CI - cache: kernel: Consider changes in tools/packaging/kernel - kata-deploy: Properly get the path of the versions.yaml file - kata-deploy: Add VERSION and versions.yaml to the final tarball - metrics: Add C-Ray performance test - metrics: enable TensorFlow benchmark to be run on gha - metrics: Add function to memory inside container script - Revert "metrics: Replace backslashes used to escape double quoted key in jq expr" - versions: Bump virtiofsd to v1.7.0 - metrics: stop hypervirsor and shim at init_env stage - ci: k8s: Adapt "source ..." 
to the new location of gha-run.sh - ci: Move `tests/integration/gha-run.sh` to `tests/integration/kuberentes/` ... and also remove KUBECONFIG from the tdx envs - versions: Update kernel to version v6.1.x - agent: Fix exec hang issues with a backgroud process - agent: Ignore already mounted dev/fs/pseudo-fs - ci: k8s: Bring TDX tests back - metrics: Update machine learning documentation - gha: ci: cri-containerd: Fix KATA_HYPERVSIOR typo - tests: Add MobileNet Tensorflow performance benchmark - metrics: replace backslashes used to escape double quoted jq key expr. - runtime-rs: enhancement of Device Manager for network endpoints. - feat(Tracing): tracing in Rust runtime - runtime-rs: ignore unconfigured network interfaces - metrics: Stop running kata-env before kata is properly installed. - metrics: use rm -f to remove the oldest continerd config file. - kernel: Update kernel config name - kata-deploy: Add a debug option to kata-deploy (and also use it as part of our CI) - runtime-rs: add parameter for propagation of (u)mount events - kata-ctl: Move GuestProtection code to kata-sys-util - tests: Add function before function name in common.bash for metrics - tests: Add metrics storage documentation - metrics: Fix metrics ts generator to treat numbers as decimals - gha: ci: Add cri-containerd tests skeleton -- follow up 1 - dragonball/agent: Add some optimization for Makefile and bugfixes of unit tests on aarch64 - metrics: Enable blogbench test - tests: Add machine learning performance tests - tests: gha: ci: Add cri-containerd tests skeleton - metrics: Enable memory inside container metrics - tools: Use a consistent target name when building mariner initrd - gha: ci: Gather info about the node / pods - runtime-rs: Do not scan network if network model is "none" - gha: k8s: tdx: Temporarily disable TDX tests - metrics: Update memory usage script - gha: Cancel previous jobs if a PR is updated - gha: nightly: Fix long name of AKS clusters issue and make the CI easier to 
test - README: Add badge for our Nightly CI - gha: Do not run all the tests if only docs are updated - bugfix: plus default_memory when calculating mem size - gha: ci: Use github.sha to get the last commit reference - dragonball: Don't fail if a request asks for more CPUs than allowed - gha: ci: Fix refernce passed to checkout@v3 - gha: ci: Avoid using env also in the ci-nightly and payload-after-push - gha: k8s: Ensure cluster doesn't exist before creating it - gha: ci: More follow up fixes after adding a nightly CI - tests: Enable running k8s tests on Mariner - gha: ci: Avoid using env unless it's really needed - gha: ci: Follow up fixes for the nightly jobs - tests: Enable memory usage metrics tests - gha: Add nightly jobs - metrics: storing metrics workflow artifacts - gha: k8s: Ensure tests are running on a specific namespace - metrics: Adds blogbench and webtool metrics tests - gha: dragonball: Correctly propagate PATH update - versions: Upgrade to Cloud Hypervisor v33.0 - Convert `is_allowed`, `ttrpc_error` and `sl` to functions - gha: release: Use a specific release of hub - metrics: Add checkmetrics to gha-run.sh for metrics CI - packaging: Fix indentation of build.sh script at ovmf - doc: Add documentation for the virtualization reference architecture - gpu: Update kernel building to the latest changes - runtime: fix PCIe topology for GPUDirect use-case - metrics: Add memory footprint tests - runtime: Add "none" as a shared_fs option - metrics: Uniformity across function names in gha-run.sh - runtime-rs: support physical endpoint using device manager - runtime-rs: bugfix for direct volume path's validation. - metrics: Fix retrieving hypervisor version on metrics - runtime-rs: fix build error on AArch64 - checkmetrics: Add checkmetrics makefile and documentation - docs: Add boot time metrics documentation - runtime-rs: add support spdk/vhost-user based volume. 
- static-build: Remove kata-version parameter - dragonball: avoid obtaining lock twice in create_stdio_console - metrics: Add checkmetrics for kata metrics CI - metrics: enable launch-times test on gha-run metrics script - docs: Add general metrics documentation - add support vfio device manager - gha: Don't automatically trigger CI - kata-ctl: Check for vm capability - docs: fix spelling of "crate" - packaging: Fix indentation in init.sh script - gha: Fix gha actions - metrics: install kata and launch-times test - tests: Move tests helper script to this repo - tests: Add json script for metrics tests - Cherry pick initramfs caching updates from CCv0 - gha: Fix format for run launchtimes metrics yaml - tests: Add tests lib common script - Fix deprecated virtiofsd args (go shim only) - gha: Add base branch on SHA on pull requst - gha: ci-on-push: Run metrics tests - docs: Update Developer Guide - runtime-rs: Enhance flexibility of virtio-fs config - versions: Update firecracker version to 1.3.3 - tools: Fix no-op builds - runtime-rs: update Cargo.lock - gha: Fix `stage` definition in matrix - feat(runtime): vcpu resize capability - packaging: Remove snap package - gha: Add new build targets for Mariner - Dragonball: support resize memory - Port Measured rootfs feature from CCv0 branch to main - add support direct volume and refactor device manager - gha: Fix gha-run.sh and unbreak CI - kata-ctl: Switch to slog logging; add --log-level and --json-logging arguments - log-parser: Update log parser link at README - gha: aks: Extract `run` commands to a script - runtime-rs: handle copy files when share_fs is not available - agent-ctl: fix the compile error - agent: fix the issue of exec hang with a backgroud process - runtime-rs: bugfix: update Cargo.lock - gha: aks: Use short SHA in cluster name - README: Display badge for the "Publish Artefacts" job and update the Kata Containers logo - kata-deploy: Change how we get the Ubuntu k8s key - gha: aks: Ensure host_os is 
used everywhere needed - kubernetes: add agnhost command in pod yaml - main \| release: Standardize kata static file name - packaging: make BUILDER_REGISTRY configurable - gha: aks: Add the host_os as part of the aks cluster's name - kernel: Modify build-kernel.sh to accomodate for changes in version.yaml - gha: Fix Mariner cluster creation - gha: Unbreak CI and fix cluster creation step - Dragonball: support vcpu hotplug on aarch64 - runtime-rs/sandbox_bindmounts: add support for sandbox bindmounts - runtime-rs/kata-ctl: Enhancement of DirectVolumeMount. - gha: Create Mariner host as part of k8s tests - netlink: Fix the issue of update_interface - gha: Increase timeout for AKS jobs and give more time to start running the tests - runtime: sending SIGKILL to qemu - dragonball: convert BlockDeviceMgr and VirtioNetDeviceMgr functions to methods - dragonball: Remove virtio-net and vsock devices gracefully - kata-deploy: Improve shim backup / restore - doc: Update git commands - kata-deploy: Fix indentation on kata deploy merge script `8353aae41` ci: k8s: Rework get_nodes_and_pods_info() `6ad5d7112` ci: k8s: Do not gather node info before running the tests `5261e3a60` ci: k8s: Group messages to improve readability `9cc6b5f46` ci: k8s: Get logs from kata-deploy `9d285c622` ci: k8s: Let kata-deploy take care of the runtimeclasses `87568ed98` gha: Test split out runtimeclasses are in sync with all-in-one file `39192c608` kata-deploy: Print variables passed to the script `0e157be6f` kata-deploy: Allow runtimeclasses to be created by the daemonset `a27433324` kata-deploy: Change default values of DEBUG `69535b808` kata-deploy: runtimeclass: Split out entries `9e1710674` kata-runtimeClasses: Alphabetically sort the enrties `6222bd910` tests: Add k8s-file-volume test `187a72d38` tests: Add k8s-volume test `0c8427035` metrics: Add boot time value for qemu `6520dfee3` metrics: Update boot time for kata metrics `ff2279061` metrics: Update runtime and configuration paths 
`a5d4e3388` metrics: Add compare virtiofsd dax script `5e937fa62` metrics: Update general FIO tests `b0bea47c5` metrics: Add makefile to report generator `73c57b9a1` metrics: Add FIO report files for kata metrics `c8fcd29d9` runtime-rs: use device manager to handle virtio-pmem `901c19225` runtime-rs: support configure vm_rootfs_driver `5d6199f9b` runtime-rs: use device manager to handle vm rootfs `20f1f62a2` runtime-rs: change block index to 0 `662f87539` metrics: Add general FIO makefile `c5a87eed2` tests: gha: Add timeout to cluster creation `6daeb08e6` tests: k8s: Clean up node debuggers after running `3aa6c77a0` gha: dragonball: Run only on the dragonball labeled machine `37641a543` metrics: Add example config for fio jobs `314aec73d` agent: fix typo in constant `4703434b1` tests: k8s: Allow using custom resource group `350f3f70b` tests: Import `common.bash` in `run_kubernetes_tests.sh` `d7f04a64a` tests: k8s: Leave `runtimeclass_workloads/` alone `bdde6aa94` tests: k8s: Split deployment and testing commands `91a0b3b40` tests: aks: Simply delete cluster when cleaning up `3c1044d9d` metrics: Update FIO paths for k8s runner `6177a0db3` metrics: Add env files for FIO `a45900324` metrics: Add fio exec `ea198fddc` metrics: Add FIO runner k8s `8f7ef41c1` metrics: Add FIO vendor code `6293c17bd` metrics: Add FIO benchmark for metrics tests `ff4cfcd8a` runk: Add Docker guide to README `c8ac56569` cache: kernel: Harmonize commit with fetching side `81775ab1b` cache: kernel: Fix SEV kernel caching `717f775f3` gha: ci: Add skeleton of vfio job `b9f100b39` agent,libs: Remove unused 'mut' keywords `a56f96bb2` kata-deploy: Allow shim creation based on what's passed to the daemonset `4a5ab38f1` metrics: General improvements to json.bash script `d4eba3698` kata-deploy-binaries: kernel_cache: Take module_dir into account `b7c9867d6` release: Mention the container images used to build the project `7c4b59781` ci: nydus: Fix typo in "source" `6a680e241` gha: ci: Add placeholder 
for the nydus tests as part of the CI `fb4f7a002` gha: nydus: Add a no-op GHA for nydus `4a207a16f` gha: nydus: Bring tests as they are from the tests repo `2c8f83424` runtime-rs: remove unneeded 'mut' keywords `1fc715bc6` s390x: Add AP Attach/Detach test `e91f5edba` ci: cri-containerd: Fix default typo for testContainerStart() `8b8aef09a` ci: cri-containerd: Temporarily disable TestContainerSwap `56767001c` ci: cri-containerd: Add namespace / uid to the pods `a84773652` ci: cri-containerd: Always use sudo to call crictl `99ba86a1b` ci: cri-containerd: Add /usr/local/go/bin to the PATH `7f3b30999` ci: cri-containerd: Add `function` before each function `fde22d6bc` ci: cri-containerd: Assume podman is always used `9465a0496` ci: cri-containerd: Adapt "source ..." to this repo `df8d14411` ci: cri-containerd: Remove CI variable `f90570aef` ci: cri-containerd: Remove unused runc_runtime_bin `c3637039f` ci: cri-containerd: Remove KILL_VMM_TEST env var `bc4919f9b` ci: cri-containerd: Always run shim-v2 tests `f9e332c6d` ci: cri-containerd: Stop cloning containerd `cfd662fee` ci: cri-containerd: Remove ununsed SNAP_CI var `d36c3395c` ci: cri-containerd: Update copyright `b5be8a4a8` ci: cri-containerd: Move integration-tests.sh as it was `f2e00c95c` ci: cri-containerd: Populate install_dependencies() `897955252` versions: Add "latest" field for cri-tools `1bbcbafa6` ci: Add clone_cri_container() `f66c68a2b` ci: Add install_cri_tools() `4dd828414` ci: Add install_cri_containerd() `ad47d1b9f` ci: Add download_github_project_tarball() `788c562a9` ci: Add get_latest_patch_release_from_a_github_project() `6742f3a89` ci: Use `function` before each install_go.sh function `5eacecffc` ci: Adjust paths for install_go.sh `8ed1595f9` ci: Update copyright for install_go.sh `6123d0db2` ci: Move install_go.sh as it was `8653be71b` ci: Do not take cross-build into consideration for kata-arch.sh `6a76bf92c` ci: Fix style / identation if kata-arch.sh `72743851c` ci: Add `function` before 
each kata-arch.sh function `9f6d4892c` ci: Update copyright for kata-arch.sh `6f73a7283` ci: Move kata-arch.sh as it was `3615d7343` ci: Add get_from_kata_deps() `34779491e` gha: kubernetes: Avoid declaring repo_root_dir `f3738beac` tests: Use $HOME/go as fallback for $GOPATH `b87ed2741` tests: Move `ensure_yq` to common.bash `124e39033` tests: common: Fix quoting when globbing `db77c9a43` tests: Make install_kata take care of the links `13715db1f` tests: Do not call `install_check_metrics` when installing kata `630634c5d` ci: k8s: Group logs to make them easier to read `228b30f31` ci: k8s: Gather node info during the cleanup `81f99543e` ci: k8s: Cleanup cluster before deleting it `38a7b5325` packaging/tools: Add kata-debug `ae6e8d2b3` kata-deploy: Properly get the path of the versions.yaml file `309e23255` cache: kernel: Consider changes in tools/packaging/kernel `59fdd69b8` kata-deploy: Add VERSION and versions.yaml to the final tarball `5dddd7c5d` release: Upload versions.yaml as part of the release `bad3ac84b` metrics: Rename C-Ray to cpu performance tests `87d99a71e` versions: Remove "kernel-experimental" `545de5042` vfio: Fix tests `62aa6750e` vfio: Added better handling of VFIO Control Devices `dd422ccb6` vfio: Remove obsolete HotplugVFIOonRootBus `114542e2b` s390x: Fixing device.Bus assignment `371a118ad` agent: exclude symlinks from recursive ownership change `e64edf41e` metrics: Add tensorflow function in gha-run script `67a6fff4f` metrics: Enable tensorflow benchmark on gha `01450deb6` Revert "metrics: Replace backslashes used to escape double quoted key in jq expr." `843006805` metrics: Add function to memory inside container script `bbd3c1b6a` Dragonball: migrate dragonball-sandbox crates to Kata `fad801d0f` ci: k8s: Adapt "source ..." 
to the new location of gha-run.sh `55e2f0955` metrics: stop hypervirsor and shim at init_env stage `556e663fc` metrics: Add disk link to general metrics README `98c121709` metrics: Add C-Ray README `8e7d9926e` metrics: Add C-Ray Dockerfile `e2ee76978` metrics: Add C-Ray performance test `2ee2cd307` ci: k8s: Move gha-run.sh to the kubernetes dir `88eaff533` ci: tdx: Adjust KUBECONFIG `c09e268a1` versions: Downgrade SEV(-SNP) kernel back to v5.19.x `6a7a32365` versions: Bump virtiofsd to v1.7.0 `ac5f5353b` ci: k8s: Bring TDX tests back `950b89ffa` versions: Update kernel to version v6.1.38 `8ccc1e5c9` metrics: Update machine learning documentation `f50d2b066` gha: ci: cri-containerd: Fix KATA_HYPERVSIOR typo `620b94597` metrics: Add Tensorflow Mobilenet documentation `6c91af0a2` agent: Fix exec hang issues with a backgroud process `59f4731bb` metrics: Stop running kata-env before kata is properly installed. `468f017e2` metrics: Replace backslashes used to escape double quoted key in jq expr. `64f013f3b` ci: k8s: Enable debug when running the tests `8f4b1df9c` kata-deploy: Give users the ability to run it on DEBUG mode `2c8dfde16` kernel: Update kernel config name `150e54d02` runtime-rs: ignore unconfigured network interfaces `3ae02f920` metrics: use rm -f to remove older continerd config file. `a864d0e34` tests: Add tensorflow mobilenet dockerfile `788d2a254` tests: Add tensorflow mobilenet performance test `3fed61e7a` tests: Add storage link to general metrics documentation `b34dda4ca` tests: Add storage blogbench metrics documentation `6787c6390` runtime-rs: add parameter for propagation of (u)mount events `6e5679bc4` tests: Add function before function name in common.bash for metrics `62080f83c` kata-sys-util: Fix compilation errors `02d99caf6` static-checks: Make cargo clippy pass. 
`982420682` agent: Make the static checks pass for agent `61e4032b0` kata-ctl: Remove all utility functions to get platform protection `a24dbdc78` kata-sys-util: Move utilities to get platform protection `dacdf7c28` kata-ctl: Remove cpu related functions from kata-ctl `f5d195717` kata-sys-util: Move additional functionality to cpu.rs `304b9d914` kata-sys-util: Move CPU info functions `7319cff77` ci: cri-containerd: Add LTS / Active versions for containerd `2a957d41c` ci: cri-containerd: Export GOPATH `75a294b74` ci: cri-containerd: Ensure deps are installed `6924d14df` metrics: Fix metrics ts generator to treat numbers as decimals `9e048c8ee` checkmetrics: Add blogbench read value for qemu `2935aeb7d` checkmetrics: Add blogbench write value for qemu `02031e29a` checkmetrics: Add blogbench read value for clh `107fae033` checkmetrics: Add blogbench write value for clh `8c75c2f4b` metrics: Update blogbench Dockerfile `49723a9ec` metrics: Add double quotes to variables `dc67d902e` metrics: Enable blogbench test `438fe3b82` gha: ci: Add cri-containerd tests skeleton `bd08d745f` tests: metrics: Move metrics specific function to metrics gha-run.sh `3ffd48bc1` tests: common: Move a few utility functions to common.bash `7f961461b` tests: Add machine learning README `bb2ef4ca3` tests: Add `function` before each function `063f7aa7c` tests: Add Pytorch Dockerfile `1af03b9b3` tests: Add Pytorch performance test `4cecd6237` tests: Add tensorflow Dockerfile `c4094f62c` tests: Add metrics machine learning performance tests `89b622dcb` gha: k8s: tdx: Temporarily disable TDX tests `8c9d08e87` gha: ci: Gather info about the node / pods `283f809dd` runtime-rs: Enhancing Device Manager for network endpoints. 
`a65291ad7` agent: rustjail: update test_mknod_dev `46b81dd7d` agent: clippy: fix cargo clippy warnings `c4771d9e8` agent: Makefile: enable set SECCOMP dynamically `a88212e2c` utils.mk: update BUILD_TYPE argument `883b4db38` dragonball: fix cargo test on aarch64 `6822029c8` runtime-rs: Do not scan network if network model is "none" `ce54e43eb` metrics: Update memory usage script `fbc2a91ab` gha: Cancel previous jobs if a PR is updated `307cfc8f7` tools: Use a consistent target name when building mariner initrd `d780cc08f` gha: nightly: Also use `workflow_dispatch` to trigger it `b99ff3026` gha: nightly: Fix name size limit for AKS `aedc586e1` dragonball: Makefile: add coverage target `310e069f7` checkmetrics: Enable checkmetrics for memory inside test `1363fbbf1` README: Add badge for our Nightly CI `1776b18fa` gha: Do not run all the tests if only docs are updated `28c29b248` bugfix: plus default_memory when calculating mem size `0c1cbd01d` gha: ci: after-push: Use github.sha to get the last commit reference `37a955678` gha: ci: nightly: Use github.sha to get the last commit reference `ed23b47c7` tracing: Add tracing to runtime-rs `96e9374d4` dragonball: Don't fail if a request asks for more CPUs than allowed `38f0aaa51` Revert "gha: k8s: dragonball: Skip k8s-number-cpus" `828a72183` gha: k8s: dragonball: Skip k8s-oom `a79505b66` gha: k8s: dragonball: Skip k8s-number-cpus `275c84e7b` Revert "agent: fix the issue of exec hang with a backgroud process" `2be342023` checkmetrics: Add memory usage inside container value for qemu `6ca34f949` checkmetrics: Add memory inside container value for clh `6c6892423` metrics: Enable memory inside container metrics `0ad298895` gha: ci: Fix refernce passed to checkout@v3 `86904909a` gha: ci: Avoid using env also in the ci-nightly and payload-after-push `f72cb2fc1` agent: Remove shadowed function, add slog-term `1d05b9cc7` gha: ci: Pass down secrets to ci-on-push / ci-nightly `c5b4164cb` gha: ci: Fix tarball-suffix passed to the 
metrics tests `07810bf71` agent: Ignore already mounted dev/fs/pseudo-fs `11e3ccfa4` gha: ci: Avoid using env unless it's really needed `c45f646b9` gha: k8s: Ensure cluster doesn't exist before creating it `1a7bbcd39` gha: ci: Fix typo pull_requesst -> pull_request `ddf4afb96` gha: ci: Fix set-fake-pr-number job `8a0a66655` gha: ci: schedule expects a list, not a map `5c0269dc5` gha: ci: Add pr-number input to the correct job `de83cd9de` gha: ci: Use $VAR instead of ${{ env.VAR }} `6acce83e1` metrics: Fix the call to check_metrics function `e067d1833` gha: Add a nightly CI job `7c0de8703` gha: k8s: Ensure tests are running on a specific namespace `106e30571` gha: Create a re-usable `ci.yaml` file `cc3993d86` gha: Pass event specific info from the caller workflow `4e396e728` metrics: Add function keyword to to helper metrics functions `1ca17c2f7` metrics: storing metrics workflow artifacts `5a61065ab` checkmetrics: Add checkmetrics value for memory usage in qemu `78086ed1f` checkmetrics: Add memory usage value for clh `1c3dbafbf` metrics: Fix function of how to retrieve multiple values `18968f428` metrics: Add function to have uniformity `35d096b60` metrics: Adds blogbench and webtool metrics tests `d8f90e89d` metrics: Rename function at memory usage script `b9d66e0d5` metrics: Fix double quotes variables in memory usage script `476a11194` tests: Enable memory usage metrics tests `b568c7f7d` tests/integration: Provide default value for KATA_HOST_OS `d6e96ea06` tests/integration: Use AzureLinux instead of Mariner `40c46c75e` tests/integration: Perform yq install in run_tests() `d8b8f7e94` metrics: Enable launch tests time metrics `72fd562bd` gha: release: Use a specific release of hub `0502354b4` checkmetrics: Add checkmetrics json for qemu `b481ef188` makefile: Add -buildvcs=false flag to go build `e94aaed3c` ci_worker: Add checkmetrics ci worker for cloud hypervisor `917576e6f` metrics: Add double quotes in all variables `cc8f0a24e` metrics: Add checkmetrics to 
gha-run.sh for metrics CI `477856c1e` gha: dragonball: Correctly propagate PATH update `1c211cd73` gha: Swap asset/release in build matrix `0152c9aba` tools: Introduce `USE_CACHE` environment variable `2b5975689` tests: Build CLH with glibc for Mariner `80c78eadc` tests: Use baked-in kernel with Mariner `532755ce3` tests: Build Mariner rootfs initrd `6a21e20c6` runtime: Add "none" as a shared_fs option `5681caad5` versions: Upgrade to Cloud Hypervisor v33.0 `b2ce8b4d6` metrics: Add memory footprint tests to the CI `d035955ef` doc: Add documentation for the virtualization reference architecture `0f454d0c0` gpu: Fixing typos for PCIe topology changes `6bb2ea819` packaging: Fix indentation of build.sh script at ovmf `0504bd725` agent: convert the `sl` macros to functions `0860fbd41` agent: convert the `ttrpc_error` macro to a function `0e5d6ce6d` agent: convert the `is_allowed` macro to a function `f680fc52b` agent: change `AGENT_CONFIG`'s lazy type to just `AgentConfig` `beb706368` metrics: Uniformity across function names `1f3e837e4` runtime-rs: fix build error on AArch64 `6fd25968c` runtime-rs: bugfix for direct volume path's validation. `415578cf3` docs: Add general README `bff4672f7` runtime-rs: support physical endpoint using device manager `32cba7e44` metrics: Fix retrieving hypervisor version on metrics `aa7946de4` checkmetrics: Add general checkmetrics documentation `2fac2b72f` checkmetrics: Add checkmetrics makefile `e45899ae0` docs: Add time tests documentation reference `28130d3ce` docs: Add boot time metrics documentation `0df2fc270` runtime-rs: add support spdk/vhost-user based volume. 
`17198089e` vendor: Add vendor checkmetrics dependencies `f1dfea6e8` docs: Add metrics documentation reference `8330fb8ee` gpu: Update unit tests `859359424` metrics: enable launch-times test on gha-run metrics script `c4ee601bf` metrics: Add checkmetrics for kata metrics CI `e0d6475b4` gha: Don't automatically trigger CI `b535c7cbd` tests: Enable running k8s tests on Mariner `71071bdb6` docs: Add general metrics documentation `610f7986e` check: Relax the unrestricted_guest check when running in a VM `1b406b9d0` kata-ctl:Implement functionality to check host is capable of running VM `adf88eaa8` static-build: Remove kata-version parameter `09720babc` docs: fix spelling of "crate" `7185afc50` gha: Fix gha actions `21294b868` packaging: Fix indentation in init.sh script `fad3ac9f5` metrics: install kata and launch-times test `4bbfcfaf1` tests: Move tests helper script to this repo `f152f0e8c` metrics: Add launch-times to metrics tests `59510cfee` runtime-rs: add support vfio device based volume `1e3b372bb` runtime-rs: add support vfio device manager `6b0848930` gha: Fix format for run launchtimes metrics yaml `3cefa43e7` tests: Add json script for metrics tests `6a3710055` initramfs: Build dependencies as part of the Dockerfile `aa2380fdd` packaging: Add infra to push the initramfs builder image `1c7fcc6cb` packaging: Use existing image to build the initramfs `a43ea24df` virtiofsd: Convert legacy `-o` sub-options to their `--` replacement `8e00dc694` virtiofsd: Drop `-o no_posix_lock` `2a15ad978` virtiofsd: Stop using deprecated `-f` option `c3043a6c6` tests: Add tests lib common script `b16e0de73` gha: Add base branch on SHA on pull requst `72f2cb84e` gpu: Reset cold or hot plug after overriding `fbacc0964` gpu: PCIe topology, consider vhost-user-block in Virt `bc152b114` gha: ci-on-push: Run metrics tests `dad731d5c` docs: Update Developer Guide `b11246c3a` gpu: Various fixes for virt machine type `40101ea7d` vfio: Added annotation for hot(cold) plug `8f0d4e261` 
vfio: Cleanup of Cold and Hot Plug `b5c4677e0` vfio: Rearrange the bus assignemnt `b1aa8c8a2` gpu: Moved the PCIe configs to drivers `55a66eb7f` gpu: Add config to TOML `da42801c3` gpu: Add config settings tests for hot-plug `de39fb7d3` runtime: Add support for GPUDirect and GPUDirect RDMA PCIe topology `9318e022a` gpu: Add CC relates configs `b7932be4b` gpu: Add Arm64 Kernel Settings `211b0ab26` gpu: Update Kernel Config `5f103003d` gpu: Update kernel building to the latest changes `35e4938e8` tools: Fix no-op builds `347385b4e` runtime-rs: Enhance flexibility of virtio-fs config `21d227853` versions: Update firecracker version to 1.3.3 `0e2379909` gha: Fix `stage` definition in matrix `ae2cfa826` doc: add vcpu handlint doc for runtime-rs `7b1e67819` fix(clippy): fix clippy error `67972ec48` feat(runtime-rs): calculate initial size `aaa96c749` feat(runtime-rs): modify onlineCpuMemRequest `d66f7572d` feat(runtime-rs): clear cpuset in runtime side `a0385e138` feat(runtime-rs): update linux resource when stop_process `a39e1e6cd` feat(runtime-rs): merge the update_cgroups in update_linux_resources `fa6dff9f7` feat(runtime-rs): support vcpu resizing on runtime side `8cb4238b4` packaging: Remove snap package `213773998` runtime-rs: update Cargo.lock `56d2ea9b7` kata-ctl: Refactor kernel module check `9f7a45996` gha: Add `rootfs-initrd-mariner` build target `f28a62164` gha: Add `cloud-hypervisor-glibc` build target `8fb7ab751` dragonball: introduce virtio-balloon device `7ed949497` dragonball: introduce virtio-mem device `776a15e09` runtime-rs: add support direct volume. 
`a8e0f51c5` dragonball: extend DeviceOpContext `abae11404` runtime-rs: refactor device manager implementation `210a15794` dragonball: avoid obtaining lock twice in create_stdio_console `69668ce87` tests: gha-run: Use correct env variable for repo `f487199ed` gha: aks: Fix argument in call to gha-run.sh `f6afae9c7` packaging: Add rootfs-image-tdx-tarball target `f62b2670c` config: Add root hash value and measure config to kernel params `008058807` kernel: Integrate initramfs into Guest kernel `28b264562` initramfs: Add build script to generate initramfs `5cb02a806` image-build: generate root hash as an separate partition for rootfs `31c0ad207` packaging: Add cryptsetup support in Guest kernel and rootfs `980d084f4` log-parser: Update log parser link at README `410bc1814` agent-ctl: fix the compile error `77519fd12` kata-ctl: Switch to slog logging; add --log-level, --json-logging args `aab603096` gha: aks: Extract `run` commands to a script `e4eb664d2` runtime-rs: update rust to 1.69.0 `ed37715e0` runtime-rs: handle copy files when share_fs is not available `5f6fc3ed7` runtime-rs: bugfix: update Cargo.lock `1c6d22c80` gha: aks: Use short SHA in cluster name `3c1f6d36d` readme: Update Kata Containers logo `388684113` readme: Add status badge for the "Publish Artefacts" job `26f752038` kata-deploy: Change how we get the Ubuntu k8s key `aebd3b47d` gha: aks: Ensure host_os is used everywhere needed `0c8282c22` gha: aks: Add the host_os as part of the aks cluster's name `4b89a6bda` release: Standardize kata static file name `9228815ad` kernel: Modify build-kernel.sh to accomodate for changes in version.yaml `03027a739` gha: Fix Mariner cluster creation `43e73bdef` packaging: make BUILDER_REGISTRY configurable `ffe3157a4` dragonball: add arm64 patches for upcall `560442e6e` dragonball: add vcpu_boot_onlined vector `e31772cfe` dragonball: add support resize_vcpu on aarch64 `64c764c14` dragonball: update dbs-boot to v0.4.0 `fd9b41464` dragonball: update comment for 
init_microvm `af16d3fca` gha: Unbreak CI and fix cluster creation step `5ddc4f94c` runtime-rs/kata-ctl: Enhancement of DirectVolumeMount. `25d2fb0fd` agent: fix the issue of exec hang with a backgroud process `4af4ced1a` gha: Create Mariner host as part of k8s tests `eee7aae71` runtime-rs/sandbox_bindmounts: add support for sandbox bindmounts `557b84081` gha: aks: Wait longer to start running the tests `c04c872c4` gha: aks: Increase the timeout time `428041624` kata-deploy: Improve shim backup / restore `14c3f1e9f` kata-deploy: Fix indentation on kata deploy merge script `0e47cfc4c` runtime: sending SIGKILL to qemu `6a0035e41` doc: Update git commands `433b5add4` kubernetes: add agnhost command in pod yaml `c477ac551` dragonball: Convert VirtioNetDeviceMgr function to method `4659facb7` dragonball: Convert BlockDeviceMgr function to method `ee6deef09` dragonball: Remove virtio-net and vsock devices gracefully `2bda92fac` netlink: Fix the issue of update_interface Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-31 09:02:07 +02:00
Jiang Liu	b3901c46d6	runtime-rs: ignore errors during clean up sandbox resources Ignore errors during clean up sandbox resources as much as we can. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-07-31 13:07:43 +08:00
Chelsea Mafrica	8a2c201719	docs: Update links for pods and kubelet The links for pods and kubelets no longer work so update to new links with relevant info. Fixes #7487 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-07-29 00:38:35 +00:00
Gabriela Cervantes	5a1b5d3672	metrics: Add sysbench pod yaml This PR adds the sysbench pod yaml for the sysbench performance test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-28 20:03:15 +00:00
Gabriela Cervantes	ad413d1646	metrics: Add sysbench dockerfile This PR adds sysbench dockerfile. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-28 19:58:10 +00:00
Gabriela Cervantes	1512560111	metrics: Add sysbench performance test This PR adds the sysbench performance test for kata CI. Fixes #7485 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-28 19:54:12 +00:00
Gabriela Cervantes	bee1a628bd	metrics: Fix json result for tensorflow This PR fixes the json result for tensorflow. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-28 17:02:16 +00:00
Jiang Liu	62e328ca5c	runtime-rs: refine implementation of TaskService Refine implementation of TaskService, making handler_message() a method. Fixes: #7479 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-07-29 00:47:33 +08:00
Jiang Liu	458e1bc712	runtime-rs: make send_message() a method of ServiceManager Simplify implementation by making send_message() a method of ServiceManager. Fixes: #7479 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-07-29 00:47:31 +08:00
Jiang Liu	1cc1c81c9a	runtime-rs: fix possible bug in ServiceManager::run() Multiple instances of task service may get registered by ServiceManager::run(); fix it by making the operation symmetric. Fixes: #7479 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-07-29 00:47:30 +08:00
Jiang Liu	1a5f90dc3f	runtime-rs: simplify implementation of service crate Simplify implementation of service crate. Fixes: #7479 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-07-29 00:47:28 +08:00
Gabriela Cervantes	51cd99c927	metrics: Round AlexNet and resnet results This PR rounds the AlexNet and resnet results so that the result can be extracted properly. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-28 16:15:22 +00:00
Gabriela Cervantes	3b883bf5a7	metrics: Fix atoi invalid syntax This PR avoids the strconv.atoi parsing error when retrieving the results from the json. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-28 16:15:22 +00:00
Gabriela Cervantes	f9dec11a8f	checkmetrics: Move checkmetrics to gha-run script This PR moves checkmetrics to the gha-run script to gather tensorflow information. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-28 16:15:22 +00:00
Gabriela Cervantes	53af71cfd0	checkmetrics: Add AlexNet value for qemu This PR adds AlexNet value for qemu for checkmetrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-28 16:15:22 +00:00
Gabriela Cervantes	a435d36fe1	checkmetrics: Add Resnet value for qemu This PR adds the Resnet value for qemu for checkmetrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-28 16:15:22 +00:00
Gabriela Cervantes	a79a3a8e1d	checkmetrics: Add alexnet value for clh This PR adds the AlexNet value for clh for checkmetrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-28 16:15:22 +00:00
Gabriela Cervantes	3c32875046	checkmetrics: Add Resnet value for clh This PR adds the checkmetrics Resnet value for clh. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-28 16:15:22 +00:00
Gabriela Cervantes	08dfaa97aa	metrics: General improvements to the tensorflow script This PR adds general improvements to the tensorflow script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-28 16:15:22 +00:00
Gabriela Cervantes	63b8534b41	metrics: Enable Tensorflow metrics for kata CI This PR enables the Tensorflow benchmark metrics for kata CI. Fixes #7395 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-28 16:15:22 +00:00
Aurélien	e8f8641988	Merge pull request #7132 from sprt/aks-volume-tests tests: Add `k8s-volume` and `k8s-file-volume` tests to GHA CI	2023-07-28 08:58:03 -07:00
Fabiano Fidêncio	68b9acfd02	Merge pull request #7474 from GabyCT/topic/upboo metrics: Update boot time for kata metrics	2023-07-28 17:55:43 +02:00
David Esparza	f89abcbad8	Merge pull request #7473 from GabyCT/topic/addfioreport metrics: Add FIO report files for kata metrics	2023-07-28 09:37:21 -06:00
Fabiano Fidêncio	c9742d6fa9	Merge pull request #7411 from fidencio/topic/kata-deploy-create-runtime-classes kata-deploy: Allow runtimeclasses to be created by the daemonset	2023-07-28 16:05:49 +02:00
Yuan-Zhuo	731e7c763f	kata-ctl: add monitor subcommand for runtime-rs The previous kata-monitor in golang could not communicate with runtime-rs to gather metrics due to different sandbox addresses. This PR adds the subcommand monitor in kata-ctl to gather metrics from runtime-rs and monitor itself. Fixes: #5017 Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>	2023-07-28 17:30:08 +08:00
Yuan-Zhuo	d74639d8c6	kata-ctl: provide the global TIMEOUT for creating MgmtClient Several functions in kata-ctl need to establish a connection with runtime-rs through MgmtClient. This PR provides a global TIMEOUT to avoid multiple definitions. Fixes: #5017 Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>	2023-07-28 17:23:37 +08:00
Yuan-Zhuo	02cc4fe9db	runtime-rs: add support for gather metrics in runtime-rs 1. Implemented metrics collection for runtime-rs shim and dragonball hypervisor. 2. Described the current supported metrics in runtime-rs.(docs/design/kata-metrics-in-runtime-rs.md) Fixes: #5017 Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>	2023-07-28 17:16:51 +08:00
Fabiano Fidêncio	8353aae41a	ci: k8s: Rework get_nodes_and_pods_info() The amount of info we've added seemed unnecessary, and ends up making our lives even harder when trying to find errors. Let's just rely on the kata-debug container to collect the needed info for us. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-28 10:04:33 +02:00
Fabiano Fidêncio	6ad5d7112e	ci: k8s: Do not gather node info before running the tests It's been proven to not be useful, and ends up making things more confusing due to the amount of logs printed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-28 10:04:33 +02:00
Fabiano Fidêncio	5261e3a60c	ci: k8s: Group messages to improve readability Right now it is getting way too easy to get lost in the logs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-28 10:04:33 +02:00
Fabiano Fidêncio	9cc6b5f461	ci: k8s: Get logs from kata-deploy Let's make sure we can debug kata-deploy in case something goes wrong during its execution. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-28 10:04:33 +02:00
Fabiano Fidêncio	9d285c6226	ci: k8s: Let kata-deploy take care of the runtimeclasses By doing this we can test the change done for the daemonset. :-) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-28 10:04:33 +02:00
Fabiano Fidêncio	87568ed985	gha: Test split out runtimeclasses are in sync with all-in-one file This is needed in order to not lose track of what's been created and what's been added here and there. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-28 10:04:33 +02:00
Fabiano Fidêncio	39192c6084	kata-deploy: Print variables passed to the script This will help folks to debug / understand what's been passed to the kata-deploy.sh script. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-28 10:04:33 +02:00
Fabiano Fidêncio	0e157be6f2	kata-deploy: Allow runtimeclasses to be created by the daemonset Let's allow the daemonset to create the runtimeclasses, which will decrease one manual step a user of kata-deploy should take, and also help us in the Confidential Containers land as the Operator can just delegate it to this script. Fixes: #7409 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-28 10:04:33 +02:00
Fabiano Fidêncio	a274333248	kata-deploy: Change default values of DEBUG This can be easily done as there was no official release with the previous values. The reason we're doing so is because when using `yq` to replace the value, even when forcing `--tag '!!str' "yes"`, the content is placed without quotes, causing errors in our CI. While here, we're also removing the fallback value for DEBUG, as it is always set in the kata-deploy.yaml file. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-28 09:50:39 +02:00
Fabiano Fidêncio	69535b8089	kata-deploy: runtimeclass: Split out entries This will make things simpler to only create the handlers defined by the kata-deploy user. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-28 09:43:45 +02:00
Fabiano Fidêncio	9e1710674a	kata-runtimeClasses: Alphabetically sort the entries This will become handy in the near future, as we want to have separate entries for each file, while still keeping this one. Having the entries sorted will make it easier to test that those are always in sync. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-28 09:43:45 +02:00
Zhongtao Hu	61a8eabf8e	Merge pull request #7139 from openanolis/fix/devmanager runtime-rs: change block index to 0	2023-07-28 14:04:19 +08:00
Aurélien Bombo	6222bd9103	tests: Add k8s-file-volume test This imports the k8s-file-volume test from the tests repo and modifies it slightly to set up the host volume on the AKS host. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-07-27 14:07:55 -07:00
Aurélien Bombo	187a72d381	tests: Add k8s-volume test This imports the k8s-volume test from the tests repo and modifies it slightly to set up the host volume on the AKS host. Fixes: #6566 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-07-27 14:06:43 -07:00
Gabriela Cervantes	0c84270357	metrics: Add boot time value for qemu This PR adds the boot time value and limit for qemu. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-27 20:06:24 +00:00
Gabriela Cervantes	6520dfee37	metrics: Update boot time for kata metrics This PR updates the boot time limit for kata metrics. Fixes #7475 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-27 19:14:19 +00:00
Gabriela Cervantes	ff22790617	metrics: Update runtime and configuration paths This PR updates the runtime and configuration paths for kata containers. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-27 17:14:03 +00:00
Gabriela Cervantes	a5d4e33880	metrics: Add compare virtiofsd dax script This PR adds the compare virtiofsd dax script for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-27 16:53:50 +00:00
Gabriela Cervantes	5e937fa622	metrics: Update general FIO tests This PR updates general FIO tests by adding the recent date of a change. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-27 16:47:17 +00:00
Gabriela Cervantes	b0bea47c53	metrics: Add makefile to report generator This PR adds the makefile to report generator for the FIO test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-27 16:42:11 +00:00
Gabriela Cervantes	73c57b9a19	metrics: Add FIO report files for kata metrics This PR adds FIO report files for kata metrics. Fixes #7472 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-27 16:39:35 +00:00
Chelsea Mafrica	e941b3a094	Merge pull request #7456 from alakesh/agent-fix-typo agent: fix typo in constant	2023-07-27 09:31:24 -07:00
David Esparza	ba8a8fcbf2	Merge pull request #7442 from GabyCT/topic/addgofilesfio metrics: Add FIO benchmark for metrics tests	2023-07-27 10:20:43 -06:00
Zhongtao Hu	c8fcd29d9b	runtime-rs: use device manager to handle virtio-pmem Use device manager to handle the virtio-pmem device. Fixes: #7119 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-07-27 20:18:49 +08:00
Zhongtao Hu	901c192251	runtime-rs: support configure vm_rootfs_driver Support configuring vm_rootfs_driver in the toml config. Fixes: #7119 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-07-27 20:12:53 +08:00
Zhongtao Hu	5d6199f9bc	runtime-rs: use device manager to handle vm rootfs Use device manager to handle the vm rootfs. After attaching the block device of the vm rootfs, we need to increase the index number. Fixes: #7119 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-07-27 20:12:45 +08:00
James O. D. Hunt	20f1f62a2a	runtime-rs: change block index to 0 Change block index in SharedInfo to 0 for vda. Fixes #7119 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-07-27 20:11:44 +08:00
Chao Wu	ede1dae65d	Merge pull request #7465 from fidencio/topic/fix-dragonball-static-check-runner-selector gha: dragonball: Run only on the dragonball labeled machine	2023-07-27 10:19:26 +08:00
Gabriela Cervantes	662f87539e	metrics: Add general FIO makefile This PR adds a general FIO makefile for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-26 20:46:02 +00:00
Fabiano Fidêncio	f28af98ac6	Merge pull request #7453 from sprt/fix-ci-node-debugger tests: Fix `k8s-job` test	2023-07-26 22:27:21 +02:00
Fabiano Fidêncio	8a22b5f075	Merge pull request #7439 from ManaSugi/fix/remove-unused-mut agent,libs: Remove unused 'mut' keywords	2023-07-26 21:25:41 +02:00
Fabiano Fidêncio	9792ac49fe	Merge pull request #7425 from jongwu/remove_mut runtime-rs: remove unneeded 'mut' keywords	2023-07-26 21:24:40 +02:00
Fabiano Fidêncio	24564a8499	Merge pull request #7455 from sprt/local-tests tests: QoL improvements for running tests locally	2023-07-26 21:23:43 +02:00
Aurélien Bombo	c5a87eed29	tests: gha: Add timeout to cluster creation This has been intermittently taking a while lately so let's add a timeout. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-07-26 10:19:07 -07:00
Aurélien Bombo	6daeb08e69	tests: k8s: Clean up node debuggers after running This deletes node debugger pods after execution since their presence may affect tests that assume only test workloads pods are present. For example, in `k8s-job` we wait for any pod to be in the `Succeeded` state before proceeding, which causes failures. Fixes: #7452 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-07-26 10:19:07 -07:00
Fabiano Fidêncio	3aa6c77a01	gha: dragonball: Run only on the dragonball labeled machine Static checks for dragonball are landing on any of the self-hosted runners, and the reason for that is because "self-hosted" was the label selector used. Let's use "dragonball" instead, as the machine has that label as well. Fixes: #7464 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-26 18:15:04 +02:00
Gabriela Cervantes	37641a5430	metrics: Add example config for fio jobs This PR adds example config for fio jobs. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-26 16:03:12 +00:00
Alakesh Haloi	314aec73d4	agent: fix typo in constant It fixes a constant name to have the right spelling Fixes: #7457 Signed-off-by: Alakesh Haloi <a_haloi@apple.com>	2023-07-26 00:06:34 -05:00
Aurélien Bombo	4703434b12	tests: k8s: Allow using custom resource group This simply allows setting a custom resource group when debugging locally, so as to prevent name collisions and not pollute the namespace. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-07-25 15:45:44 -07:00
Aurélien Bombo	350f3f70b7	tests: Import `common.bash` in `run_kubernetes_tests.sh` Not sure why this works in GHA, but the `info` call on line 65 would fail locally. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-07-25 15:45:44 -07:00
Aurélien Bombo	d7f04a64a0	tests: k8s: Leave `runtimeclass_workloads/` alone Makes it so that `setup.sh` doesn't make changes in `runtimeclass_workloads/` directly. Instead we treat that as a template directory and we use the new directory `runtimeclass_workloads_work/` as a work dir. This has two advantages: * Allows rerunning tests without the assumption that `setup.sh` must be idempotent. E.g. the `set_runtime_class()` step would break. * Doesn't pollute your git environment with a bunch of changes when developing. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-07-25 15:45:44 -07:00
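The template/work directory split described in the commit above can be sketched as follows. This is a minimal illustration, not the code from `setup.sh`: the helper name `prepare_work_dir` is hypothetical, and the directory paths are supplied by the caller.

```sh
# Hypothetical helper: recreate the work directory from the template on every
# run, so later steps can edit the copy freely and reruns start from a clean state.
prepare_work_dir() {
    local template_dir="$1"
    local work_dir="$2"

    # Drop any leftovers from a previous run, then copy the pristine template.
    rm -rf "${work_dir}"
    cp -R "${template_dir}" "${work_dir}"
}

# Example call (paths are illustrative):
# prepare_work_dir runtimeclass_workloads runtimeclass_workloads_work
```

Because the template is never modified, the steps that rewrite the copied files do not need to be idempotent, and the git working tree stays clean.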
Aurélien Bombo	bdde6aa948	tests: k8s: Split deployment and testing commands This splits deploying Kata and running the tests into separate commands to make it possible to rerun tests locally without having to redeploy Kata each time. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-07-25 15:44:46 -07:00
Aurélien Bombo	91a0b3b406	tests: aks: Simply delete cluster when cleaning up If we're going to delete the cluster anyway, no need to call kata-cleanup. Fixes: #7454 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-07-25 15:44:46 -07:00
Gabriela Cervantes	3c1044d9d5	metrics: Update FIO paths for k8s runner This PR updates the FIO paths for k8s runner. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-25 20:50:03 +00:00
Eric Ernst	5385ddc560	Merge pull request #7365 from alakesh/symlink-fix agent: exclude symlinks from recursive ownership change	2023-07-25 11:27:48 -07:00
Gabriela Cervantes	6177a0db3e	metrics: Add env files for FIO This PR adds the env files for FIO for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-25 17:48:45 +00:00
Gabriela Cervantes	a45900324d	metrics: Add fio exec This PR adds fio exec for the FIO benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-25 17:36:08 +00:00
Gabriela Cervantes	ea198fddcc	metrics: Add FIO runner k8s Add program to execute FIO workloads using k8s. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-25 17:34:29 +00:00
Gabriela Cervantes	8f7ef41c14	metrics: Add FIO vendor code This PR adds the FIO vendor code. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-25 17:24:29 +00:00
Gabriela Cervantes	6293c17bde	metrics: Add FIO benchmark for metrics tests This PR adds the FIO benchmark scripts and resources for the metrics tests section. Fixes #7441 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-25 16:36:33 +00:00
Fabiano Fidêncio	cdf04e5018	Merge pull request #7437 from jepio/fix-sev-kernel-cache cache: kernel: Fix kernel caching	2023-07-25 18:10:03 +02:00
GabyCT	7a3b55ce67	Merge pull request #7432 from ManaSugi/runk/doc-docker runk: Add Docker guide to README	2023-07-25 09:56:02 -06:00
GabyCT	c1bd527163	Merge pull request #7430 from GabyCT/topic/fixjson metrics: General improvements to json.bash script	2023-07-25 09:45:53 -06:00
Fabiano Fidêncio	6efd684a46	Merge pull request #7408 from fidencio/topic/kata-deploy-add-SHIMS-and-SHIM_DEFAULT-as-env kata-deploy: Allow shim creation based on what's passed to the daemonset	2023-07-25 16:56:46 +02:00
Fabiano Fidêncio	5b82268d2c	Merge pull request #7436 from jepio/vfio-gha gha: ci: Add skeleton of vfio job	2023-07-25 14:44:04 +02:00
Manabu Sugimoto	ff4cfcd8a2	runk: Add Docker guide to README `runk` can launch containers using Docker, so add the guide to its README. ```sh $ sudo dockerd --experimental --add-runtime="runk=/usr/local/bin/runk" $ sudo docker run -it --rm --runtime runk busybox echo hello runk hello runk ``` Fixes: #7431 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-07-25 20:10:49 +09:00
Jeremi Piotrowski	c8ac56569a	cache: kernel: Harmonize commit with fetching side kata-deploy-binaries.sh uses the last commit in tools/packaging/static-build/kernel for its version check, while the cache generation uses tools/packaging/kernel. Use tools/packaging/static-build/kernel as $kata_config_version is already part of the version string and covers any changes to tools/packaging/kernel. Fixes: #7403 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-07-25 12:23:05 +02:00
Jeremi Piotrowski	81775ab1b3	cache: kernel: Fix SEV kernel caching The SEV kernel cache calls create_cache_asset() twice, once for the kernel and once for modules. Both calls need to use the same version string, otherwise the second call overwrites the "latest" file of the first one and the cache is not used. Fixes: #7403 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-07-25 11:58:19 +02:00
Jeremi Piotrowski	717f775f30	gha: ci: Add skeleton of vfio job This job will run on a nested virt capable Azure VM (improving test concurrency). This is just a placeholder while we adapt the test to GHA. Fixes: #6555 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-07-25 11:13:04 +02:00
Manabu Sugimoto	b9f100b391	agent,libs: Remove unused 'mut' keywords Remove unused `mut` because the agent compilation fails when the rust compiler is >= 1.71. This is related to #7425 Fixes: #7438 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-07-25 17:41:08 +09:00
Fabiano Fidêncio	a56f96bb2b	kata-deploy: Allow shim creation based on what's passed to the daemonset Instead of hardcoding shims as part of the script, let's ensure we can allow them to be created based on environment variables passed to the daemonset. This change brings no functionality change as the default values in the daemonset are exactly what has been used as part of the scripts. Fixes: #7407 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-25 08:30:00 +02:00
Fabiano Fidêncio	5ce0b4743f	Merge pull request #7382 from zvonkok/vfio-ap-debug s390x: Fixing device.Bus assignment	2023-07-25 08:26:25 +02:00
David Esparza	b11d618a3f	Merge pull request #7413 from fidencio/topic/release-publish-builder-images release: Mention the container images used to build the project	2023-07-24 15:46:31 -06:00
Fabiano Fidêncio	56fdeb1247	Merge pull request #7417 from fidencio/topic/kata-deploy-binaries-cached-kernel-fix kata-deploy-binaries: kernel_cache: Take module_dir into account	2023-07-24 22:26:09 +02:00
Gabriela Cervantes	4a5ab38f16	metrics: General improvements to json.bash script This PR adds general improvements, such as putting `function` before each function name and declaring variables consistently, to have uniformity across the metrics scripts. Fixes #7429 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-24 16:51:38 +00:00
Fabiano Fidêncio	d4eba36980	kata-deploy-binaries: kernel_cache: Take module_dir into account `module_dir` has been passed to the function but was never assigned to a var, leading to errors when trying to use it. Fixes: #7416 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-24 18:19:13 +02:00
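The class of bug fixed by the commit above (an argument passed to a shell function but never bound to a variable) can be illustrated with a small sketch. The function name, argument order, and messages below are hypothetical and are not taken from `kata-deploy-binaries.sh`.

```sh
# Hypothetical function: positional arguments only become usable by name
# once they are assigned to variables inside the function body.
describe_kernel_cache() {
    local kernel_name="${1}"
    # Without this assignment, ${module_dir} would be unset in the body,
    # even though the caller passed a second argument.
    local module_dir="${2:-}"

    if [ -n "${module_dir}" ]; then
        echo "${kernel_name}: modules in ${module_dir}"
    else
        echo "${kernel_name}: no modules"
    fi
}

describe_kernel_cache kernel-sev /tmp/modules
describe_kernel_cache kernel
```

Running scripts with `set -u` makes a reference to such an unassigned variable fail immediately instead of silently expanding to an empty string.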
Fabiano Fidêncio	b7c9867d60	release: Mention the container images used to build the project This is a small step towards build reproducibility. Fixes: #7412 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-24 18:01:57 +02:00
Wainer Moschetta	2e9853c761	Merge pull request #7427 from fidencio/topic/gha-port-nydus-tests-follow-up-1 ci: nydus: Fix typo in "source"	2023-07-24 11:20:05 -03:00
Fabiano Fidêncio	7c4b597816	ci: nydus: Fix typo in "source" We should source from `nydus_dir`, instead of `cri_containerd_dir`, and that was a leftover from `fb4f7a002c`. Fixes: #6543 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-24 14:55:09 +02:00
Fabiano Fidêncio	589672d510	Merge pull request #7426 from fidencio/topic/gha-port-nydus-tests gha: ci: Add no-op nydus tests to our CI	2023-07-24 13:56:57 +02:00
Fabiano Fidêncio	6a680e241b	gha: ci: Add placeholder for the nydus tests as part of the CI This will trigger the nydus tests, but as they currently are they'll just return "okay" without actually executing. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-24 13:37:36 +02:00
Fabiano Fidêncio	fb4f7a002c	gha: nydus: Add a no-op GHA for nydus This newly added GHA does nothing, is not even triggered, and it's just a placeholder that we'll grow in the next commits / PRs, so we can actually start running the nydus tests as part of our CI. Fixes: #6543 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-24 13:37:33 +02:00
Fupan Li	0ae987973b	Merge pull request #7367 from openanolis/chao/migrate_dragonball_sandbox Dragonball: migrate dragonball-sandbox crates to Kata	2023-07-24 17:52:11 +08:00
Fabiano Fidêncio	4a207a16f9	gha: nydus: Bring tests as they are from the tests repo Let's bring the nydus tests, without any kind of modification, from the tests repo. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-24 10:56:41 +02:00
Jianyong Wu	2c8f83424d	runtime-rs: remove unneeded 'mut' keywords These unneeded 'mut' keywords block builds with rust 1.71.0. Remove them. Fixes: #7424 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-07-24 08:47:15 +00:00
Zvonko Kaiser	1fc715bc65	s390x: Add AP Attach/Detach test Now that we have proper AP device support, add a unit test for testing the correct Attach/Detach of AP devices. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-07-23 13:44:19 +00:00
Fabiano Fidêncio	e1a4040a6c	Merge pull request #7326 from fidencio/topic/gha-ci-add-cri-containerd-tests ci: gha: Add cri-containerd tests (but still do not enable them)	2023-07-21 19:29:38 +02:00
Fabiano Fidêncio	6a59e227b6	Merge pull request #7399 from fidencio/topic/add-kata-debug packaging/tools: Add kata-debug and use it as part of our CI	2023-07-21 17:05:27 +02:00
Fabiano Fidêncio	e91f5edba0	ci: cri-containerd: Fix default typo for testContainerStart() It must be {1:-0}, instead of {1-0}. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
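Assuming the commit above refers to shell parameter expansion with a default value, the difference between the two forms can be shown with a short sketch. The helper names are hypothetical and do not come from the test script.

```sh
# ${1-0} falls back to 0 only when the first argument is unset.
default_if_unset() {
    echo "${1-0}"
}

# ${1:-0} falls back to 0 when the first argument is unset OR empty.
default_if_unset_or_empty() {
    echo "${1:-0}"
}

default_if_unset ""            # prints an empty line
default_if_unset_or_empty ""   # prints 0
default_if_unset               # prints 0
default_if_unset_or_empty      # prints 0
```

The two forms only differ when an empty string is passed explicitly, which is easy to hit when a caller forwards an empty variable.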
Fabiano Fidêncio	8b8aef09af	ci: cri-containerd: Temporarily disable TestContainerSwap The test is currently failing with GHA, and I don't think it makes sense to block all the other tests to get merged while it's happening. For now, let's disable it and re-enable it as soon as we have it passing. Reference: https://github.com/kata-containers/kata-containers/issues/7410 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	56767001cb	ci: cri-containerd: Add namespace / uid to the pods Otherwise crictl will fail to remove them with: ``` getting sandbox status of pod "$pod": metadata.Name, metadata.Namespace or metadata.Uid is not in metadata "..." ``` A huge shout out to Steven Horsman for helping to debug this one. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	a84773652c	ci: cri-containerd: Always use sudo to call crictl Otherwise we may get the following error: ``` time="2023-07-15T21:12:13Z" level=fatal msg="validate service connection: validate CRI v1 runtime API for endpoint \"unix:///run/containerd/containerd.sock\": rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: permission denied\"" ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	99ba86a1b2	ci: cri-containerd: Add /usr/local/go/bin to the PATH Otherwise go is not picked up. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	7f3b309997	ci: cri-containerd: Add `function` before each function We've been doing this for all files moved to this repo. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	fde22d6bce	ci: cri-containerd: Assume podman is always used For this set of tests, we'll always be using podman in order to avoid having containerd pulled in by docker. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	9465a04963	ci: cri-containerd: Adapt "source ..." to this repo Let's adapt what we "source" to the kata-containers repo. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	df8d144119	ci: cri-containerd: Remove CI variable We always want to run the tests using as much debug as possible. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	f90570aef0	ci: cri-containerd: Remove unused runc_runtime_bin The variable is not used anywhere in our tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	c3637039f4	ci: cri-containerd: Remove KILL_VMM_TEST env var We don't need the env var, we just need to restrict the test according to the KATA_HYPERVISOR used, as right now it's very specific to QEMU. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	bc4919f9b2	ci: cri-containerd: Always run shim-v2 tests We only have shim-v2 as the runtime type, so we always need to run tests using it. :-) We had to adjust the script in order to properly run the tests with the current logic. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	f9e332c6db	ci: cri-containerd: Stop cloning containerd It's already done as part of the install_dependencies() Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	cfd662fee9	ci: cri-containerd: Remove unused SNAP_CI var We don't support SNAP anymore, thus we can remove the var. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	d36c3395c0	ci: cri-containerd: Update copyright As we're touching the file already, let's update its Copyright info. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	b5be8a4a8f	ci: cri-containerd: Move integration-tests.sh as it was Let's move the `integration/containerd/cri/integration-tests.sh` file from the tests repo to this one. The file has been moved as it is, it's not used, and in the following commits we'll clean it up before actually using it. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	f2e00c95c0	ci: cri-containerd: Populate install_dependencies() Let's install all the dependencies needed for running the `cri-containerd` tests. The list of dependencies we have are: * From the system - build-essential - jq - podman-docker * From our own repo - yq - go * From GitHub projects - containerd - cri-tools Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	8979552527	versions: Add "latest" field for cri-tools As we don't want to disrupt what we have on the `tests` repo, let's create a "latest" entry and use that for the GitHub actions tests. Once we deprecate the `tests` repo we can decide whether we want to stick to using "latest" or switch back to "version". Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	1bbcbafa67	ci: Add clone_cri_container() This function will simply clone containerd repo, specifically on a tag we want to use to test. This can be expanded for different projects, and it will be the case as soon as we grow the tests. But, for now, let's keep it simple. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	f66c68a2bf	ci: Add install_cri_tools() This function will install cri-tools in the host, and soon enough (as part of this PR) we'll be using it to install cri-tools as part of the cri-containerd tests. I've decided to have this as part of the `common.bash` as other tests that will be added in the future will require cri-tools to be installed as well. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	4dd828414f	ci: Add install_cri_containerd() This function will install cri-containerd in the host, and soon enough (as part of this PR) we'll be using it to install cri-containerd as part of the cri-containerd tests. I've decided to have this as part of the `common.bash` as other tests that will be added in the future will require cri-containerd to be installed as well. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	ad47d1b9f8	ci: Add download_github_project_tarball() This function will help us to get the tarball, from a GitHub project, that we're going to use as part of our tests. Right now this is not used anywhere, but it'll soon enough (as part of this series) be used to download the cri-containerd / cri-tools / cni tarballs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	788c562a95	ci: Add get_latest_patch_release_from_a_github_project() This function will help us to get the latest patch release from a GitHub project. The idea behind this function is that we don't have to keep updating versions.yaml that frequently (or worse, have it outdated as it currently is), and always test against the latest patch release of a given project's version that we care about. Although right now this is not used anywhere, this will be used with the coming cri-containerd tests, which will be part of this series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	6742f3a898	ci: Use `function` before each install_go.sh function We've been doing this for all files moved to this repo. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	5eacecffc3	ci: Adjust paths for install_go.sh Let's adjust paths for what we source and the scripts we call, after moving from the tests repo to this one. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	8ed1595f96	ci: Update copyright for install_go.sh As we're touching the file already, let's update its Copyright info. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	6123d0db2c	ci: Move install_go.sh as it was Let's move `.ci/install_go.sh` file from the tests repo to this one. The file has been moved as it is, it's not used, and in the following commits we'll clean it up before actually using it. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	8653be71b2	ci: Do not take cross-build into consideration for kata-arch.sh Right now we'd need to import lib.sh just in order to get cross-build information for rust, and it seems a little bit premature to do so at this stage and only for rust. Let's skip it and keep this transition simple. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	6a76bf92cb	ci: Fix style / indentation of kata-arch.sh We've been using: ``` function foo() { } ``` instead of ``` function foo() { } ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	72743851c1	ci: Add `function` before each kata-arch.sh function We've been doing this for all files moved to this repo. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	9f6d4892c8	ci: Update copyright for kata-arch.sh As we're touching the file already, let's update its Copyright info. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	6f73a72839	ci: Move kata-arch.sh as it was Let's move `.ci/kata-arch.sh` file from the tests repo to this one. The file has been moved as it is, it's not used, and in the following commits we'll clean it up before actually using it. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	3615d73433	ci: Add get_from_kata_deps() First of all, I'm 100% aware that I'm duplicating this function here as I've copied it from the packaging stuff, and I'm not exactly proud of that. However, right now it seems a little bit premature to combine that set of scripts with this set of scripts in a single one and make them used by both pieces of our project. Anyways, this function helps to get information from the `versions.yaml` file, and it'll be used as part of the cri-containerd tests and a few others in the future. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	34779491e0	gha: kubernetes: Avoid declaring repo_root_dir This is already declared as part of the `common.bash` file, so let's just make sure we use it from there. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	f3738beaca	tests: Use $HOME/go as fallback for $GOPATH Considering that someone may want to run the tests locally, we shouldn't rely on having GITHUB_WORKSPACE exported, and fallback to $HOME/go if needed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	b87ed27416	tests: Move `ensure_yq` to common.bash As this function will be used by different scripts, let's move it to a common place. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Jeremi Piotrowski	124e390333	tests: common: Fix quoting when globbing When the glob star is inside quotes, there is only one iteration of the loop and b holds all matches at once. Move the glob out of the quotes so that we actually iterate over matched paths. Fixes: #6543 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	db77c9a438	tests: Make install_kata take care of the links It makes the kata-containers installation more complete. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	13715db1f8	tests: Do not call `install_check_metrics` when installing kata The `install_kata` function was moved from the metrics' `gha-run.sh` file to the `common.bash` in the commit `3ffd48bc16`, but I didn't notice that it brought with it a call to `install_check_metrics`, which is totally unrelated to installing Kata Containers. Let's remove the call so the function is a little bit less specific, and move the call to install_check_metrics to the metrics `gha-run.sh` file. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 16:54:27 +02:00
Fabiano Fidêncio	e149a3c783	Merge pull request #7404 from fidencio/topic/cache-consider-changes-in-the-scripts-used-to-build-the-kernel cache: kernel: Consider changes in tools/packaging/kernel	2023-07-21 15:05:01 +02:00
Fabiano Fidêncio	630634c5df	ci: k8s: Group logs to make them easier to read Otherwise it becomes really hard to find the info you're looking for. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 14:05:30 +02:00
Fabiano Fidêncio	228b30f31c	ci: k8s: Gather node info during the cleanup This will make our lives easier to debug issues with the CI. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 14:05:30 +02:00
Fabiano Fidêncio	81f99543ec	ci: k8s: Cleanup cluster before deleting it This will help us on two fronts: * catching possible issues related to kata-deploy cleanup * do more (like, in the future, collect logs) after the tests run Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 14:05:30 +02:00
Fabiano Fidêncio	38a7b5325f	packaging/tools: Add kata-debug kata-debug is a tool that is used as part of the Kata Containers CI to gather information from the node, in order to help debugging issues with Kata Containers. As one can imagine, this can be expanded and used outside of the CI context, and any contribution back to the script is very much welcome. The resulting container is stored at the [Kata Containers quay.io space](https://quay.io/repository/kata-containers/kata-debug) and can be used as shown below: ```sh kubectl debug $NODE_NAME -it --image=quay.io/kata-containers/kata-debug:latest ``` Fixes: #7397 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 14:05:30 +02:00
Fabiano Fidêncio	a0fd41fd37	Merge pull request #7406 from fidencio/topic/merge-tarball-fix-version-yaml-not-found kata-deploy: Properly get the path of the versions.yaml file	2023-07-21 14:04:18 +02:00
Fabiano Fidêncio	ae6e8d2b38	kata-deploy: Properly get the path of the versions.yaml file We need to correctly get the full path of the versions.yaml file as part of the merge-builds.sh script, as we do a `pushd` there and that leads to a failure when merging the artefacts, as the `versions.yaml` file does not exist in that path. Fixes: #7405 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 12:02:11 +02:00
Fabiano Fidêncio	309e232553	cache: kernel: Consider changes in tools/packaging/kernel Any change in the script used to build the kernel should invalidate the cache. Fixes: #7403 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-21 11:48:29 +02:00
GabyCT	f95a7896b1	Merge pull request #7394 from fidencio/topic/ship-VERSIOB-and-versions.yaml-as-part-of-release-tarball kata-deploy: Add VERSION and versions.yaml to the final tarball	2023-07-20 14:38:21 -06:00
GabyCT	14025baafe	Merge pull request #7376 from GabyCT/topic/addcray metrics: Add C-Ray performance test	2023-07-20 14:37:53 -06:00
GabyCT	b629f6a822	Merge pull request #7363 from GabyCT/topic/enabletensorflow metrics: enable TensorFlow benchmark to be run on gha	2023-07-20 13:36:55 -06:00
Fabiano Fidêncio	59fdd69b85	kata-deploy: Add VERSION and versions.yaml to the final tarball Let's make things simpler to figure out which version of Kata Containers has been deployed, and also which artefacts come with it. This will help us immensely in the future, for the TEEs use case, so we can easily know whether we can deploy a specific guest kernel for a specific host kernel. Fixes: #7394 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-20 18:33:14 +02:00
Fabiano Fidêncio	5dddd7c5d1	release: Upload versions.yaml as part of the release Although this file is far away from being a SBOM, it'll help folks to easily visualise which components are part of a release, and even have SBOMs generated from that. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-20 18:31:21 +02:00
Gabriela Cervantes	bad3ac84b0	metrics: Rename C-Ray to cpu performance tests This PR renames C-Ray tests to cpu category. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-20 15:56:02 +00:00
Fabiano Fidêncio	87d99a71ec	versions: Remove "kernel-experimental" We've not been using nor shipping this kernel for a very long time. Regardless, we're leaving behind the logic in the kernel scripts to build it, in case it becomes necessary in the future. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-20 17:14:22 +02:00
Zvonko Kaiser	545de5042a	vfio: Fix tests Now with more elaborate checking of cold|hot plug ports we needed to update some of the tests. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-07-20 13:42:44 +00:00
Zvonko Kaiser	62aa6750ec	vfio: Added better handling of VFIO Control Devices Depending on the vfio_mode we need to mount the VFIO control device additionally into the container. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-07-20 13:42:42 +00:00
Fabiano Fidêncio	fe07ac662d	Merge pull request #7387 from GabyCT/topic/fixmemoryinsidec metrics: Add function to memory inside container script	2023-07-20 10:06:15 +02:00
Zvonko Kaiser	dd422ccb69	vfio: Remove obsolete HotplugVFIOonRootBus Removing HotplugVFIOonRootBus which is obsolete with the latest PCI topology changes, users can set cold_plug_vfio or hot_plug_vfio either in the configuration.toml or via annotations. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-07-20 07:25:40 +00:00
Zvonko Kaiser	114542e2ba	s390x: Fixing device.Bus assignment The device.Bus was reset if a specific combination of configuration parameters were not met. With the new PCIe topology this should not happen anymore Fixes: #7381 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-07-20 07:24:26 +00:00
Alakesh Haloi	371a118ad0	agent: exclude symlinks from recursive ownership change Currently, when fsGroup is used with direct-assign, the kata agent recursively changes ownership and permission for each file, including symlinks. However, the problem with symlinks is that the permission of the symlink itself may not be the same as that of the underlying file. So while doing recursive ownership and permission changes we should skip symlinks. Fixes: #7364 Signed-off-by: Alakesh Haloi <a_haloi@apple.com>	2023-07-19 20:42:55 -07:00
Gabriela Cervantes	e64edf41e5	metrics: Add tensorflow function in gha-run script This PR adds the tensorflow function in gha-run script in order to be triggered in the gha. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-19 21:31:51 +00:00
Gabriela Cervantes	67a6fff4f7	metrics: Enable tensorflow benchmark on gha This PR enables the TensorFlow benchmark on gha for the kata metrics CI. Fixes #7362 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-19 21:31:51 +00:00
GabyCT	c3f21c36f3	Merge pull request #7388 from dborquez/revert-commit-broke-checkmetrics-baseline-values Revert "metrics: Replace backslashes used to escape double quoted key in jq expr"	2023-07-19 14:36:16 -06:00
David Esparza	01450deb6a	Revert "metrics: Replace backslashes used to escape double quoted key in jq expr." This reverts commit `468f017e21`. Fixes: #7385 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-07-19 10:07:11 -06:00
Gabriela Cervantes	8430068058	metrics: Add function to memory inside container script This PR adds the `function` keyword before the function names in the memory inside container script in order to have uniformity across the script. Fixes #7386 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-19 16:00:53 +00:00
Chao Wu	bbd3c1b6ab	Dragonball: migrate dragonball-sandbox crates to Kata In order to make it easier for developers to contribute to Dragonball, we decide to migrate all dragonball-sandbox crates to Kata. fixes: #7262 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-07-19 19:41:57 +08:00
Chao Wu	7153b51578	Merge pull request #7372 from fidencio/topic/bump-virtiofsd-to-v1.7.0 versions: Bump virtiofsd to v1.7.0	2023-07-19 10:51:49 +08:00
GabyCT	8c662916ab	Merge pull request #7377 from dborquez/add_verbosity_to_blogbench metrics: stop hypervisor and shim at init_env stage	2023-07-18 15:57:54 -06:00
Fabiano Fidêncio	5f7da301fd	Merge pull request #7378 from fidencio/topic/ci-k8s-fix-source-path ci: k8s: Adapt "source ..." to the new location of gha-run.sh	2023-07-18 22:30:55 +02:00
Fabiano Fidêncio	fad801d0fb	ci: k8s: Adapt "source ..." to the new location of gha-run.sh This is a follow up of `2ee2cd307b`, which changed the location of gha-run.sh Fixes: #7373 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-18 21:26:41 +02:00
David Esparza	55e2f0955b	metrics: stop hypervisor and shim at init_env stage This PR kills the hypervisor and the kata shim in the init_env stage prior to launching any metric test. Additionally, this PR adds info messages in the main blocks of the blogbench test to help in debugging. Fixes: #7366 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-07-18 12:05:29 -06:00
Gabriela Cervantes	556e663fce	metrics: Add disk link to general metrics README This PR adds the disk link information to the general metrics README. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-18 16:42:35 +00:00
Gabriela Cervantes	98c1217093	metrics: Add C-Ray README This PR adds the C-Ray documentation at the README file. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-18 16:35:54 +00:00
Gabriela Cervantes	8e7d9926e4	metrics: Add C-Ray Dockerfile This PR adds the C-Ray Dockerfile for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-18 16:33:55 +00:00
Gabriela Cervantes	e2ee769783	metrics: Add C-Ray performance test This PR adds C-Ray performance test in order to be part of the kata metrics CI. Fixes #7375 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-18 16:32:23 +00:00
Fabiano Fidêncio	2011e3d72a	Merge pull request #7374 from fidencio/topic/ci-tdx-adjust-kubeconfig-path ci: Move `tests/integration/gha-run.sh` to `tests/integration/kuberentes/` ... and also remove KUBECONFIG from the tdx envs	2023-07-18 17:32:57 +02:00
Fabiano Fidêncio	8e09e04f48	Merge pull request #6788 from jepio/kernel-update-6.1-lts versions: Update kernel to version v6.1.x	2023-07-18 17:29:21 +02:00
Chao Wu	935432c36d	Merge pull request #7352 from justxuewei/exec-hang agent: Fix exec hang issues with a background process	2023-07-18 23:02:18 +08:00
Fabiano Fidêncio	2ee2cd307b	ci: k8s: Move gha-run.sh to the kubernetes dir The file belongs there, as it's only used for k8s related tests. Fixes: #7373 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-18 15:45:06 +02:00
Fabiano Fidêncio	88eaff5330	ci: tdx: Adjust KUBECONFIG We don't need to export KUBECONFIG there. Let's just make sure we have the server correctly setup and avoid doing that. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-18 15:39:52 +02:00
Jeremi Piotrowski	c09e268a1b	versions: Downgrade SEV(-SNP) kernel back to v5.19.x CC-GPU seems to have issues with v6.1, so downgrade the kernels used for SEV-SNP to a known-working version. It is worth mentioning that TDX is also still on 5.19. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-07-18 15:29:46 +02:00
Fabiano Fidêncio	25d80fcec2	Merge pull request #6993 from zvonkok/kata-agent-init-mount agent: Ignore already mounted dev/fs/pseudo-fs	2023-07-18 14:11:44 +02:00
Fabiano Fidêncio	4687f2bf9d	Merge pull request #7369 from fidencio/topic/gha-ci-bring-tdx-back ci: k8s: Bring TDX tests back	2023-07-18 13:28:33 +02:00
Fabiano Fidêncio	6a7a323656	versions: Bump virtiofsd to v1.7.0 https://gitlab.com/virtio-fs/virtiofsd/-/releases/v1.7.0 was released today. Fixes: #7371 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-18 12:33:13 +02:00
Fabiano Fidêncio	ac5f5353ba	ci: k8s: Bring TDX tests back Now that we have a new TDX machine plugged into our CI, let's re-enable the TDX tests. Fixes: #7368 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-18 10:33:43 +02:00
Jeremi Piotrowski	950b89ffac	versions: Update kernel to version v6.1.38 Kernel v6.1.38 is the current latest LTS version, switch to it. No patches should be necessary. Some CONFIG options have been removed: - CONFIG_MEMCG_SWAP is covered by CONFIG_SWAP and CONFIG_MEMCG - CONFIG_ARCH_RANDOM is unconditionally compiled in - CONFIG_ARM64_CRYPTO is covered by CONFIG_CRYPTO and ARCH=arm64 Fixes: #6086 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-07-18 10:04:21 +02:00
GabyCT	7729d82e6e	Merge pull request #7360 from GabyCT/topic/updategraldoc metrics: Update machine learning documentation	2023-07-17 15:30:13 -06:00
Fabiano Fidêncio	26d525fcf3	Merge pull request #7361 from fidencio/topic/gha-ci-add-cri-containerd-tests-skeleton-follow-up-2 gha: ci: cri-containerd: Fix KATA_HYPERVSIOR typo	2023-07-17 22:38:50 +02:00
GabyCT	b4852c8544	Merge pull request #7335 from kata-containers/topic/addmobilenet tests: Add MobileNet Tensorflow performance benchmark	2023-07-17 14:36:59 -06:00
Gabriela Cervantes	8ccc1e5c93	metrics: Update machine learning documentation This PR updates the machine learning documentation related with Tensorflow and Pytorch benchmarks. Fixes #7359 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-17 20:32:49 +00:00
Fabiano Fidêncio	f50d2b0664	gha: ci: cri-containerd: Fix KATA_HYPERVSIOR typo KATA_HYPERVSIOR should be KATA_HYPERVISOR Fixes: #6543 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-17 21:56:51 +02:00
David Esparza	687596ae41	Merge pull request #7320 from dborquez/fix_jq_checkmetrics_checkvar_expression metrics: replace backslashes used to escape double quoted jq key expr.	2023-07-17 13:50:18 -06:00
Gabriela Cervantes	620b945975	metrics: Add Tensorflow Mobilenet documentation This PR adds the Tensorflow MobileNet documentation to the machine learning README. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-17 17:39:05 +00:00
Zhongtao Hu	d50f3888af	Merge pull request #7219 from Apokleos/network-refactor runtime-rs: enhancement of Device Manager for network endpoints.	2023-07-17 14:13:51 +08:00
QuanweiZhou	ce14f26d82	Merge pull request #5450 from openanolis/trace_rs feat(Tracing): tracing in Rust runtime	2023-07-17 09:27:13 +08:00
Manabu Sugimoto	f1d8de9be6	runk: Allow runk to launch a container without pid namespace Allow runk to launch a container even though users don't specify the pid namespace in `config.json`, because general container runtimes such as runc can also launch a container without the namespace. On the other hand, Kata Containers doesn't allow it due to security concerns, so this feature should be enabled only in runk. Fixes: #7168 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-07-16 23:31:14 +05:30
Zhongtao Hu	419f8a5db7	Merge pull request #7021 from cheriL/7020/ignore-unconfigured-netinterface runtime-rs: ignore unconfigured network interfaces	2023-07-16 10:11:15 +08:00
Xuewei Niu	6c91af0a26	agent: Fix exec hang issues with a background process Issue #4747 and pull request #4748 fix exec hang issues where the exec command hangs when a process's stdout is not closed. However, the PR might cause the exec command not to work as expected, leading to CI failure. The PR was reverted in #7042. This PR resolves the exec hang issues and has undergone 1000 rounds of testing to verify that it would not cause any CI failures. Fixes: #4747 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com> Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-07-16 08:32:45 +08:00
David Esparza	5a9829996c	Merge pull request #7349 from dborquez/fix_extract_kata_env_for_metrics metrics: Stop running kata-env before kata is properly installed.	2023-07-14 15:20:52 -06:00
David Esparza	59f4731bb2	metrics: Stop running kata-env before kata is properly installed. This PR makes sure kata-env is called only after some metrics have completed their workload. This fixes a bug that occurred when kata-env was called before kata was installed on the testing platform. Fixes: #7348 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-07-14 13:40:48 -06:00
David Esparza	468f017e21	metrics: Replace backslashes used to escape double quoted key in jq expr. This PR uses square brackets in a jq expression to access key values corresponding to metric results in JSON format. The values are the data inputs into the checkmetrics tool. Fixes: #7319 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-07-14 18:41:41 +00:00
GabyCT	b9535fb187	Merge pull request #7337 from dborquez/fix_remove_old_metrics_config metrics: use rm -f to remove the oldest containerd config file.	2023-07-14 09:19:41 -06:00
Fabiano Fidêncio	7a854507cc	Merge pull request #7333 from zvonkok/main kernel: Update kernel config name	2023-07-14 13:49:27 +02:00
Fabiano Fidêncio	cfc90fad84	Merge pull request #7344 from fidencio/topic/kata-deploy-add-a-debug-option kata-deploy: Add a debug option to kata-deploy (and also use it as part of our CI)	2023-07-14 13:16:55 +02:00
Fabiano Fidêncio	64f013f3bf	ci: k8s: Enable debug when running the tests This will help us to gather more information about Kata Containers in case of failure. Fixes: #7343 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-14 12:18:11 +02:00
Fabiano Fidêncio	8f4b1df9cf	kata-deploy: Give users the ability to run it on DEBUG mode The DEBUG env var introduced to the kata-deploy / kata-cleanup yaml file will be responsible for: * Setting up the CRI Engine to run with the debug log level set to debug * The default is usually info * Setting up Kata Containers to enable: * debug logs * debug console * agent logs This will help a lot folks trying to debug Kata Containers while using kata-deploy, and also help us to always run with DEBUG=yes as part of our CI. Fixes: #7342 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-14 12:18:08 +02:00
Chao Wu	9b3dc572ae	Merge pull request #7018 from nubificus/feat_bindmount_propagation runtime-rs: add parameter for propagation of (u)mount events	2023-07-14 15:21:41 +08:00
Zvonko Kaiser	2c8dfde168	kernel: Update kernel config name Fixes: #7294 When installing the kernel config adjust the name like the vmlinuz and vmlinux files so that any added suffixes are also reflected in the kernel config name. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-07-14 06:50:35 +00:00
Archana Shinde	b9b8ccca0c	Merge pull request #7236 from amshinde/move-guestprotection kata-ctl: Move GuestProtection code to kata-sys-util	2023-07-13 23:50:17 -07:00
soup	150e54d02b	runtime-rs: ignore unconfigured network interfaces Fixes: #7020 Signed-off-by: soup <lqh348659137@outlook.com>	2023-07-14 14:16:03 +08:00
David Esparza	3ae02f9202	metrics: use rm -f to remove older containerd config file. In order to run kata metrics we need to check that the containerd config file is properly set. When this is not the case, we need to remove that file, and generate a valid one. This PR runs rm -f in order to ignore errors in case the file to delete does not exist. Fixes: #7336 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-07-13 16:20:03 -06:00
David Esparza	22d4e4c5a6	Merge pull request #7328 from GabyCT/topic/updatecommon tests: Add function before function name in common.bash for metrics	2023-07-13 16:11:30 -06:00
Gabriela Cervantes	a864d0e349	tests: Add tensorflow mobilenet dockerfile This PR adds the tensorflow mobilenet dockerfile. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-13 21:24:40 +00:00
Gabriela Cervantes	788d2a254e	tests: Add tensorflow mobilenet performance test This PR adds tensorflow mobilenet performance test for kata metrics. Fixes #7334 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-13 21:18:25 +00:00
David Esparza	e8917d7321	Merge pull request #7330 from GabyCT/topic/storagedoc tests: Add metrics storage documentation	2023-07-13 15:10:53 -06:00
GabyCT	8db43eae44	Merge pull request #7318 from dborquez/fix_timestamp_generator_on_metrics metrics: Fix metrics ts generator to treat numbers as decimals	2023-07-13 11:21:09 -06:00
Gabriela Cervantes	3fed61e7a4	tests: Add storage link to general metrics documentation This PR adds storage link to general metrics README. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-13 16:03:49 +00:00
Gabriela Cervantes	b34dda4ca6	tests: Add storage blogbench metrics documentation This PR adds the storage metrics documentation for blogbench for kata metrics. Fixes #7329 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-13 16:00:14 +00:00
Anastassios Nanos	6787c63900	runtime-rs: add parameter for propagation of (u)mount events Add an extra parameter in `bind_mount_unchecked` to specify the propagation type: "shared" or "slave". Fixes: #7017 Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2023-07-13 15:58:22 +00:00
Gabriela Cervantes	6e5679bc46	tests: Add function before function name in common.bash for metrics This PR adds function before the function name in common.bash script in order to have uniformity across all the script. Fixes #7327 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-13 15:48:47 +00:00
Archana Shinde	62080f83cb	kata-sys-util: Fix compilation errors Fix compilation errors for aarch64 and s390x Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-07-13 20:09:43 +05:30
Archana Shinde	02d99caf6d	static-checks: Make cargo clippy pass. Get rid of cargo clippy warnings. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-07-13 20:08:13 +05:30
Archana Shinde	9824206820	agent: Make the static checks pass for agent The static checks for the agent require Cargo.lock to be updated. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-07-13 20:08:13 +05:30
Archana Shinde	61e4032b08	kata-ctl: Remove all utility functions to get platform protection Since these have been added to kata-sys-util, remove these from kata-ctl. Change all invocations to get platform protection to make use of kata-sys-util. Fixes: #7144 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-07-13 20:08:13 +05:30
Archana Shinde	a24dbdc781	kata-sys-util: Move utilities to get platform protection Add utilities to get platform protection to kata-sys-util Fixes: #7144 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-07-13 20:08:13 +05:30
Archana Shinde	dacdf7c282	kata-ctl: Remove cpu related functions from kata-ctl Remove cpu related functions which have been moved to kata-sys-util. Change invocations in kata-ctl to make use of functions now moved to kata-sys-util. Signed-off-by: Nathan Whyte <nathanwhyte35@gmail.com> Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-07-13 20:08:13 +05:30
Archana Shinde	f5d1957174	kata-sys-util: Move additional functionality to cpu.rs Make certain imports architecture specific as these are not used on all architectures. Move additional constants and functionality to cpu.rs. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-07-13 20:08:13 +05:30
Nathan Whyte	304b9d9146	kata-sys-util: Move CPU info functions Move get_single_cpu_info and get_cpu_flags into kata-sys-util. Add new functions that get a list of flags and check if a flag exists in that list. Fixes #6383 Signed-off-by: Nathan Whyte <nathanwhyte35@gmail.com> Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-07-13 20:08:13 +05:30
Fabiano Fidêncio	eed3c7c046	Merge pull request #7322 from fidencio/topic/gha-ci-add-cri-containerd-tests-skeleton-follow-up gha: ci: Add cri-containerd tests skeleton -- follow up 1	2023-07-13 13:53:48 +02:00
Fabiano Fidêncio	7319cff77a	ci: cri-containerd: Add LTS / Active versions for containerd As we'll be testing against the LTS and the Active versions of containerd, let's add those entries to the versions.yaml file and make sure we export what we want to use for the tests as an env var. The approach taken should not break the current way of getting the containerd version. LTS and Active versions of containerd can be found at: https://containerd.io/releases/#support-horizon Fixes: #6543 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-13 12:05:47 +02:00
Fabiano Fidêncio	2a957d41c8	ci: cri-containerd: Export GOPATH Let's make sure this is exported, as it'll be needed in order to install `yq`, which will be used to get the versions of the dependencies to be installed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-13 12:05:47 +02:00
Fabiano Fidêncio	75a294b74b	ci: cri-containerd: Ensure deps are installed Let's make sure we install the needed dependencies for running the `cri-containerd` tests. Right now this commit is basically adding a placeholder, and later on, when we'll actually be able to test the job, we'll add the logic of installing the needed dependencies. The obvious dependencies we've spotted so far are: * From the OS * jq * curl (already present) * From our repo * yq (using the install_yq script) * From GitHub * cri-containerd * cri-tools * cni plugins We may need a few more packages, but we will only figure this out as part of the actual work. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-13 12:04:22 +02:00
Zhongtao Hu	b69cdb5c21	Merge pull request #7286 from xuejun-xj/xuejun/up-fix dragonball/agent: Add some optimization for Makefile and bugfixes of unit tests on aarch64	2023-07-13 09:39:23 +08:00
GabyCT	ee17097e88	Merge pull request #7282 from GabyCT/topic/enableblogbench metrics: Enable blogbench test	2023-07-12 16:35:52 -06:00
David Esparza	f63673838b	Merge pull request #7315 from GabyCT/topic/machinelearning tests: Add machine learning performance tests	2023-07-12 15:57:11 -06:00
David Esparza	6924d14df5	metrics: Fix metrics ts generator to treat numbers as decimals Use bc tool to perform math operations even when variables contain values with leading zero. Fixes: #7317 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-07-12 20:57:33 +00:00
Gabriela Cervantes	9e048c8ee0	checkmetrics: Add blogbench read value for qemu This PR adds the blogbench read value for qemu. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-12 20:38:27 +00:00
Gabriela Cervantes	2935aeb7d7	checkmetrics: Add blogbench write value for qemu This PR adds the blogbench write value for qemu limit. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-12 20:37:27 +00:00
Gabriela Cervantes	02031e29aa	checkmetrics: Add blogbench read value for clh This PR adds the blogbench read value for clh limit. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-12 20:37:27 +00:00
Gabriela Cervantes	107fae033b	checkmetrics: Add blogbench write value for clh This PR adds the blogbench write value limit for clh. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-12 20:37:27 +00:00
Gabriela Cervantes	8c75c2f4bd	metrics: Update blogbench Dockerfile This PR updates the blogbench dockerfile to have non interactive mode. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-12 20:37:27 +00:00
Gabriela Cervantes	49723a9ecf	metrics: Add double quotes to variables This PR adds double quotes to variables in the blogbench script to have uniformity across all the tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-12 20:37:27 +00:00
Gabriela Cervantes	dc67d902eb	metrics: Enable blogbench test This PR enables the blogbench performance test for the kata metrics CI. Fixes #7281 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-12 20:37:24 +00:00
Fabiano Fidêncio	3f38f75918	Merge pull request #7314 from fidencio/topic/gha-ci-add-cri-containerd-tests-skeleton tests: gha: ci: Add cri-containerd tests skeleton	2023-07-12 22:21:47 +02:00
Fabiano Fidêncio	438fe3b829	gha: ci: Add cri-containerd tests skeleton This PR builds the foundation for us to start migrating the cri-containerd tests from Jenkins to GitHub Actions. Right now the test does nothing and should always finish successfully. The coming PRs will actually introduce logic to the `gha-run.sh` script where we'll be able to run the tests and make sure those pass before having them actually merged. Fixes: #6543 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-12 20:57:39 +02:00
Fabiano Fidêncio	bd08d745f4	tests: metrics: Move metrics specific function to metrics gha-run.sh `compress_metrics_results_dir()` is only used by the metrics GHA. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-12 20:56:55 +02:00
Fabiano Fidêncio	3ffd48bc16	tests: common: Move a few utility functions to common.bash Those functions were originally introduced as part of the `metrics/gha-run.sh` file, but those will be very handy at the time we start adding more tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-12 20:55:05 +02:00
Gabriela Cervantes	7f961461bd	tests: Add machine learning README This PR adds machine learning README. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-12 16:37:15 +00:00
Fabiano Fidêncio	bb2ef4ca34	tests: Add `function` before each function Let's just keep this standardised. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-12 18:36:09 +02:00
Gabriela Cervantes	063f7aa7cb	tests: Add Pytorch Dockerfile This PR adds Pytorch Dockerfile for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-12 16:34:17 +00:00
Fabiano Fidêncio	b6282f7053	Merge pull request #7255 from GabyCT/topic/memoryinsideenabled metrics: Enable memory inside container metrics	2023-07-12 18:33:36 +02:00
Gabriela Cervantes	1af03b9b32	tests: Add Pytorch performance test This PR adds Pytorch performance test for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-12 16:33:02 +00:00
Gabriela Cervantes	4cecd62370	tests: Add tensorflow Dockerfile This PR adds the tensorflow Dockerfile. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-12 16:31:32 +00:00
Gabriela Cervantes	c4094f62c9	tests: Add metrics machine learning performance tests This PR adds metrics machine learning performance tests like Tensorflow and Pytorch. Fixes #7313 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-12 16:28:25 +00:00
Jeremi Piotrowski	b9a63d66a4	Merge pull request #7297 from jepio/fix-mariner-cache tools: Use a consistent target name when building mariner initrd	2023-07-12 13:43:47 +02:00
Fabiano Fidêncio	1ab99bd6bb	Merge pull request #7276 from fidencio/topic/gha-debug-gha-tests-start gha: ci: Gather info about the node / pods	2023-07-12 12:35:10 +02:00
Chao Wu	f6a51a8a78	Merge pull request #7306 from justxuewei/none-network-model runtime-rs: Do not scan network if network model is "none"	2023-07-12 14:53:52 +08:00
Zvonko Kaiser	4e352a73ee	Merge pull request #7308 from fidencio/topic/gha-temporarily-disable-tdx-runs gha: k8s: tdx: Temporarily disable TDX tests	2023-07-12 08:39:02 +02:00
Fabiano Fidêncio	89b622dcb8	gha: k8s: tdx: Temporarily disable TDX tests TDX tests need to be temporarily disabled as the current machine allocated for this will be off for some time, and a new machine will only be added next week. Fixes: #7307 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-12 08:26:10 +02:00
Fabiano Fidêncio	8c9d08e872	gha: ci: Gather info about the node / pods This is a very simple addition, that should be expanded by https://github.com/kata-containers/kata-containers/pull/7185, and it's targeting gathering more info that will help us to debug CI failures. Fixes: #7296 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-12 08:04:37 +02:00
alex.lyn	283f809dda	runtime-rs: Enhancing Device Manager for network endpoints. Currently, network endpoints are separate from the device manager and need to be included for proper management. In order to do so, we need to refactor the implementation of the network endpoints. The first step is to restructure the NetworkConfig and NetworkDevice structures. Next, we will implement the virtio-net driver and add the Network device to the Device Manager. Finally, we'll unify entries with do_handle_device for each endpoint. Fixes: #7215 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-07-12 11:27:12 +08:00
xuejun-xj	a65291ad72	agent: rustjail: update test_mknod_dev When running cargo test in container, test_mknod_dev may fail sometimes because of "Operation not permitted". Change the device path to "/dev/fifo-test" to avoid this case. Fixes: #7284 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>	2023-07-12 11:22:32 +08:00
xuejun-xj	46b81dd7d2	agent: clippy: fix cargo clippy warnings Replace "if let Ok(_) = ..." with ".is_ok()" method. Fixes: #7284 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>	2023-07-12 11:22:32 +08:00
xuejun-xj	c4771d9e89	agent: Makefile: enable set SECCOMP dynamically Change ":=" to "?=". Fixes: #7284 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>	2023-07-12 11:22:32 +08:00
xuejun-xj	a88212e2c5	utils.mk: update BUILD_TYPE argument Allow the BUILD_TYPE argument to be set dynamically. Fixes: #7284 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>	2023-07-12 11:22:32 +08:00
xuejun-xj	883b4db380	dragonball: fix cargo test on aarch64 1. Update memory end assert because address space layout differs between x86 and arm. 2. Set guest_addr for aarch64 in test_handler_insert_region case. Fixes: #7284 TODO: #7290 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>	2023-07-12 11:22:31 +08:00
Xuewei Niu	6822029c81	runtime-rs: Do not scan network if network model is "none" Skip scanning the network from netns if the network model is set to "none". Fixes: #7305 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-07-12 10:00:50 +08:00
Fabiano Fidêncio	ae55893deb	Merge pull request #7303 from GabyCT/topic/cleanupmemoryusage metrics: Update memory usage script	2023-07-11 23:52:05 +02:00
Gabriela Cervantes	ce54e43ebe	metrics: Update memory usage script This PR updates the memory usage script by applying clean_env_ctr in main in order to avoid failures caused by certain processes being left behind. Fixes #7302 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-11 17:03:25 +00:00
Fabiano Fidêncio	ceb5c69ee8	Merge pull request #7299 from fidencio/topic/gha-stop-previous-workflows-if-a-pr-is-updated gha: Cancel previous jobs if a PR is updated	2023-07-11 16:22:47 +02:00
Fabiano Fidêncio	fbc2a91ab5	gha: Cancel previous jobs if a PR is updated Let's make sure we cancel previous runs, mainly as we have some of those that take a lot of time to run, whenever the PR is updated. This is based on the following stack overflow suggestion: https://stackoverflow.com/questions/66335225/how-to-cancel-previous-runs-in-the-pr-when-you-push-new-commitsupdate-the-curre This is very much needed as we don't want to wait for a long time to have access to a runner because other runners are still being used to perform a task that's meaningless due to the PR update. Fixes: #7298 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-11 14:37:10 +02:00
Jeremi Piotrowski	307cfc8f7a	tools: Use a consistent target name when building mariner initrd Currently a mixture of cbl-mariner and mariner is used when creating the mariner initrd. The kata-static tarball has mariner in the name, but the jenkins url uses cbl-mariner. This breaks cache usage. Use mariner as the target name throughout the build, so that caching works. Fixes: #7292 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-07-11 14:17:14 +02:00
Fabiano Fidêncio	aa484dc0e3	Merge pull request #7288 from fidencio/topic/add-nightly-jobs-follow-up-7 gha: nightly: Fix long name of AKS clusters issue and make the CI easier to test	2023-07-11 11:16:09 +02:00
Fabiano Fidêncio	d780cc08f4	gha: nightly: Also use `workflow_dispatch` to trigger it This is a very nice suggestion from Steve Horsman, as with that we can manually trigger the workflow anytime we need to test it, instead of waiting for a full day for it to be retriggered via the `schedule` event. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-11 10:42:40 +02:00
Fabiano Fidêncio	b99ff30267	gha: nightly: Fix name size limit for AKS Passing the commit hash as the "pr-number" has proven problematic as it would make the AKS cluster name longer than what's accepted by AKS. One easy way to solve this is just passing "nightly" as the PR number, as that's only used to create the cluster. Fixes: #7247 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-11 09:59:13 +02:00
xuejun-xj	aedc586e14	dragonball: Makefile: add coverage target Add "coverage" target to compute code coverage for dragonball. Fixes: #7284 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>	2023-07-11 14:36:25 +08:00
Fabiano Fidêncio	52100bb3dd	Merge pull request #7280 from fidencio/topic/gha-add-badge-for-our-tests README: Add badge for our Nightly CI	2023-07-10 19:35:33 +02:00
Gabriela Cervantes	310e069f73	checkmetrics: Enable checkmetrics for memory inside test This PR enables the checkmetrics to include the memory inside container test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-10 17:05:13 +00:00
Fabiano Fidêncio	b61b15aab6	Merge pull request #7259 from fidencio/topic/gha-restrict-job-run-according-to-files-touched gha: Do not run all the tests if only docs are updated	2023-07-10 18:12:29 +02:00
Fabiano Fidêncio	1363fbbf12	README: Add badge for our Nightly CI This will help folks to monitor the history of the failing tests, as we've done in Jenkins with the "Green Effort CI". Fixes: #7279 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-10 17:31:51 +02:00
Fabiano Fidêncio	9dc63fe338	Merge pull request #7273 from openanolis/runtime-rs-fix-mem-ci bugfix: plus default_memory when calculating mem size	2023-07-10 15:12:05 +02:00
Zvonko Kaiser	fab2e6a93f	Merge pull request #7277 from fidencio/topic/add-nightly-jobs-follow-up-6 gha: ci: Use github.sha to get the last commit reference	2023-07-10 13:36:31 +02:00
Fabiano Fidêncio	1776b18fa0	gha: Do not run all the tests if only docs are updated We should not go through the trouble of running all our tests on AKS / Azure / baremetal machines in case a PR only changes our documentation. Fixes: #7258 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-10 10:30:46 +02:00
Yushuo	28c29b248d	bugfix: plus default_memory when calculating mem size We've noticed this caused regressions with the k8s-oom tests, and then decided to take a step back and do this in the same way it was done before `67972ec48a`. Moreover, this step back is also more reasonable in terms of the controlling logic. And by doing this we can re-enable the k8s-oom.bats tests, which is done as part of this PR. Fixes: #7271 Depends-on: github.com/kata-containers/tests#5705 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2023-07-10 15:53:04 +08:00
Fabiano Fidêncio	0c1cbd01d8	gha: ci: after-push: Use github.sha to get the last commit reference As we need to pass down the commit sha to the jobs that will be triggered from the `push` event, we must be careful about what exactly we're using there. At first we were using ${{ github.ref }}, but this turns out to be the branch name, rather than the commit hash. In order to actually get the commit hash, let's use ${{ github.sha }} instead. Fixes: #7247 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-10 09:39:33 +02:00
Fabiano Fidêncio	37a9556789	gha: ci: nightly: Use github.sha to get the last commit reference As we need to pass down the commit sha to the jobs that will be triggered from the `schedule` event, we must be careful about what exactly we're using there. At first we were using ${{ github.ref }}, but this turns out to be the branch name, rather than the commit hash. In order to actually get the commit hash, let's use ${{ github.sha }} instead, as described by https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows# Fixes: #7247 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-10 09:39:26 +02:00
Fabiano Fidêncio	afbc1f94d7	Merge pull request #7272 from fidencio/topic/dragonball-k8s-number-cpus-fix dragonball: Don't fail if a request asks for more CPUs than allowed	2023-07-10 08:25:06 +02:00
Ji-Xinyou	ed23b47c71	tracing: Add tracing to runtime-rs Introduce tracing into runtime-rs, only some functions are instrumented. Fixes: #5239 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com> Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2023-07-09 22:09:43 +08:00
Fabiano Fidêncio	96e9374d4b	dragonball: Don't fail if a request asks for more CPUs than allowed Let's take the same approach as the go runtime and allocate the maximum allowed number of vcpus instead. Fixes: #7270 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-08 15:50:23 +02:00
Fabiano Fidêncio	38f0aaa516	Revert "gha: k8s: dragonball: Skip k8s-number-cpus" This reverts commit `a79505b667`. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-08 14:43:49 +02:00
Fabiano Fidêncio	828a721838	gha: k8s: dragonball: Skip k8s-oom Let's skip the k8s-oom, as the test is currently failing. We've an issue opened for that, and we'll be working on re-enabling it as soon as possible. Reference: https://github.com/kata-containers/kata-containers/issues/7271 Fixes: #7253 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-08 14:27:49 +02:00
Fabiano Fidêncio	a79505b667	gha: k8s: dragonball: Skip k8s-number-cpus Let's skip the k8s-number-cpus, as the test is currently failing. We've an issue opened for that, and we'll be working on re-enabling it as soon as possible. Reference: https://github.com/kata-containers/kata-containers/issues/7270 Fixes: #7253 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-08 14:27:42 +02:00
Fabiano Fidêncio	275c84e7b5	Revert "agent: fix the issue of exec hang with a backgroud process" This reverts commit `25d2fb0fde`. The reason we're reverting the commit is to check whether it's the cause for the regression on devmapper tests. Fixes: #7253 Depends-on: github.com/kata-containers/tests#5705 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-08 14:27:40 +02:00
Gabriela Cervantes	2be342023b	checkmetrics: Add memory usage inside container value for qemu This PR adds the memory usage inside container value for qemu. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-07 16:28:28 +00:00
Gabriela Cervantes	6ca34f949e	checkmetrics: Add memory inside container value for clh Add memory inside container value for clh. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-07 16:28:28 +00:00
Gabriela Cervantes	6c68924230	metrics: Enable memory inside container metrics This PR will enable the memory inside container metrics for the Kata CI. Fixes #7254 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-07 16:28:28 +00:00
Fabiano Fidêncio	b7c58320a5	Merge pull request #7267 from fidencio/topic/add-nightly-jobs-follow-up-5 gha: ci: Fix reference passed to checkout@v3	2023-07-07 18:26:44 +02:00
Fabiano Fidêncio	0ad298895e	gha: ci: Fix reference passed to checkout@v3 On `cc3993d860` we introduced a regression, where we started passing inputs.commit-hash, instead of github.event.pull_request.head.sha. However, we have been setting commit-hash to github.event.pull_request.sha, meaning that we're missing a `.head.` there. github.event.pull_request.sha is empty for the pull_request_target event, leading the CI to pull the content from `main` instead of the content from the PR. Fixes: #7247 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-07 17:55:11 +02:00
Fabiano Fidêncio	48d9f8769e	Merge pull request #7264 from fidencio/topic/add-nightly-jobs-follow-up-4 gha: ci: Avoid using env also in the ci-nightly and payload-after-push	2023-07-07 17:10:43 +02:00
Fabiano Fidêncio	86904909aa	gha: ci: Avoid using env also in the ci-nightly and payload-after-push The latter workflow is breaking as it doesn't recognise ${GITHUB_REF}, the former would most likely break as well, but it didn't get triggered yet. The error we're facing is: ``` Determining the checkout info /usr/bin/git branch --list --remote origin/${GITHUB_REF} /usr/bin/git tag --list ${GITHUB_REF} Error: A branch or tag with the name '${GITHUB_REF}' could not be found ``` Fixes: #7247 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-07 14:46:30 +02:00
Fabiano Fidêncio	48c3cec1f4	Merge pull request #7243 from sprt/ensure-cluster-no-exist gha: k8s: Ensure cluster doesn't exist before creating it	2023-07-07 14:03:41 +02:00
Fabiano Fidêncio	3e2b723487	Merge pull request #7263 from fidencio/topic/add-nightly-jobs-follow-up-3 gha: ci: More follow up fixes after adding a nightly CI	2023-07-07 13:58:26 +02:00
Fabiano Fidêncio	18bd2d6e4a	Merge pull request #6839 from sprt/sprt/mariner-ci-tests tests: Enable running k8s tests on Mariner	2023-07-07 13:36:28 +02:00
Zvonko Kaiser	f72cb2fc12	agent: Remove shadowed function, add slog-term Remove shadowed get_mounts(), added slog-term as a new crate, slog can directly log to stdout and we can capture output in the test-cases that are created in the function to be tested. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-07-07 11:28:14 +00:00
Fabiano Fidêncio	1d05b9cc71	gha: ci: Pass down secrets to ci-on-push / ci-nightly We have to do this, otherwise we cannot log into azure. This is a regression introduced by `106e305717`. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-07 12:00:33 +02:00
Fabiano Fidêncio	c5b4164cb1	gha: ci: Fix tarball-suffix passed to the metrics tests Instead of passing "-${{ inputs.tag }}-amd64", we must only pass "-${{ inputs.tag }}". This is a regression introduced by `106e305717`. Fixes: #7247 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-07 12:00:24 +02:00
Fabiano Fidêncio	fa0f9954a1	Merge pull request #7261 from fidencio/topic/add-nightly-jobs-follow-up-2 gha: ci: Avoid using env unless it's really needed	2023-07-07 10:13:25 +02:00
Zvonko Kaiser	07810bf71f	agent: Ignore already mounted dev/fs/pseudo-fs When using an initrd and setting KATA_INIT=yes, meaning we're using the kata-agent as the init process, we need to make sure that the agent does not segfault if mounts have already happened. Some workloads need to configure several things in the initrd before the kata-agent starts, which involves having /proc or /sys already mounted. Fixes: #6992 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-07-07 07:36:04 +00:00
Fabiano Fidêncio	11e3ccfa4d	gha: ci: Avoid using env unless it's really needed `de83cd9de7` tried to solve an issue, but it clearly seems that I'm using env wrongly, as what ended up being passed as input was "$VAR", instead of the content of the VAR variable. As we can simply avoid using those here, let's do it and save us a headache. Fixes: #7247 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-07 07:31:10 +02:00
Aurélien Bombo	c45f646b9d	gha: k8s: Ensure cluster doesn't exist before creating it The cluster cleanup step will sometimes fail to run, meaning the next run would fail in the cluster creation step. This PR addresses that. Example: https://github.com/kata-containers/kata-containers/actions/runs/5349582743/jobs/9867845852 Fixes: #7242 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-07-06 15:06:30 -07:00
GabyCT	58e921eace	Merge pull request #7260 from fidencio/topic/add-nightly-jobs-follow-up-1 gha: ci: Follow up fixes for the nightly jobs	2023-07-06 15:45:13 -06:00
GabyCT	54da0d7c91	Merge pull request #7230 from GabyCT/topic/enabmemory tests: Enable memory usage metrics tests	2023-07-06 14:30:56 -06:00
Fabiano Fidêncio	1a7bbcd398	gha: ci: Fix typo pull_requesst -> pull_request Thanks David Esparza for pointing this one out. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-06 22:29:00 +02:00
Fabiano Fidêncio	ddf4afb961	gha: ci: Fix set-fake-pr-number job It has to have steps declared, and we need to make it a dependency for the nightly kata-containers-ci-on-push job. Fixes: #7247 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-06 22:02:08 +02:00
Fabiano Fidêncio	8a0a66655d	gha: ci: schedule expects a list, not a map And because of that we need to declare '- cron', instead of 'cron'. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-06 22:02:08 +02:00
Fabiano Fidêncio	5c0269dc5a	gha: ci: Add pr-number input to the correct job It must have been an input for the AKS jobs, not the SNP one. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-06 22:02:08 +02:00
Fabiano Fidêncio	de83cd9de7	gha: ci: Use $VAR instead of ${{ env.VAR }} Otherwise we'll get the following error from the workflow: ``` The workflow is not valid. .github/workflows/ci-on-push.yaml (Line: 24, Col: 20): Unrecognized named-value: 'env'. Located at position 1 within expression: env.COMMIT_HASH .github/workflows/ci-on-push.yaml (Line: 25, Col: 18): Unrecognized named-value: 'env'. Located at position 1 within expression: env.PR_NUMBER ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-06 22:02:08 +02:00
Wainer Moschetta	1a4ae1ef47	Merge pull request #6953 from fidencio/topic/add-nightly-jobs gha: Add nightly jobs	2023-07-06 14:50:10 -03:00
Gabriela Cervantes	6acce83e12	metrics: Fix the call to check_metrics function This PR fixes the call to the check_metrics function, as KATA_HYPERVISOR does not need to be passed. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-06 17:22:49 +00:00
David Esparza	0bd21c173a	Merge pull request #7240 from dborquez/storing_metrics_artifacts metrics: storing metrics workflow artifacts	2023-07-06 09:49:45 -06:00
Fabiano Fidêncio	152e2509ca	Merge pull request #7238 from fidencio/topic/gha-run-tests-on-specific-namespace gha: k8s: Ensure tests are running on a specific namespace	2023-07-06 17:25:00 +02:00
Fabiano Fidêncio	e067d18333	gha: Add a nightly CI job The idea is to mimic what's been done with Jenkins and the "Green CI" effort, but now using our GHA and the GHA infrastructure. Fixes: #7247 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-06 14:39:49 +02:00
Fabiano Fidêncio	7c0de8703c	gha: k8s: Ensure tests are running on a specific namespace Let's make sure we run our tests in a specific namespace, as in case of any kind of issue, we will just get rid of the namespace itself, which will take care of cleaning up any leftover from failing tests. One important thing to mention is why we can get rid of the `namespace: ${namespace}` on the tests that are already using it, and let's do it in parts: * namespace: default We can easily get rid of this as that's the default namespace where pods are created, so it was a no-op so far. * namespace: test-quota-ns My understanding is that we'd need this in order to get a clean namespace to set a quota for. Doing this in the namespace that's only used for tests should not cause any side-effect on the tests, as we're running those in serial and there are no other pods running on the `kata-containers-k8s-tests` namespace Last but not least, we're not dynamically creating namespaces as the tests never run in parallel, neither in the case of having 2 tests being run at the same time, nor in the case of having 2 jobs being scheduled to the same machine. Fixes: #6864 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-06 14:14:50 +02:00
Fabiano Fidêncio	106e305717	gha: Create a re-usable `ci.yaml` file This is based on the `ci-on-push.yaml` file, and it's called from there. The reason to split on a new file is that we can easily introduce a `ci-nightly.yaml` file and re-use the `ci.yaml` file there as well. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-06 13:07:59 +02:00
Fabiano Fidêncio	cc3993d860	gha: Pass event specific info from the caller workflow Let's ensure we're not relying, on any of the called workflows, on event specific information. Right now, the two pieces of information we've been relying on are: * PR number, coming from github.event.pull_request.number * Commit hash, coming from github.event.pull_request.head.sha As we want to, in the future, add nightly jobs, which will be triggered by a different event (thus, having different fields populated), we should ensure that those are not used unless it's in the "top action" that's triggered by the event. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-06 11:23:17 +02:00
David Esparza	4e396e7285	metrics: Add function keyword to helper metrics functions Use the 'function' keyword to prevent bash aliases from colliding with other function's name. Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-07-05 20:59:21 -06:00
David Esparza	1ca17c2f70	metrics: storing metrics workflow artifacts This PR enables storing metrics workflow artifacts in two separate flavours: clh and qemu. Fixes: #7239 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-07-05 20:57:10 -06:00
David Esparza	a3fc673121	Merge pull request #7181 from dborquez/add_blogbench_and_webtooling metrics: Adds blogbench and webtool metrics tests	2023-07-05 20:37:33 -06:00
Gabriela Cervantes	5a61065ab7	checkmetrics: Add checkmetrics value for memory usage in qemu This PR adds the checkmetrics value for memory usage in qemu. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-05 19:22:12 +00:00
Gabriela Cervantes	78086ed1fe	checkmetrics: Add memory usage value for clh This PR adds the memory usage value for clh. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-05 19:19:04 +00:00
Gabriela Cervantes	1c3dbafbf0	metrics: Fix function of how to retrieve multiple values This PR fixes the function of how to add multiple values of pss memory. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-05 18:19:36 +00:00
Gabriela Cervantes	18968f428f	metrics: Add function to have uniformity This PR adds the function name before the function to have uniformity across all the tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-05 18:15:31 +00:00
David Esparza	35d096b607	metrics: Adds blogbench and webtool metrics tests This PR adds blogbench and webtooling metrics checks to this repo. The function running the test intentionally returns zero, so the test will be enabled in another PR once the workflow is green. Fixes: #7069 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-07-04 14:38:52 -06:00
Gabriela Cervantes	d8f90e89d5	metrics: Rename function at memory usage script This PR renames the function name for the memory usage script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-04 19:58:09 +00:00
Gabriela Cervantes	b9d66e0d53	metrics: Fix double quotes variables in memory usage script This PR uses double quotes in all the variables as well as general fixes to the memory usage script in order to have uniformity. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-04 19:51:36 +00:00
Gabriela Cervantes	476a11194a	tests: Enable memory usage metrics tests This PR enables the memory usage metrics tests for kata CI. Fixes #7229 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-04 16:11:54 +00:00
Fabiano Fidêncio	a25d5b9807	Merge pull request #7222 from jepio/fix-dragonball-check gha: dragonball: Correctly propagate PATH update	2023-07-04 15:59:13 +02:00
Jeremi Piotrowski	b568c7f7d8	tests/integration: Provide default value for KATA_HOST_OS Non AKS k8s tests (SEV/SNP/TDX) don't currently set KATA_HOST_OS, so provide a default empty value for the variable so that those tests can run. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-07-04 14:28:29 +02:00
Fabiano Fidêncio	6d2e6ed7b6	Merge pull request #7217 from likebreath/0630/clh_v33.0 versions: Upgrade to Cloud Hypervisor v33.0	2023-07-04 12:52:26 +02:00
Jeremi Piotrowski	d6e96ea06d	tests/integration: Use AzureLinux instead of Mariner as OSSKU value, to get rid of this warning when creating the AKS cluster: WARNING: The osSKU "AzureLinux" should be used going forward instead of "CBLMariner" or "Mariner". The osSKUs "CBLMariner" and "Mariner" will eventually be deprecated. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-07-04 12:49:07 +02:00
Jeremi Piotrowski	40c46c75ed	tests/integration: Perform yq install in run_tests() We only need to install in run_tests() so that the yq install is picked up by kubernetes/setup.sh as well. We also need to either use (sudo && INSTALL_IN_GOPATH=false) || (INSTALL_IN_GOPATH=true). Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-07-04 12:49:07 +02:00
Bin Liu	f214058b07	Merge pull request #7202 from wedsonaf/macros Convert `is_allowed`, `ttrpc_error` and `sl` to functions	2023-07-04 14:23:08 +08:00
Peng Tao	f5658c7833	Merge pull request #7224 from fidencio/topic/gha-release-fix-hub-download gha: release: Use a specific release of hub	2023-07-04 10:21:17 +08:00
GabyCT	5950df7d95	Merge pull request #7199 from GabyCT/topic/installchem metrics: Add checkmetrics to gha-run.sh for metrics CI	2023-07-03 17:49:18 -06:00
Gabriela Cervantes	d8b8f7e94d	metrics: Enable launch tests time metrics This PR enables the launch tests metrics for kata CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-03 22:38:04 +00:00
Fabiano Fidêncio	72fd562bd6	gha: release: Use a specific release of hub ideally we should never ever use hub again, and switch to a supported / release tool instead. However, in order to get v3.1.3 released, let's just stick to the last released version of hub, as trying to get its release is leading to: ``` curl -s "https://api.github.com/repos/github/hub/releases/latest" { "message": "Moved Permanently", "url": "https://api.github.com/repositories/401025/releases/latest", "documentation_url": "https://docs.github.com/v3/#http-redirects" } ``` And that breaks the release process. :-/ Fixes: #7223 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-03 22:00:55 +02:00
Fabiano Fidêncio	a7340a63a4	Merge pull request #7209 from GabyCT/topic/fixbuildovmf packaging: Fix indentation of build.sh script at ovmf	2023-07-03 20:06:29 +02:00
Gabriela Cervantes	0502354b42	checkmetrics: Add checkmetrics json for qemu This PR adds checkmetrics json file for qemu metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-03 16:47:03 +00:00
Gabriela Cervantes	b481ef1883	makefile: Add -buildvcs=false flag to go build This PR adds the -buildvcs=false flag to the go build of checkmetrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-03 16:42:51 +00:00
Gabriela Cervantes	e94aaed3c7	ci_worker: Add checkmetrics ci worker for cloud hypervisor This PR adds the checkmetrics ci worker file for cloud hypervisor in order to check the boot times limit. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-03 16:42:51 +00:00
Gabriela Cervantes	917576e6fb	metrics: Add double quotes in all variables This PR adds double quotes in all variables to have uniformity across all the gha-run.sh script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-03 16:42:50 +00:00
Gabriela Cervantes	cc8f0a24e4	metrics: Add checkmetrics to gha-run.sh for metrics CI This PR adds checkmetrics installation for gha-run.sh in order to compare results limits as part of the metrics CI. Fixes #7198 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-07-03 16:41:31 +00:00
Jeremi Piotrowski	477856c1e3	gha: dragonball: Correctly propagate PATH update cargo/rust is installed in one step, we need to write the PATH update to GITHUBENV so that it becomes visible in the next steps. Fixes: #7221 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-07-03 17:05:12 +02:00
Fupan Li	b6307c2744	Merge pull request #5444 from zvonkok/vra doc: Add documentation for the virtualization reference architecture	2023-07-03 10:14:20 +08:00
Peng Tao	c85aff7ef4	Merge pull request #6949 from zvonkok/kernel-fixes gpu: Update kernel building to the latest changes	2023-07-03 09:53:08 +08:00
Peng Tao	581be92b25	Merge pull request #4492 from zvonkok/pcie-topology runtime: fix PCIe topology for GPUDirect use-case	2023-07-03 09:17:12 +08:00
David Esparza	d01762dc35	Merge pull request #7174 from dborquez/add_memory_footprint_test metrics: Add memory footprint tests	2023-06-30 16:32:10 -06:00
Fabiano Fidêncio	00b0755e3e	Merge pull request #7200 from fidencio/topic/add-virtiofs-none-option runtime: Add "none" as a shared_fs option	2023-06-30 22:45:39 +02:00
Aurélien Bombo	1c211cd730	gha: Swap asset/release in build matrix This simply displays the asset name first in GH's UI, so that the release name (always "test") is truncated rather than the asset name. Makes things slightly easier to read. e.g. build-asset (cloud-hypervisor-glibc, te... instead of build-asset (test, cloud-hypervisor-gli... Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-06-30 12:51:40 -07:00
Aurélien Bombo	0152c9aba5	tools: Introduce `USE_CACHE` environment variable This allows setting `USE_CACHE=no` to test building e2e during development without having to comment code blocks and so forth. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-06-30 12:51:40 -07:00
Aurélien Bombo	2b59756894	tests: Build CLH with glibc for Mariner This enables building CLH with glibc and the mshv feature as required for Mariner. At test time, it also configures Kata to use that CLH flavor when running Mariner. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-06-30 12:51:40 -07:00
Aurélien Bombo	80c78eadce	tests: Use baked-in kernel with Mariner Mariner ships a bleeding-edge kernel that might be ahead of upstream, so we use that to guarantee compatibility with the host. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-06-30 12:51:40 -07:00
Aurélien Bombo	532755ce31	tests: Build Mariner rootfs initrd * Adds a new `rootfs-initrd-mariner` build target. * Sets the custom initrd path via annotation in `setup.sh` at test time. * Adapts versions.yaml to specify a `cbl-mariner` initrd variant. * Introduces env variable `HOST_OS` at deploy time to enable using a custom initrd. * Refactors the image builder so that its caller specifies the desired guest OS. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-06-30 12:51:40 -07:00
Fabiano Fidêncio	6a21e20c63	runtime: Add "none" as a shared_fs option Currently, even when using devmapper, if the VMM supports virtio-fs / virtio-9p, that's used to share a few files between the host and the guest. This is needed, as we need to share with the guest contents like secrets, certificates, and configurations, via Kubernetes objects like configMaps or secrets, and those are rotated and must be updated into the guest whenever the rotation happens. However, there are still use-cases where users can live with just copying those files into the guest at the pod creation time, and for those there's absolutely no need to have a shared filesystem process running with no extra obvious benefit, consuming memory and even increasing the attack surface used by Kata Containers. For the case mentioned above, we should allow users, making it very clear which limitations it'll bring, to run Kata Containers with devmapper without actually having to use a shared file system, which is already the approach taken when using Firecracker as the VMM. Fixes: #7207 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-06-30 20:45:00 +02:00
Bo Chen	5681caad5c	versions: Upgrade to Cloud Hypervisor v33.0 Details of this release can be found in our roadmap project as iteration v33.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #7216 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-06-30 09:37:27 -07:00
David Esparza	b2ce8b4d61	metrics: Add memory footprint tests to the CI This PR adds memory foot print metrics to tests/metrics/density folder. Intentionally, each test exits w/ zero in all test cases to ensure that tests would be green when added, and will be enabled in a subsequent PR. A workflow matrix was added to define hypervisor variation on each job, in order to run them sequentially. The launch-times test was updated to make use of the matrix environment variables. Fixes: #7066 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-06-30 09:52:27 -06:00
David Esparza	5e3f617cb6	Merge pull request #7197 from GabyCT/topic/fixfunctionname metrics: Uniformity across function names in gha-run.sh	2023-06-30 09:37:15 -06:00
Zvonko Kaiser	d035955ef5	doc: Add documentation for the virtualization reference architecture Fixes: #4041 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-30 12:30:37 +00:00
Zvonko Kaiser	0f454d0c04	gpu: Fixing typos for PCIe topology changes Some comments and functions had typos and wrong capitalization. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-30 08:42:55 +00:00
Gabriela Cervantes	6bb2ea8195	packaging: Fix indentation of build.sh script at ovmf This PR fixes the indentation of build.sh script at ovmf. Fixes #7208 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-29 15:46:54 +00:00
Fupan Li	4288b935e1	Merge pull request #7104 from openanolis/physical/endpoint runtime-rs: support physical endpoint using device manager	2023-06-29 14:43:44 +08:00
GabyCT	19890133e9	Merge pull request #7189 from Apokleos/direct-vol-bugfix runtime-rs: bugfix for direct volume path's validation.	2023-06-28 12:26:22 -06:00
Wedson Almeida Filho	0504bd7254	agent: convert the `sl` macros to functions There is nothing in them that requires them to be macros. Converting them to functions allows for better error messages. Fixes: #7201 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-06-28 14:05:32 -03:00
Wedson Almeida Filho	0860fbd410	agent: convert the `ttrpc_error` macro to a function There is nothing in it that requires it to be a macro. Converting it to a function allows for better error messages. Fixes: #7201 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-06-28 14:05:32 -03:00
Wedson Almeida Filho	0e5d6ce6d7	agent: convert the `is_allowed` macro to a function Having a function allows for better error messages from the type checker and it makes it clearer to callers what can happen. For example: is_allowed!(req); Gives no indication that it may result in an early return, and no simple way for callers to modify the behaviour. It also makes it look like ownership of `req` is being transferred. On the other hand, is_allowed(&req)?; Indicates that `req` is being borrowed (immutably) and may fail. The question mark indicates that the caller wants an early return on failure. Fixes: #7201 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-06-28 14:05:32 -03:00
Wedson Almeida Filho	f680fc52be	agent: change `AGENT_CONFIG`'s lazy type to just `AgentConfig` Since it is never modified, it doesn't really need a lock of any kind. Removing the `RwLock` wrapper allows us to remove all `.read().await` calls when accessing it. Additionally, `AGENT_CONFIG` already has a static lifetime, so there is no need to wrap it in a ref-counted heap allocation. Fixes: #5409 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-06-28 14:05:27 -03:00
GabyCT	3f87d0fbfe	Merge pull request #7180 from dborquez/run_ret_hypervisor_version_w_sudo metrics: Fix retrieving hypervisor version on metrics	2023-06-28 10:54:23 -06:00
Gabriela Cervantes	beb7063683	metrics: Uniformity across function names This PR adds the word function before the function names in order to have uniformity across the script as some are using this and some are not. Fixes #7196 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-28 16:09:19 +00:00
Fabiano Fidêncio	c8d33da8a4	Merge pull request #7188 from jongwu/fix_vfio runtime-rs: fix build error on AArch64	2023-06-28 15:43:14 +02:00
Jianyong Wu	1f3e837e4b	runtime-rs: fix build error on AArch64 VFIO support introduced a build error on AArch64. Removing the arch-related annotation avoids this error. Fixes: #7187 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-06-28 07:10:43 +00:00
alex.lyn	6fd25968c6	runtime-rs: bugfix for direct volume path's validation. The failure is mainly caused by the encoded volume path and the mount/src. The src is validated with stat, but it is encoded and not a full path, which causes the stat of the mount source to fail. Fixes: #7186 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-06-28 10:07:07 +08:00
GabyCT	3885ba4910	Merge pull request #7173 from GabyCT/topic/addcheckm checkmetrics: Add checkmetrics makefile and documentation	2023-06-27 16:30:44 -06:00
Gabriela Cervantes	415578cf3b	docs: Add general README This PR adds a link to the unreferenced docs in the cmd path to make them more discoverable. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-27 20:29:37 +00:00
Zhongtao Hu	c76583a08f	Merge pull request #7171 from GabyCT/topic/enabletimedoc docs: Add boot time metrics documentation	2023-06-27 10:28:56 +08:00
Zhongtao Hu	bff4672f7d	runtime-rs: support physical endpoint using device manager use device manager to attach physical endpoint Fixes: #7103 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-06-27 10:25:51 +08:00
David Esparza	32cba7e44a	metrics: Fix retrieving hypervisor version on metrics This PR makes use of sudo to retrieve the hypervisor version. Fixes: #7178 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-06-26 16:26:27 -06:00
Gabriela Cervantes	aa7946de47	checkmetrics: Add general checkmetrics documentation This PR adds the general checkmetrics documentation for kata metrics tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-26 17:07:57 +00:00
Gabriela Cervantes	2fac2b72fe	checkmetrics: Add checkmetrics makefile This PR adds checkmetrics makefile which is used to process the metrics json results files. Fixes #7172 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-26 16:31:55 +00:00
Gabriela Cervantes	e45899ae0e	docs: Add time tests documentation reference This PR adds time tests documentation reference in the general README for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-26 16:30:20 +00:00
Gabriela Cervantes	28130d3cef	docs: Add boot time metrics documentation This PR adds boot time metrics documentation for kata metrics tests. Fixes #7170 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-26 16:19:28 +00:00
Zhongtao Hu	ce8e3cc091	Merge pull request #7073 from Apokleos/spdk-vol runtime-rs: add support spdk/vhost-user based volume.	2023-06-26 11:34:44 +08:00
alex.lyn	0df2fc2702	runtime-rs: add support spdk/vhost-user based volume. Unlike the previous usage which requires creating /dev/xxx by mknod on the host, the new approach will fully utilize the DirectVolume-related usage method, and pass the spdk controller to vmm. A user guide on using the spdk volume when running kata-containers can be found in docs/how-to. Fixes: #6526 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-06-25 16:23:19 +08:00
GabyCT	4cf552c151	Merge pull request #7097 from stevenhorsman/remove-unecessary-kata-versions static-build: Remove kata-version parameter	2023-06-23 16:53:57 -06:00
GabyCT	388b55175e	Merge pull request #7056 from FuuuOverclocking/fuu/fix-console_manager dragonball: avoid obtaining lock twice in create_stdio_console	2023-06-23 16:47:00 -06:00
GabyCT	1a80fd66a2	Merge pull request #7161 from GabyCT/topic/enablemetricslimits metrics: Add checkmetrics for kata metrics CI	2023-06-23 16:45:16 -06:00
Gabriela Cervantes	17198089ee	vendor: Add vendor checkmetrics dependencies This PR adds the vendor for the checkmetrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-23 20:55:30 +00:00
David Esparza	cfd6da9467	Merge pull request #7159 from dborquez/enable_launchtimes_test metrics: enable launch-times test on gha-run metrics script	2023-06-23 12:59:46 -06:00
GabyCT	d6ff48f4e7	Merge pull request #7158 from GabyCT/topic/addmetricsreadme docs: Add general metrics documentation	2023-06-23 11:28:00 -06:00
Gabriela Cervantes	f1dfea6e87	docs: Add metrics documentation reference This PR adds the metrics documentation as a general reference in the main README for kata containers. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-23 16:26:34 +00:00
Zvonko Kaiser	8330fb8ee7	gpu: Update unit tests Some tests are now failing due to the changes how PCIe is handled. Update the test accordingly. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-23 11:16:25 +00:00
David Esparza	8593594247	metrics: enable launch-times test on gha-run metrics script This PR enables launch-times test on gha metrics workflow. Fixes: #7049 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-06-22 18:05:46 -06:00
Fupan Li	469c678425	Merge pull request #7058 from Apokleos/vfio-dev add support vfio device manager	2023-06-22 17:51:22 -06:00
Gabriela Cervantes	c4ee601bf4	metrics: Add checkmetrics for kata metrics CI This PR adds the checkmetrics scripts that will be used for the kata metrics CI. Fixes #7160 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-22 21:06:46 +00:00
Steve Horsman	267e97f9c0	Merge pull request #7162 from sprt/trusted-pr-authors gha: Don't automatically trigger CI	2023-06-22 20:55:10 +01:00
Aurélien Bombo	e0d6475b49	gha: Don't automatically trigger CI We have GH configured so that manual approval is required for CI runs triggered by outside contributors. However, because CI is triggered by the `pull_request_target` event, this setting isn't being honored (see [1]). This means that an attacker could trivially extract secrets by submitting a PR. This change aims to mitigate this issue by preventing PRs from triggering CI unless the `ok-to-test` label is set. Note: For further context, we use the `pull_request_target` event and manually check out the PR branch because it is the only way to both access secrets and test incoming code changes. Fixes: #7163 [1]: https://docs.github.com/en/actions/managing-workflow-runs/approving-workflow-runs-from-public-forks Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-06-22 11:05:53 -07:00
Aurélien Bombo	b535c7cbd8	tests: Enable running k8s tests on Mariner This removes the gate and lets CI run tests on Mariner. Fixes: #6840 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-06-22 10:30:52 -07:00
Archana Shinde	2d329125fd	Merge pull request #6800 from amshinde/check-vm-capability kata-ctl: Check for vm capability	2023-06-21 23:52:46 -07:00
Zhongtao Hu	4b793222ab	Merge pull request #7154 from cheriL/7153/fix_spellings docs: fix spelling of "crate"	2023-06-22 10:48:58 +08:00
Gabriela Cervantes	71071bdb63	docs: Add general metrics documentation This PR adds a general metrics introduction documentation for the kata CI. Fixes #7157 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-21 17:19:36 +00:00
Archana Shinde	610f7986e4	check: Relax the unrestricted_guest check when running in a VM When running on a VM, the kernel parameter "unrestricted_guest" for kernel module "kvm_intel" is not required. So, return success when running on a VM without checking value of this kernel parameter. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-06-21 07:30:35 -07:00
Archana Shinde	1b406b9d0c	kata-ctl:Implement functionality to check host is capable of running VM Implement functionality to add to the env output if the host is capable of running a VM. Fixes: #6727 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-06-21 07:30:22 -07:00
David Esparza	90408d66c0	Merge pull request #7148 from GabyCT/topic/fixtabsinitscript packaging: Fix indentation in init.sh script	2023-06-21 07:24:25 -06:00
stevenhorsman	adf88eaa89	static-build: Remove kata-version parameter - Remove the unnecessary kata-version passed as a second parameter Fixes: #7096 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-06-21 10:15:42 +01:00
soup	09720babc3	docs: fix spelling of "crate" Fixes: #7153 Signed-off-by: soup <lqh348659137@outlook.com>	2023-06-21 16:10:54 +08:00
David Esparza	84b214d9d2	Merge pull request #7150 from GabyCT/topic/fixworkflows gha: Fix gha actions	2023-06-20 18:08:23 -06:00
Gabriela Cervantes	7185afc50e	gha: Fix gha actions This PR removes an unrecognized value located in one of the yamls for the gha in order to make the CI work again. Fixes #7149 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-20 23:13:25 +00:00
Gabriela Cervantes	21294b868d	packaging: Fix indentation in init.sh script This PR replaces single spaces for tabs in order to fix the indentation in the init.sh script. Fixes #7147 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-20 22:06:52 +00:00
GabyCT	90e36f43ff	Merge pull request #7138 from dborquez/setup-kata-and-configure-launchtimes-test metrics: install kata and launch-times test	2023-06-20 16:00:38 -06:00
David Esparza	fad3ac9f58	metrics: install kata and launch-times test This PR installs kata static tarball on metrics runner and run launch-times tests. Fixes: #7049 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-06-20 13:58:09 -06:00
David Esparza	d071a87c7b	Merge pull request #7109 from dborquez/add_common_libs_for_metrics tests: Move tests helper script to this repo	2023-06-19 19:02:37 -06:00
David Esparza	4bbfcfaf15	tests: Move tests helper script to this repo The common.sh script includes helper functions used in our metrics tests, so we are gradually adding more metrics used in kata. Fixes: #7108 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-06-19 12:14:25 -06:00
David Esparza	f152f0e8c3	metrics: Add launch-times to metrics tests This test measures the duration of a workload that starts, and then immediately stops the container. Also measures the workload period, the time to quit period, and the time to kernel period. Fixes: #7049 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-06-19 10:40:16 -06:00
GabyCT	decbe77e28	Merge pull request #7129 from GabyCT/topic/metrlibjson tests: Add json script for metrics tests	2023-06-19 09:59:41 -06:00
Fabiano Fidêncio	ef8b360711	Merge pull request #7085 from stevenhorsman/cherry-pick-initramfs Cherry pick initramfs caching updates from CCv0	2023-06-19 11:59:00 +02:00
alex.lyn	59510cfee0	runtime-rs: add support vfio device based volume A new choice of using vfio device based volume for kata-containers. With the help of kata-ctl direct-volume, users are able to add a specified device which is BDF or IOMMU group ID. To help users use it smoothly, a how-to doc was added in docs/how-to/how-to-run-kata-containers-with-kinds-of-Block-Volumes. Fixes: #6525 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-06-18 14:07:05 +08:00
alex.lyn	1e3b372bbb	runtime-rs: add support vfio device manager Limitations: As no rust vmm vfio manager is ready yet, it only supports part of vfio in runtime-rs. The remaining part is to call the vmm interfaces related to vfio add/remove. So when the vmm/vfio manager is ready, a new PR will be pushed to narrow the gap. Fixes: #6525 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-06-18 14:05:59 +08:00
David Esparza	61e819ea8e	Merge pull request #7131 from GabyCT/topic/fixrunner gha: Fix format for run launchtimes metrics yaml	2023-06-16 18:30:57 -06:00
Gabriela Cervantes	6b08489301	gha: Fix format for run launchtimes metrics yaml This PR fixes the format for the run launchtimes metrics yaml which is causing the workflow to fail. Fixes #7130 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-16 22:00:36 +00:00
Gabriela Cervantes	3cefa43e75	tests: Add json script for metrics tests This PR adds the json script which allows us to save the metrics results into a json file which will be used in the kata containers metrics. Fixes #7128 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-16 19:45:26 +00:00
GabyCT	7976a0ac72	Merge pull request #7114 from GabyCT/topic/libcommontests tests: Add tests lib common script	2023-06-16 11:48:19 -06:00
Greg Kurz	27045798bf	Merge pull request #7112 from gkurz/fix-virtiofsd-args Fix deprecated virtiofsd args (go shim only)	2023-06-16 18:13:24 +02:00
Fabiano Fidêncio	6a3710055b	initramfs: Build dependencies as part of the Dockerfile This will help to not have to build those on every CI run, and rather take advantage of the cached image. Fixes: #7084 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `c720869eef`)	2023-06-16 10:58:12 +01:00
Fabiano Fidêncio	aa2380fdd6	packaging: Add infra to push the initramfs builder image Let's add the needed infra for only building and pushing the initramfs builder image to the Kata Containers' quay.io registry. Fixes: #7084 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `111ad87828`)	2023-06-16 10:58:12 +01:00
Fabiano Fidêncio	1c7fcc6cbb	packaging: Use existing image to build the initramfs Let's first try to pull a pre-existing image, instead of building our own, to be used as a builder for the initramfs. This will save us some CI time. Fixes: #7084 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `ebf6c83839`)	2023-06-16 10:58:12 +01:00
Greg Kurz	a43ea24dfc	virtiofsd: Convert legacy `-o` sub-options to their `--` replacement The `-o` option is the legacy way to configure virtiofsd, inherited from the C implementation. The rust implementation honours it for compatibility but it logs deprecation warnings. Let's use the replacement options in the go shim code. Also drop references to `-o` from the configuration TOML file. Fixes #7111 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-06-16 11:42:54 +02:00
Greg Kurz	8e00dc6944	virtiofsd: Drop `-o no_posix_lock` The C implementation of virtiofsd had some kind of limited support for remote POSIX locks that was causing some workflows to fail with kata. Commit `432f9bea6e` hard coded `-o no_posix_lock` in order to enforce guest local POSIX locks and avoid the issues. We've switched to the rust implementation of virtiofsd since then, but it emits a warning about `-o` being deprecated. According to https://gitlab.com/virtio-fs/virtiofsd/-/issues/53 : The C implementation of the daemon has limited support for remote POSIX locks, restricted exclusively to non-blocking operations. We tried to implement the same level of functionality in #2, but we finally decided against it because, in practice most applications will fail if non-blocking operations aren't supported. Implementing support for non-blocking isn't trivial and will probably require extending the kernel interface before we can even start working on the daemon side. There is thus no justification to pass `-o no_posix_lock` anymore. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-06-16 11:42:39 +02:00
Greg Kurz	2a15ad9788	virtiofsd: Stop using deprecated `-f` option The rust implementation of virtiofsd always runs foreground and spits a deprecation warning when `-f` is passed. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-06-16 10:30:40 +02:00
David Esparza	b9d92f4577	Merge pull request #7117 from dborquez/add_checkout_metrics_workflow gha: Add base branch on SHA on pull request	2023-06-15 17:06:16 -06:00
Gabriela Cervantes	c3043a6c60	tests: Add tests lib common script This PR adds the test lib common script that is going to be used for kata containers metrics. Fixes #7113 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-15 21:23:00 +00:00
David Esparza	b16e0de734	gha: Add base branch on SHA on pull request The run-launchtimes-metrics workflow needs to get the commit ID for the last commit to the head branch of the PR. Fixes: #7116 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-06-15 13:11:33 -06:00
Zvonko Kaiser	72f2cb84e6	gpu: Reset cold or hot plug after overriding If we override the cold, hot plug with an annotation we need to reset the other plugging mechanism to NoPort otherwise both will be enabled. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-15 17:51:01 +00:00
Zvonko Kaiser	fbacc09646	gpu: PCIe topology, consider vhost-user-block in Virt In Virt the vhost-user-block is a PCIe device so we need to make sure to consider it as well. We're keeping track of vhost-user-block devices and deduce the correct amount of PCIe root ports. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-15 17:39:55 +00:00
GabyCT	0f24f427d7	Merge pull request #7101 from dborquez/add_initial_metrics_gh_workflow gha: ci-on-push: Run metrics tests	2023-06-15 10:08:56 -06:00
David Esparza	bc152b1141	gha: ci-on-push: Run metrics tests This gh-workflow prints a simple msg, but is the base for future PRs that will gradually add the jobs corresponding to the kata metrics test. Fixes: #7100 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-06-14 15:15:08 -06:00
GabyCT	a3180d0cb8	Merge pull request #7095 from GabyCT/topic/updatedebugconse docs: Update Developer Guide	2023-06-14 13:49:37 -06:00
Gabriela Cervantes	dad731d5c1	docs: Update Developer Guide This PR updates the developer guide at the connect to the debug console section. Fixes #7094 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-14 15:36:51 +00:00
Zhongtao Hu	11692a76e1	Merge pull request #7092 from Apokleos/virtiofs-enhancement runtime-rs: Enhance flexibility of virtio-fs config	2023-06-14 20:01:46 +08:00
Zvonko Kaiser	b11246c3aa	gpu: Various fixes for virt machine type The PCI qom path was not deduced correctly added regex for correct path walking. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-14 08:33:57 +00:00
Zvonko Kaiser	40101ea7db	vfio: Added annotation for hot(cold) plug Now it is possible to configure the PCIe topology via annotations and added a simple test, checking for Invalid and RootPort Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-14 08:20:24 +00:00
Zvonko Kaiser	8f0d4e2612	vfio: Cleanup of Cold and Hot Plug Removed the configuration of PCIeRootPort and PCIeSwitchPort, those values can be deduced in createPCIeTopology Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-14 08:20:24 +00:00
Zvonko Kaiser	b5c4677e0e	vfio: Rearrange the bus assignment Refactor the bus assignment so that the call to GetAllVFIODevicesFromIOMMUGroup can be used by any module without affecting the topology. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-14 08:20:24 +00:00
Zvonko Kaiser	b1aa8c8a24	gpu: Moved the PCIe configs to drivers The hypervisor_state file was the wrong location for the PCIe Port settings, moved everything under device umbrella, where it can be consumed more easily and we do not get into circular deps. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-14 08:20:24 +00:00
Zvonko Kaiser	55a66eb7fb	gpu: Add config to TOML Update cold-plug and hot-plug setting to include bridge, root and switch-port Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-14 08:20:24 +00:00
Zvonko Kaiser	da42801c38	gpu: Add config settings tests for hot-plug Updated all references and config settings for hot-plug to match cold-plug Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-14 08:20:24 +00:00
Zvonko Kaiser	de39fb7d38	runtime: Add support for GPUDirect and GPUDirect RDMA PCIe topology Fixes: #4491 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-14 08:20:24 +00:00
Zvonko Kaiser	9318e022af	gpu: Add CC related configs For the GPU CC use case we need to set several crypto algorithms. The driver relies on them in the CC case. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-14 07:56:53 +00:00
Zvonko Kaiser	b7932be4b6	gpu: Add Arm64 Kernel Settings For different archs we need different settings, use ${ARCH} to choose the right fragment Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-14 07:56:53 +00:00
Zvonko Kaiser	211b0ab268	gpu: Update Kernel Config Newer drivers need more symbols so lets enable them Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-14 07:56:53 +00:00
Zvonko Kaiser	5f103003d6	gpu: Update kernel building to the latest changes Use now the sev.conf rather than the snp.conf. Devices can be presented in two different ways in the container (1) as vfio devices /dev/vfio/<num> (2) the device is managed by whatever driver in the VM kernel claims it. Fixes: #6844 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-06-14 07:56:53 +00:00
Fabiano Fidêncio	95bec479ca	Merge pull request #7090 from GabyCT/topic/ufcversion versions: Update firecracker version to 1.3.3	2023-06-14 01:24:02 +02:00
Fabiano Fidêncio	8aa4a87fae	Merge pull request #7099 from sprt/fix-new-targets tools: Fix no-op builds	2023-06-14 01:23:39 +02:00
Aurélien Bombo	35e4938e8c	tools: Fix no-op builds This fixes the builds of `cloud-hypervisor-glibc` and `rootfs-initrd-mariner` to properly create the `build/` directory. Fixes: #7098 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-06-13 10:56:49 -07:00
Zhongtao Hu	da8dde0c24	Merge pull request #7079 from HerlinCoder/herlincoder/vpa runtime-rs: update Cargo.lock	2023-06-13 21:44:45 +08:00
Fabiano Fidêncio	ff38937246	Merge pull request #7087 from sprt/fix-gha-stage gha: Fix `stage` definition in matrix	2023-06-13 12:17:25 +02:00
alex.lyn	347385b4ee	runtime-rs: Enhance flexibility of virtio-fs config support more and flexible options for inline virtiofs. Fixes: #7091 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-06-13 15:12:47 +08:00
Zhongtao Hu	355a24e0e1	Merge pull request #6289 from openanolis/runtime_vcpu_resize feat(runtime): vcpu resize capability	2023-06-13 10:54:11 +08:00
Chelsea Mafrica	1763b1f69f	Merge pull request #7082 from jodh-intel/remove-snap packaging: Remove snap package	2023-06-12 17:05:00 -07:00
Gabriela Cervantes	21d2278539	versions: Update firecracker version to 1.3.3 This PR updates the firecracker version to 1.3.3 which includes the following changes Fixed passing through cache information from host in CPUID leaf 0x80000006. A race condition that has been identified between the API thread and the VMM thread due to a misconfiguration of the api_event_fd. Fixes #7089 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-12 20:32:02 +00:00
Aurélien Bombo	0e2379909b	gha: Fix `stage` definition in matrix This defines `stage` as a list instead of a literal to fix the GHA CI. Fixes: #7086 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-06-12 11:24:45 -07:00
Fabiano Fidêncio	977309a281	Merge pull request #7027 from sprt/sprt/mariner-build-targets gha: Add new build targets for Mariner	2023-06-12 19:19:22 +02:00
Yushuo	ae2cfa8263	doc: add vcpu handling doc for runtime-rs Kubernetes and Containerd will help calculate the Sandbox Size and pass it to Kata Containers through annotations. In order to accommodate this favorable change and be compatible with the past, we have implemented the handling of the number of vCPUs in runtime-rs. This is slightly different from the original runtime-go design. This doc introduces how we handle vCPU size in runtime-rs. Fixes: #5030 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com> Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2023-06-12 19:23:11 +08:00
Yushuo	7b1e67819c	fix(clippy): fix clippy error Fixes: #5030 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com> Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2023-06-12 17:53:16 +08:00
Yushuo	67972ec48a	feat(runtime-rs): calculate initial size In this commit, we refactored the logic of static resource management. We defined the sandbox size calculated from PodSandbox's annotation and SingleContainer's spec as initial size, which will always be the sandbox size when booting the VM. The configuration static_sandbox_resource_mgmt controls whether we will modify the sandbox size in the following container operation. Signed-off-by: Yushuo <y-shuo@linux.alibaba.com> Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2023-06-12 17:53:16 +08:00
Yushuo	aaa96c749b	feat(runtime-rs): modify onlineCpuMemRequest Some vmms, such as dragonball, will actively help us perform online cpu operations when doing cpu hotplug. Under the old onlineCpuMem interface, it is difficult to adapt to this situation. So we modify the semantics of nb_cpus in onlineCpuMemRequest. In the original semantics, nb_cpus represents the number of newly added CPUs that need to be online. The modified semantics become that the number of online CPUs in the guest needs to be guaranteed. Fixes: #5030 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com> Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2023-06-12 17:53:16 +08:00
Yushuo	d66f7572dd	feat(runtime-rs): clear cpuset in runtime side The declaration of the cpu number in the cpuset is greater than the actual number of vcpus, which will cause an error when updating the cgroup in the guest. This problem is difficult to solve, so we temporarily clean up the cpuset in the container spec before passing in the agent. Fixes: #5030 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com> Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2023-06-12 17:53:16 +08:00
Yushuo	a0385e1383	feat(runtime-rs): update linux resource when stop_process Update the resources when deleting a container, which happens in stop_process in runtime-rs. Fixes: #5030 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com> Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2023-06-12 17:53:16 +08:00
Yushuo	a39e1e6cd1	feat(runtime-rs): merge the update_cgroups in update_linux_resources Updating vCPU resources and memory resources of the sandbox and updating cgroups on the host will always happen together, and they are all updated based on the linux resources declarations of all the containers. So we merge update_cgroups into the update_linux_resources, so we can better manage the resources allocated to one pod in the host. Fixes: #5030 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com> Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2023-06-12 17:53:16 +08:00
Ji-Xinyou	fa6dff9f70	feat(runtime-rs): support vcpu resizing on runtime side Support vcpu resizing on runtime side: 1. Calculate vcpu numbers in resource_manager using all the containers' linux_resources in the spec. 2. Call the hypervisor(vmm) to do the vcpu resize. 3. Call the agent to online vcpus. Fixes: #5030 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com> Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2023-06-12 17:53:16 +08:00
James O. D. Hunt	8cb4238b46	packaging: Remove snap package Nobody has volunteered to maintain the (currently broken) snap build, so remove it. Fixes: #6769. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-06-12 09:24:09 +01:00
Helin Guo	2137739987	runtime-rs: update Cargo.lock After we support memory resize in Dragonball, we need to update Cargo.lock in runtime-rs. Fixes: #6719 Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>	2023-06-12 11:25:59 +08:00
Chao Wu	2988553305	Merge pull request #6998 from HerlinCoder/herlincoder/vpa Dragonball: support resize memory	2023-06-11 17:21:12 +08:00
Archana Shinde	56d2ea9b78	kata-ctl: Refactor kernel module check Adding vhost and vhost-net to the kernel modules. These do not require any kernel module parameters to be checked. Currently, kernel params is a required field. Make this optional. This could be an <Option>, but a slice is used instead, as a module could have multiple kernel params. Refactor the function that checks kernel modules into two, with one specifically checking if the module is loaded and the other checking module parameters. Refactor some of the tests to take into account these changes. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-06-09 14:10:31 -07:00
Aurélien Bombo	9f7a45996c	gha: Add `rootfs-initrd-mariner` build target This adds the Mariner guest image build target to the list of assets as preparation for #6839. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-06-09 11:36:42 -07:00
Aurélien Bombo	f28a62164a	gha: Add `cloud-hypervisor-glibc` build target This adds the glibc flavor of CLH to the list of assets as preparation for #6839. Mariner Kata is only tested with glibc. Fixes: #7026 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-06-09 11:35:50 -07:00
Fabiano Fidêncio	b50f62ce48	Merge pull request #6756 from arronwy/measured_rootfs Port Measured rootfs feature from CCv0 branch to main	2023-06-09 12:35:05 +02:00
Helin Guo	8fb7ab7518	dragonball: introduce virtio-balloon device We introduce virtio-balloon device to support memory resize. virtio-balloon device could reclaim memory from guest to host. Fixes: #6719 Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>	2023-06-09 17:47:27 +08:00
Helin Guo	7ed9494973	dragonball: introduce virtio-mem device We introduce virtio-mem device to support memory resize. virtio-mem device could hot-plug more memory blocks to guest and could also hot-unplug them from guest. Fixes: #6719 Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>	2023-06-09 17:47:21 +08:00
Chao Wu	c7c45626c9	Merge pull request #6973 from Apokleos/direct-vol add support direct volume and refactor device manager	2023-06-09 11:29:00 +08:00
alex.lyn	776a15e092	runtime-rs: add support direct volume. As block/direct volume use similar steps of device adding, making full use of the block volume code is a better way to handle direct volume. The only difference is that direct volume will use DirectVolume and get_volume_mount_info to parse mountinfo.json from the direct volume path. That is to say, direct volume needs the help of `kata-ctl direct-volume ...`. Details seen at Advanced Topics: [How to run Kata Containers with kinds of Block Volumes] docs/how-to/how-to-run-kata-containers-with-kinds-of-Block-Volumes.md Fixes: #5656 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-06-09 08:16:26 +08:00
Helin Guo	a8e0f51c52	dragonball: extend DeviceOpContext In order to support virtio-mem and virtio-balloon devices, we need to extend DeviceOpContext with VmConfigInfo and InstanceInfo. Fixes: #6719 Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>	2023-06-08 22:04:31 +08:00
alex.lyn	abae114046	runtime-rs: refactor device manager implementation The key aspects of the DM implementation refactoring as below: 1. reduce duplicated code Many scenarios have similar steps when adding devices. so to reduce duplicated code, we should create a common method abstracted and use it in various scenarios. do_handle_device: (1) new_device with DeviceConfig and return device_id; (2) try_add_device with device_id and do really add device; (3) return device info of device's info; 2. return full info of Device Trait get_device_info replace the original type DeviceConfig with full info DeviceType. 3. refactor find_device method. Fixes: #5656 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-06-08 08:47:08 +08:00
Fabiano Fidêncio	08d10d38be	Merge pull request #7048 from sprt/sprt/fix-gha gha: Fix gha-run.sh and unbreak CI	2023-06-07 23:40:02 +02:00
James O. D. Hunt	452f286552	Merge pull request #6764 from byron-marohn/fix_5401 kata-ctl: Switch to slog logging; add --log-level and --json-logging arguments	2023-06-07 16:08:53 +01:00
Fuu	210a15794c	dragonball: avoid obtaining lock twice in create_stdio_console Fixes #7055 Signed-off-by: Fuu <fuu-open@linux.alibaba.com>	2023-06-07 16:12:22 +08:00
Aurélien Bombo	69668ce87f	tests: gha-run: Use correct env variable for repo s/DOCKER_IMAGE/DOCKER_REPO Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-06-06 11:54:43 -07:00
Aurélien Bombo	f487199edf	gha: aks: Fix argument in call to gha-run.sh Fixes: #7047 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-06-06 11:51:18 -07:00
GabyCT	5ad8aaf9df	Merge pull request #7035 from GabyCT/topic/logparserdoc log-parser: Update log parser link at README	2023-06-06 12:02:25 -06:00
Fabiano Fidêncio	de2e507483	Merge pull request #6972 from sprt/sprt/gha-run-script gha: aks: Extract `run` commands to a script	2023-06-06 14:54:03 +02:00
Wang, Arron	f6afae9c73	packaging: Add rootfs-image-tdx-tarball target Add rootfs-image-tdx target: ./tools/packaging/kata-deploy/local-build/kata-deploy-binaries.sh --build=rootfs-image-tdx ./opt/kata/share/kata-containers/kata-containers-tdx.img ./opt/kata/share/kata-containers/kata-ubuntu-latest-tdx.image Fixes: #6674 Signed-off-by: Wang, Arron <arron.wang@intel.com>	2023-06-06 12:34:20 +02:00
Wang, Arron	f62b2670c0	config: Add root hash value and measure config to kernel params After we have a guest kernel with builtin initramfs which provides the rootfs measurement capability and a Kata rootfs image with hash device, we need to set the related root hash value and measure config in the kernel params in the kata configuration file. Fixes: #6674 Signed-off-by: Wang, Arron <arron.wang@intel.com>	2023-06-06 12:34:13 +02:00
Wang, Arron	0080588075	kernel: Integrate initramfs into Guest kernel Integrate initramfs into guest kernel as one binary, which will be measured by the firmware together. Fixes: #6674 Signed-off-by: Wang, Arron <arron.wang@intel.com>	2023-06-06 12:33:41 +02:00
Wang, Arron	28b2645624	initramfs: Add build script to generate initramfs The init.sh in initramfs will parse the verity scheme, roothash, root device and setup the root device accordingly. Fixes: #6674 Signed-off-by: Wang, Arron <arron.wang@intel.com>	2023-06-06 12:33:28 +02:00
Wang, Arron	5cb02a8067	image-build: generate root hash as a separate partition for rootfs Generate rootfs hash data while creating the kata rootfs. The current kata image only has one partition, so we add another partition as a hash device to save the hash data of the rootfs data blocks. Fixes: #6674 Signed-off-by: Wang, Arron <arron.wang@intel.com>	2023-06-06 12:31:14 +02:00
Arron Wang	31c0ad2076	packaging: Add cryptsetup support in Guest kernel and rootfs Add required kernel config for dm-crypt/dm-integrity/dm-verity and related crypto config. Add userspace command line tools for disk encryption support and ext4 file system utilities. Fixes: #6674 Signed-off-by: Arron Wang <arron.wang@intel.com>	2023-06-06 12:30:07 +02:00
Fabiano Fidêncio	eb1bfa922b	Merge pull request #6980 from nubificus/feat_sharefs_files runtime-rs: handle copy files when share_fs is not available	2023-06-06 12:26:55 +02:00
Chao Wu	b0c6cd05a2	Merge pull request #7033 from openanolis/fix-agent-ctl agent-ctl: fix the compile error	2023-06-06 11:55:15 +08:00
Gabriela Cervantes	980d084f47	log-parser: Update log parser link at README This PR updates the link to the correspondent Developer Guide at the enabling full containerd debug that we have for kata 2.0 documentation. Fixes #7034 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-06-05 15:59:52 +00:00
Yushuo	410bc18143	agent-ctl: fix the compile error When the version of libc is upgraded to 0.2.145, older getrandom could not adapt to the new API, which makes agent-ctl fail to compile. We upgrade the version of `rand`, so the older version of getrandom is no longer needed. Fixes: #7032 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2023-06-05 21:48:36 +08:00
Jayant Singh	77519fd120	kata-ctl: Switch to slog logging; add --log-level, --json-logging args Fixes: #5401, #6654 - Switch kata-ctl from eprintln!()/println!() to structured logging via the logging library which uses slog. - Adds a new create_term_logger() library call which enables printing log messages to the terminal via a less verbose / more human readable terminal format with colors. - Adds --log-level argument to select the minimum log level of printed messages. - Adds --json-logging argument to switch to logging in JSON format. Co-authored-by: Byron Marohn <byron.marohn@intel.com> Co-authored-by: Luke Phillips <lucas.phillips@intel.com> Signed-off-by: Jayant Singh <jayant.singh@intel.com> Signed-off-by: Byron Marohn <byron.marohn@intel.com> Signed-off-by: Luke Phillips <lucas.phillips@intel.com> Signed-off-by: Kelby Madal-Hellmuth <kelby.madal-hellmuth@intel.com> Signed-off-by: Liz Lawrens <liz.lawrens@intel.com>	2023-06-02 20:13:22 +00:00
Aurélien Bombo	aab6030962	gha: aks: Extract `run` commands to a script Github Actions reads and runs workflow files from the main branch, rather than from the PR branch. This means that PRs that modify workflow files aren't being tested with the updated workflows coming from the PR, but rather with the old workflows from the main branch. AFAIK, this behavior isn't avoidable for workflow files (but is for other scripts). This makes it very hard to reliably test workflow changes before they're actually merged into main and leads to issues that we have to hotfix (see #6983, #6995). This PR aims to mitigate that by extracting the commands used in workflows to a separate script file. The way our CI is set up, those script files are read from the PR branch and thus changes would be reflected in the CI checks. Fixes: #6971 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-06-02 10:22:35 -07:00
Fupan Li	465f5a5ced	Merge pull request #4748 from lifupan/main_fix agent: fix the issue of exec hang with a background process	2023-06-02 10:46:43 +08:00
Chao Wu	2128fa2b4e	Merge pull request #7013 from xuejun-xj/xuejun/bugfix runtime-rs: bugfix: update Cargo.lock	2023-06-02 10:08:27 +08:00
Anastassios Nanos	e4eb664d27	runtime-rs: update rust to 1.69.0 We are probably hitting this: https://github.com/rust-lang/rust/issues/63033 Seems like it is worth a try to upgrade to 1.69.0 Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2023-06-01 21:40:56 +00:00
Anastassios Nanos	ed37715e05	runtime-rs: handle copy files when share_fs is not available In hypervisors that do not support virtiofs we have to copy files in the VM sandbox to properly setup the network (resolv.conf, hosts, and hostname). To do that, we construct the volume as before, with the addition of an extra variable that designates the path where the file will reside in the sandbox. In this case, we issue a `copy_file` agent request and we patch the spec to account for this change. Fixes: #6978 Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk> Signed-off-by: George Pyrros <gpyrros@nubificus.co.uk>	2023-06-01 21:40:56 +00:00
Fabiano Fidêncio	18b1a019d4	Merge pull request #7011 from jepio/fix-aks-cluster-name gha: aks: Use short SHA in cluster name	2023-06-01 15:56:20 +02:00
Fabiano Fidêncio	5ab42d87fb	Merge pull request #7009 from fidencio/topic/display-badge-for-the-publish-artefacts-job README: Display badge for the "Publish Artefacts" job and update the Kata Containers logo	2023-06-01 15:13:41 +02:00
Fabiano Fidêncio	eb1f44f111	Merge pull request #7007 from fidencio/topic/try-to-fix-ubuntu-k8s-key-not-available kata-deploy: Change how we get the Ubuntu k8s key	2023-06-01 15:13:22 +02:00
xuejun-xj	5f6fc3ed76	runtime-rs: bugfix: update Cargo.lock When dragonball updated the dbs-boot crate in commit `64c764c147`, the Cargo.lock in runtime-rs should also have been updated. Fixes: #6969 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>	2023-06-01 20:25:35 +08:00
Jeremi Piotrowski	1c6d22c803	gha: aks: Use short SHA in cluster name Full SHA is 40 characters, while AKS cluster name has a limit of 63. Trim the SHA to 12 characters, which is widely considered to be unique enough and is short enough to be used in the cluster name Fixes: #7010 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-06-01 14:03:53 +02:00
Fabiano Fidêncio	3c1f6d36dc	readme: Update Kata Containers logo Let's use the horizontal logo, as it makes better use of the space that we have. The logo comes from: https://openinfra.dev/brand/logos Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-06-01 12:25:13 +02:00
Fabiano Fidêncio	3886841131	readme: Add status badge for the "Publish Artefacts" job Let's start adding the status of our jobs as part of our main page, so folks monitoring those can easily check whether they're okay, or if someone has to be pinged about those. Fixes: #7008 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-06-01 12:25:01 +02:00
Fabiano Fidêncio	26f7520387	kata-deploy: Change how we get the Ubuntu k8s key The current method has been failing every now and then, and was reported on https://github.com/kubernetes/release/issues/2862. Ding poked me and suggested to do this change here, so here we go. :-) Fixes: #7006 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-06-01 12:10:30 +02:00
Fabiano Fidêncio	9ec2bca101	Merge pull request #7002 from fidencio/topic/follow-up-on-7000 gha: aks: Ensure host_os is used everywhere needed	2023-06-01 08:51:27 +02:00
Fabiano Fidêncio	8cbb80da66	Merge pull request #6929 from LindaYu17/dev kubernetes: add agnhost command in pod yaml	2023-06-01 08:39:58 +02:00
Fabiano Fidêncio	aebd3b47d9	gha: aks: Ensure host_os is used everywhere needed We added that to create the cluster name, but I forgot to add that to the part we get the k8s config file, or to the part where we delete the AKS cluster. Fixes: #6999 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-31 20:50:55 +02:00
Fabiano Fidêncio	e01f75723a	Merge pull request #6997 from singhwang/main main \| release: Standardize kata static file name	2023-05-31 15:22:30 +02:00
Fabiano Fidêncio	1ed917a079	Merge pull request #6989 from BbolroC/configurable-build-registry packaging: make BUILDER_REGISTRY configurable	2023-05-31 15:18:51 +02:00
Fabiano Fidêncio	de22783124	Merge pull request #7000 from fidencio/topic/use-a-different-name-for-the-ubuntu-and-mariner-aks-clusters gha: aks: Add the host_os as part of the aks cluster's name	2023-05-31 15:18:17 +02:00
Archana Shinde	141c26f307	Merge pull request #6985 from amshinde/kernel-tdx-build kernel: Modify build-kernel.sh to accommodate for changes in version.yaml	2023-05-31 01:57:20 -07:00
Fabiano Fidêncio	0c8282c224	gha: aks: Add the host_os as part of the aks cluster's name We need to do so, otherwise we'll create two clusters for testing Cloud Hypervisor with exactly the same name, one using Ubuntu, and one using Mariner. Fixes: #6999 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-31 05:20:04 +02:00
SinghWang	4b89a6bdac	release: Standardize kata static file name The string representing the architecture aarch64 and x86_64 need to be changed to arm64 and amd64 for the release. Fixes: #6986 Signed-off-by: SinghWang <wangxin_0611@126.com>	2023-05-31 10:24:45 +08:00
Fabiano Fidêncio	51e42a9972	Merge pull request #6995 from sprt/sprt/fix-mariner-ci gha: Fix Mariner cluster creation	2023-05-31 00:23:36 +02:00
Archana Shinde	9228815ad2	kernel: Modify build-kernel.sh to accommodate for changes in version.yaml There were recent changes for the tdx kernel in the version.yaml that are not currently accounted for in the build-kernel.sh script. Attempts to setup a tdx kernel to build local changes seemed to not download the tdx kernel. Instead the mainline kernel is downloaded which has no tdx-related changes. The version.yaml has a new entry for tdx kernel. Use that instead for setting up and downloading the tdx kernel. Fixes: #6984 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-05-30 13:44:58 -07:00
Aurélien Bombo	03027a7399	gha: Fix Mariner cluster creation While the Mariner Kata host is in preview, we need the `aks-preview` extension to enable the `--workload-runtime KataMshvVmIsolation` flag. Fixes: #6994 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-05-30 13:26:49 -07:00
Hyounggyu Choi	43e73bdef7	packaging: make BUILDER_REGISTRY configurable This PR is to make an environment variable `BUILDER_REGISTRY` configurable so that those who want to use their own registry for build can set up the registry. Fixes: #6988 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-05-30 14:40:02 +02:00
Fabiano Fidêncio	2e2d7243d2	Merge pull request #6983 from sprt/sprt/fix-gha-ci gha: Unbreak CI and fix cluster creation step	2023-05-30 12:58:10 +02:00
Zhongtao Hu	8b6cb2cd75	Merge pull request #6806 from xuejun-xj/xuejun/vcpuhotplug Dragonball: support vcpu hotplug on aarch64	2023-05-30 18:47:50 +08:00
xuejun-xj	ffe3157a46	dragonball: add arm64 patches for upcall The vcpu hotplug/hotunplug feature is implemented with upcall. This commit adds three patches to support the feature on aarch64. Patches: > 0005: add support of upcall on aarch64 > 0006: skip activate offline cpus' MSI interrupt > 0007: set the correct boot cpu number Fixes: #6010 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>	2023-05-30 15:51:08 +08:00
xuejun-xj	560442e6ed	dragonball: add vcpu_boot_onlined vector This commit implements the vcpu_boot_onlined vector in get_fdt_vm_info. "boot_enabled" means whether this vcpu should be onlined at first boot. It will be used by fdt, which writes an attribute called boot_enabled, and will be handled by the guest kernel to pass the correct cpu number to function "bringup_nonboot_cpus". Fixes: #6010 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>	2023-05-30 15:51:08 +08:00
xuejun-xj	e31772cfea	dragonball: add support resize_vcpu on aarch64 This commit adds support of resize_vcpu on aarch64. As kvm will check whether vgic is initialized when calling KVM_CREATE_VCPU ioctl, all the vcpu fds should be created before vm is booted. To support resizing vcpu scenario, we use max_vcpu_count for create_vcpus and setup_interrupt_controller interfaces. The SetVmConfiguration API will ensure max_vcpu_count >= boot_vcpu_count. Fixes: #6010 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>	2023-05-30 15:51:08 +08:00
xuejun-xj	64c764c147	dragonball: update dbs-boot to v0.4.0 dbs-boot-v0.4.0 refactors the create_fdt interface. It simplifies the parameters needed to be passed and abstracts them into three structs. By the way, it also reserves some interfaces for future features: numa passthrough and cache passthrough. Fixes: #6969 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>	2023-05-30 15:51:08 +08:00
xuejun-xj	fd9b414646	dragonball: update comment for init_microvm Rewrite the comment of Vm::init_microvm method for aarch64. Fixes cargo test warnings on aarch64. Fixes: #6969 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>	2023-05-30 15:51:08 +08:00
Aurélien Bombo	af16d3fca4	gha: Unbreak CI and fix cluster creation step This fixes the regression introduced by #6686 by properly injecting the `--os-sku mariner --workload-runtime KataMshvVmIsolation` flags. Error reference: https://github.com/kata-containers/kata-containers/actions/runs/5111460297/jobs/9188819103 Fixes: #6982 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-05-29 13:32:47 -07:00
Zhongtao Hu	099b4b0d0e	Merge pull request #6598 from Apokleos/sandbox_bind_mounts runtime-rs/sandbox_bindmounts: add support for sandbox bindmounts	2023-05-28 12:00:39 +08:00
Zhongtao Hu	cb962b0dc9	Merge pull request #6702 from Apokleos/directvol-common runtime-rs/kata-ctl: Enhancement of DirectVolumeMount.	2023-05-28 12:00:12 +08:00
Fabiano Fidêncio	44546a4a57	Merge pull request #6686 from sprt/sprt/mariner-ci gha: Create Mariner host as part of k8s tests	2023-05-27 05:34:28 +02:00
alex.lyn	5ddc4f94c5	runtime-rs/kata-ctl: Enhancement of DirectVolumeMount. Move the get_volume_mount_info to kata-types/src/mount.rs. If so, it becomes a common method of DirectVolumeMountInfo and reduces duplicated code. Fixes: #6701 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-05-26 11:18:29 +08:00
Fupan Li	25d2fb0fde	agent: fix the issue of exec hang with a background process When running an exec process in the background without a tty, the exec will hang and never terminate. For example: crictl -i <container id> sh -c 'nohup tail -f /dev/null &' Fixes: #4747 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2023-05-26 10:56:46 +08:00
Tim Zhang	5231aff90f	Merge pull request #6860 from lifupan/main netlink: Fix the issue of update_interface	2023-05-26 10:54:07 +08:00
Aurélien Bombo	4af4ced1aa	gha: Create Mariner host as part of k8s tests The current testing setup only supports running Kata on top of an Ubuntu host. This adds Mariner to the matrix of testable hosts for k8s tests, with Cloud Hypervisor as a VMM. As preparation for the upcoming PR that will change only the actual test code (rather than workflow YAMLs), this also introduces a new file `setup.sh` that will be used to set host-specific parameters at test run-time. Fixes: #6961 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-05-25 14:29:46 -07:00
Fabiano Fidêncio	59cefa719c	Merge pull request #6965 from fidencio/topic/gha-increase-aks-creation-waiting-time gha: Increase timeout for AKS jobs and give more time to start running the tests	2023-05-25 17:23:17 +02:00
Greg Kurz	837f7a2fe6	Merge pull request #6959 from beraldoleal/issues/6757 runtime: sending SIGKILL to qemu	2023-05-25 16:24:37 +02:00
alex.lyn	eee7aae71d	runtime-rs/sandbox_bindmounts: add support for sandbox bindmounts sandbox_bind_mounts supports kinds of mount patterns, for example: (1) "/path/to", default readonly mode. (2) "/path/to:ro", same as (1). (3) "/path/to:rw", readwrite mode. Both support configuration and annotation: (1)[runtime] sandbox_bind_mounts=["/path/to", "/path/to:rw", "/mnt/to:ro"] (2) annotation will also be supported, restricted as below: io.katacontainers.config.runtime.sandbox_bind_mounts = "/path/to /path/to:rw /mnt/to:ro" Fixes: #6597 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-05-25 20:00:25 +08:00
Fupan Li	62b2838962	Merge pull request #6846 from ZhangShuaiyi/DeviceMgrMethod dragonball: convert BlockDeviceMgr and VirtioNetDeviceMgr functions to methods	2023-05-25 18:11:44 +08:00
QuanweiZhou	377b7735f5	Merge pull request #6872 from justxuewei/rm-virtio-devices dragonball: Remove virtio-net and vsock devices gracefully	2023-05-25 17:08:36 +08:00
Fabiano Fidêncio	3d5d6eb361	Merge pull request #6958 from fidencio/topic/kata-deploy-improve-backup-restore kata-deploy: Improve shim backup / restore	2023-05-25 10:54:06 +02:00
Fabiano Fidêncio	3f0735a7e8	Merge pull request #6952 from stevenhorsman/git-clone-doc-fix doc: Update git commands	2023-05-25 10:36:08 +02:00
Fabiano Fidêncio	557b840814	gha: aks: Wait longer to start running the tests We're still facing issues related to the time taken to deploy the kata-deploy daemonset and starting to run the tests. Ideally, we should solve this with a readiness probe, and that's the approach we want to take in the future. However, for now, let's just make sure those tests are not in the way of the community. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-25 10:13:19 +02:00
Fabiano Fidêncio	c04c872c42	gha: aks: Increase the timeout time We've seen tests being aborted close to the end of the run due to the timeout. Let's increase it, to avoid hitting such cases again. Fixes: #6964 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-25 10:13:08 +02:00
GabyCT	8d98484230	Merge pull request #6926 from GabyCT/topic/fixtabsmerge kata-deploy: Fix indentation on kata deploy merge script	2023-05-24 14:55:51 -06:00
Fabiano Fidêncio	428041624a	kata-deploy: Improve shim backup / restore We're currently backing up and restoring all the possible shim files, but the default one ("containerd-shim-kata-v2"). Let's ensure this is also backed up and restored. Fixes: #6957 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-24 18:39:27 +02:00
Gabriela Cervantes	14c3f1e9f5	kata-deploy: Fix indentation on kata deploy merge script This PR fixes the indentation on the kata deploy merge script that instead of single spaces uses a tab. Fixes #6925 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-05-24 16:01:10 +00:00
Beraldo Leal	0e47cfc4c7	runtime: sending SIGKILL to qemu There is a race condition when virtiofsd is killed without finishing all the clients. Because of that, when a pod is stopped, QEMU detects virtiofsd is gone, which is legitimate. Sending a SIGTERM first before killing could introduce some latency during the shutdown. Fixes #6757. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-05-24 11:31:28 -04:00
stevenhorsman	6a0035e419	doc: Update git commands Fix bad migrations from `go get` to `git clone` and update the cloned directory path Fixes: #6951 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-05-24 13:16:48 +01:00
Fabiano Fidêncio	7c9faab523	Merge pull request #6947 from fidencio/topic/gha-release-fix-payload-tagging gha: release: Simplify the process for tagging the payload	2023-05-24 11:22:09 +02:00
Fabiano Fidêncio	f636c1f8a4	gha: release: Simplify the process for tagging the payload We previously were doing: * Create a new image on kata-deploy-ci using the commit hash of the latest tag * This was used to test on AKS, which is no longer needed as we test on AKS on every PR * Create a new image on kata-deploy using the release tag and "latest" or "stable", by tagging the kata-deploy-ci image accordingly As part of `cfe63527c5`, we broke the workflow described above, as in the first step we would save the PKG_SHA to be used in the second step, but that part ended up being removed. Anyways, this back and forth is not needed anymore and we can simplify the process by doing: * Create a new image on kata-deploy, using: - The tag received as ref from the event that triggered this workflow - "latest" or "stable" tag, depending on whether it's a stable release or not Fixes: #6946 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-24 08:54:43 +02:00
Fabiano Fidêncio	01827911f4	Merge pull request #6943 from fidencio/topic/gha-login-dont-specify-the-registry-if-using-docker-io gha: release: login-action: Don't specify docker.io registry	2023-05-24 07:33:12 +02:00
Fabiano Fidêncio	1c9ad4435a	Merge pull request #6939 from GabyCT/topic/updatenydus versions: Update nydus version to 2.2.1	2023-05-24 00:12:57 +02:00
Fabiano Fidêncio	d10c9be603	gha: release: login-action: Don't specify docker.io registry For some bizarre reason, the login-action will simply fail to authenticate to docker.io if it's specified as a registry. The way to proceed, instead, is to not specify any registry, as docker.io is used by default. Fixes: #6943 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-23 22:38:12 +02:00
Fabiano Fidêncio	9aae333343	Merge pull request #6871 from kmjohansen/bugfix/ptmx runtime: make debug console work with sandbox_cgroup_only	2023-05-23 22:24:51 +02:00
Fabiano Fidêncio	df77fefce8	Merge pull request #6941 from fidencio/3.2.0-alpha3-branch-bump # Kata Containers 3.2.0-alpha3	2023-05-23 22:21:03 +02:00
Fabiano Fidêncio	c54363114d	release: Kata Containers 3.2.0-alpha3 - release: Fix `docker/login-action` version `f3702268d` release: Fix `docker/login-action` version Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-23 18:39:16 +02:00
Fabiano Fidêncio	c7a77f980b	Merge pull request #6935 from fidencio/topic/release-fix-docker-login-action-version release: Fix `docker/login-action` version	2023-05-23 18:35:03 +02:00
Gabriela Cervantes	0b1c5ea5bb	versions: Update nydus version to 2.2.1 This PR updates the nydus version to 2.2.1. This change includes: nydus-image: fix a underflow issue in get_compressed_size() backport fix/feature to stable 2.2 [backport] contrib: upgrade runc to v1.1.5 service: add README for nydus-service nydus: fix a possible panic caused by SubCmdArgs::is_present Backports two bugfixes from master into stable/v2.2 [backport stable/v2.2] action: upgrade golangci-lint to v1.51.2 [backport] action: fix smoke test for branch pattern Fixes #6938 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-05-23 15:39:04 +00:00
Fabiano Fidêncio	f3702268d1	release: Fix `docker/login-action` version `docker/login-action@v3` does not exist and `docker/login-action@v2` should be used instead. Fixes: #6934 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-23 14:11:03 +02:00
Fabiano Fidêncio	c82ac57e30	Merge pull request #6930 from fidencio/3.2.0-alpha2-branch-bump # Kata Containers 3.2.0-alpha2	2023-05-23 13:50:58 +02:00
Linda Yu	433b5add4a	kubernetes: add agnhost command in pod yaml Fixes: #6928 Signed-off-by: Linda Yu <linda.yu@intel.com>	2023-05-23 18:11:45 +08:00
Fupan Li	170336517f	Merge pull request #5441 from openanolis/device_manager_dev runtime-rs: device manager for runtime-rs	2023-05-23 16:50:07 +08:00
Fabiano Fidêncio	fc09d0f5dd	release: Kata Containers 3.2.0-alpha2 - Fix cache for OVMF and rootfs-initrd (both x86_64) - Upgrade to Cloud Hypervisor v32.0 - osbuilder: Bump fedora image version - local-build: Standardise what's set for the local build scripts - gha: aks: Wait a little bit more before run the tests - docs: Update container network model url - gha: release: Fix s390x worklow - cache: Fix OVMF caching - gha: payload-after-push: Pass secrets down - tools: Fix arch bug `22154e0a3` cache: Fix OVMF tarball name for different flavours `b7341cd96` cache: Use "initrd" as `initrd_type` to build rootfs-initrd `b8ffcd1b9` osbuilder: Bump fedora image version `636539bf0` kata-deploy: Use apt-key.gpg from k8s.io `ae24dc73c` local-build: Standardise what's set for the local build scripts `35c3d7b4b` runtime: clh: Re-generate the client code `cfee99c57` versions: Upgrade to Cloud Hypervisor v32.0 `ad324adf1` gha: aks: Wait a little bit more before run the tests `191b6dd9d` gha: release: Fix s390x worklow `cfd8f4ff7` gha: payload-after-push: Pass secrets down `75330ab3f` cache: Fix OVMF caching `a89b44aab` tools: Fix arch bug `11a34a72e` docs: Update container network model url Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-23 09:06:44 +02:00
Fabiano Fidêncio	160d9aae4d	Merge pull request #6918 from fidencio/topic/fix-cache-x86_64-ovmf-rootfs-initrd Fix cache for OVMF and rootfs-initrd (both x86_64)	2023-05-22 21:34:56 +02:00
Zhongtao Hu	4719802c8d	runtime-rs: add virtio-blk-mmio add virtio-blk-mmio option for dragonball Fixes:#5375 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-05-23 00:58:10 +08:00
Zhongtao Hu	f9bded4484	runtime-rs: add devicetype enum use device type to store the config information for different kind of devices Fixes:#5375 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-05-23 00:55:35 +08:00
Zhongtao Hu	6800d30fdb	runtime-rs: remove device Support remove device after container stop Fixes:#5375 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-05-23 00:54:22 +08:00
Zhongtao Hu	f16012a1eb	runtime-rs: support linux device support linux device in runtime-rs Fixes:#5375 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-05-23 00:54:13 +08:00
Zhongtao Hu	fe9ec67644	runtime-rs: block volume support block volume in runtime-rs Fixes: #5375 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-05-23 00:54:04 +08:00
Zhongtao Hu	a8bfac90b1	runtime-rs: support block rootfs support devmapper for block rootfs Fixes: #5375 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-05-23 00:53:30 +08:00
Zhongtao Hu	b076d46db3	agent: handle hotplug virtio-mmio device As dragonball supports hotplugging virtio-mmio devices, we should handle it in the agent Fixes: #5375 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-05-23 00:53:22 +08:00
Zhongtao Hu	6e273d6ccc	runtime-rs: implement trait for vhost-user device add the trait implementation for vhost-user device Fixes:#5375 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-05-23 00:53:16 +08:00
Zhongtao Hu	cc9c915384	runtime-rs: implement trait for vfio device add the trait implementation for vfio device, Fixes:#5375 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-05-23 00:53:10 +08:00
Archana Shinde	2c9efbe04c	Merge pull request #6907 from likebreath/0519/clh_v32.0 Upgrade to Cloud Hypervisor v32.0	2023-05-22 09:53:05 -07:00
Zhongtao Hu	e4c5c74a75	runtime-rs: device manager Support device manager for runtime-rs, add block device handler for device manager Fixes:#5375 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-05-23 00:53:04 +08:00
Fabiano Fidêncio	22154e0a3b	cache: Fix OVMF tarball name for different flavours `75330ab3f9` tried to fix OVMF caching, but didn't consider that the "vanilla" OVMF tarball name is not "kata-static-ovmf-x86_64.tar.xz", but rather "kata-static-ovmf.tar.xz". The fact we missed that, led to the cache builds of OVMF failing, and the need to build the component on every single PR. Fixes: #6917 (hopefully for good this time). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-22 18:12:30 +02:00
Fabiano Fidêncio	b7341cd968	cache: Use "initrd" as `initrd_type` to build rootfs-initrd We've been defaulting to "", which would lead to a mismatch with the latest version from the cache, causing a miss, and finally having to build the rootfs-initrd as part of the tests, every single time. Fixes: #6917 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-22 18:12:30 +02:00
Fabiano Fidêncio	a28cefd538	Merge pull request #6924 from stevenhorsman/fedora-bump osbuilder: Bump fedora image version	2023-05-22 18:10:57 +02:00
Fabiano Fidêncio	7f350d3ec6	Merge pull request #6913 from fidencio/topic/gha-build-and-upload-payload-can-silently-fail local-build: Standardise what's set for the local build scripts	2023-05-22 18:04:51 +02:00
stevenhorsman	b8ffcd1b9b	osbuilder: Bump fedora image version - Swap out an EoL fedora image for the latest Fixes: #6923 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-05-22 13:48:00 +01:00
Fabiano Fidêncio	636539bf0c	kata-deploy: Use apt-key.gpg from k8s.io We're facing some issues to download / use the public key provided by google for installing kubernetes as part of the kata-deploy image. ``` The following signatures couldn't be verified because the public key is not available: NO_PUBKEY B53DC80D13EDEF05 Reading package lists... Done W: GPG error: https://packages.cloud.google.com/apt kubernetes-xenial InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY B53DC80D13EDEF05 E: The repository 'https://apt.kubernetes.io kubernetes-xenial InRelease' is not signed. N: Updating from such a repository can't be done securely, and is therefore disabled by default. N: See apt-secure(8) manpage for repository creation and user configuration details. ``` Let's work this around following the suggestion made by @dims, at: https://github.com/kubernetes/k8s.io/pull/4837#issuecomment-1446426585 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-22 11:06:01 +02:00
Fabiano Fidêncio	ae24dc73c1	local-build: Standardise what's set for the local build scripts We've a discrepancy on what's set along the scripts used to build the Kata Containers artefacts locally. Some of those were missing a way to easily debug them in case a failure happens, but one specific one (build-and-upload-payload.sh) could actually silently fail. All of those have been changed as part of this commit. Fixes: #6908 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-22 08:36:01 +02:00
Steve Horsman	a2e69c5b66	Merge pull request #6906 from fidencio/topic/gh-aks-wait-a-little-more-before-start-the-tests gha: aks: Wait a little bit more before run the tests	2023-05-20 08:01:20 +01:00
GabyCT	6796af511b	Merge pull request #6890 from GabyCT/topic/fixurlvirt docs: Update container network model url	2023-05-19 15:10:26 -06:00
Bo Chen	35c3d7b4bc	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v32.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Fixes: #6632 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-05-19 12:49:45 -07:00
Bo Chen	cfee99c577	versions: Upgrade to Cloud Hypervisor v32.0 Details of this release can be found in our roadmap project as iteration v32.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #6682 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-05-19 12:11:13 -07:00
Steve Horsman	98fa436627	Merge pull request #6904 from fidencio/topic/gha-fix-s390x-release-workflow gha: release: Fix s390x worklow	2023-05-19 19:00:57 +01:00
Steve Horsman	d5355dee20	Merge pull request #6898 from fidencio/topic/fix-ovmf-caching cache: Fix OVMF caching	2023-05-19 18:24:51 +01:00
Fabiano Fidêncio	dfa9301eac	Merge pull request #6900 from fidencio/topic/gha-fix-payload-after-push gha: payload-after-push: Pass secrets down	2023-05-19 17:23:00 +02:00
Fabiano Fidêncio	ad324adf1d	gha: aks: Wait a little bit more before run the tests `fa832f4709` increased the timeout, which helped a lot, mainly in the TEE machines. However, we're still seeing some failures here and there with the AKS tests. Let's bump it yet again and, hopefully, those errors when starting the tests will go away. Fixes: #6905 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-19 16:40:35 +02:00
Fabiano Fidêncio	191b6dd9dd	gha: release: Fix s390x worklow GitHub is warning us that: """ The workflow is not valid. In .github/workflows/release.yaml (Line: 21, Col: 11): Error from called workflow kata-containers/kata-containers/.github/workflows/release-s390x.yaml@d2e92c9ec993f56537044950a4673e50707369b5 (Line: 14, Col: 12): Job 'kata-deploy' depends on unknown job 'create-kata-tarball'. """ This is happening as we need to reference "build-kata-static-tarball-s390x" instead of "create-kata-tarball". Fixes: #6903 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-19 16:21:49 +02:00
Fabiano Fidêncio	cfd8f4ff76	gha: payload-after-push: Pass secrets down The "build-assets-${arch}" jobs need to have access to the secrets in order to log into the container registry in the cases where "push-to-registry", which is used to push the builder containers to quay.io, is set to "yes". Now that "build-assets-${arch}" pass the secrets down, we need to log into the container registry in the "build-kata-static-tarball-${arch}" files, in case "push-to-registry" is set to "yes". Fixes: #6899 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-19 15:00:06 +02:00
Fabiano Fidêncio	7abae8ee9c	Merge pull request #6896 from stevenhorsman/firecracker-arch-case tools: Fix arch bug	2023-05-19 14:26:14 +02:00
Fabiano Fidêncio	75330ab3f9	cache: Fix OVMF caching OVMF has been cached, but it's not been used from cache as the `version` set in the cached builds has always been empty. The reason for that is because we've been trying to look for `externals.ovmf.ovmf.version`, while we should be actually looking for `externals.ovmf.x86_64.version`. Setting `x86_64` as the OVMF_FLAVOUR would cause another bug, as the expected tarball name would then be `kata-static-x86_64.tar.xz`, instead of `kata-static-ovmf-x86_64.tar.xz`. With everything said, let's simplify the OVMF_FLAVOUR usage, by using it as it's passed, and only adapting the tarball name for the TDVF case, which is the abnormal one. Fixes: #6897 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-19 14:00:39 +02:00
Fabiano Fidêncio	d2e92c9ec9	Merge pull request #6892 from fidencio/3.2.0-alpha1-branch-bump # Kata Containers 3.2.0-alpha1	2023-05-19 12:31:33 +02:00
stevenhorsman	a89b44aabf	tools: Fix arch bug Fix mismatched case of `arch` Fixes: #6895 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-05-19 09:28:22 +01:00
Fabiano Fidêncio	f527f614c1	release: Kata Containers 3.2.0-alpha1 - runtime: Use static_sandbox_resource_mgmt=true for TEEs - update tokio dependency - resource-control: fix setting CPU affinities on Linux - runtime: use enable_vcpus_pinning from toml - gha: k8s: Make the tests more reliable - gha: Enable SEV-SNP tests on main - gha: tdx: Use the k3s overlay for kata-cleanup - runtime: Port sev package to main - gpu: Rename the last bits from `gpu` to `nvidia-gpu` - deploy: fix shell script error - ppc64le: switch virtiofsd from C to rust version - osbuilder: Fix indentation in rootfs.sh - virtcontainers/qemu_test.go: Improve coverage - agent: Add context to errors that may occur when AgentConfig file is … - virtcontainers/pkg/compatoci/: Improved coverage for for Kata 2.0 - kata-manager: Fix '-o' syntax and logic error - kata-ctl: Add the option to install kata-ctl to a user specified directory - runtime-rs: fix building instructions to use correct required Rust ve… - Dragonball: use LinuxBootConfigurator::write_bootparams - kata-deploy: Add http_proxy as part of the docker build - kata-deploy: Do not ship the kata tarball - kata-deploy: Build improvements - deploy: Fix arch in image tag - Revert "kata-deploy: Use readinessProbe to ensure everything is ready" - virtcontainers: Improved test coverage for fc.go from 4.6% to 18.5% - main \| release: Fix multi-arch publishing is not supported - cache: More fixes to nvidia-gpu kernels caching - runtime: remove overriding ARCH value by default for ppc64le - gha: Fix Body Line Length action flagging empty body commit messages - gha: Fix snap creation workflow - cache: Fix nvidia-gpu version - cache: Update the KERNEL_FLAVOUR list to include nvidia-gpu - packaging: Add SEV-SNP artifacts to main - docs: Mark snap installation method as unmaintained - packaging: Add sev artifacts to main - kata-ctl: add generic kvm check & unit test - Log-parser-rs - warning_fix: fix warnings when build with cargo-1.68.0 - cross-compile: 
Include documentation and configuration for cross-compile - runtime: Fix virtiofs fd leak - gpu: cold plug VFIO devices - pkg/signals: Improved test coverage 60% to 100% - virtcontainers/persist: Improved test coverage 65% to 87.5% - virtcontainers/clh_test.go: improve unit test coverage - virtcontainers/factory: Improved test coverage - gha: Also run k8s tests on qemu-snp - gha: sev: fix for kata-deploy error - gha: Also run k8s tests on qemu-sev - Implement the "kata-ctl env" command - runtime-rs: support keep_abnormal in toml config - gpu: Build and Ship an GPU enabled Kernel - kata-ctl: checks for kvm, kvm_intel modules loaded - osbuilder: Fix D-Bus enabling in the dracut case - snap: fix docker start fail issue - kata-manager: Fix containerd download - agent: Fix ut issue caused by fd double closed - Bump ttrpc to 0.7.2 and protobuf to 3.2.0 - gpu: Add GPU enabled confguration and runtime - gpu: Do not pass-through PCI (Host) Bridges - cache-components: Fix caching of TDVF and QEMU for TDX - gha: tdx: Ensure kata-deploy is removed after the tests run - versions: Upgrade to Cloud Hypervisor v31.0 - osbuilder: Enable dbus in the dracut case - runtime: Don't create socket file in /run/kata - nydus_rootfs/prefetch_files: add prefetch_files for RAFS - runtime-rs/virtio-fs: add support extra handler for cache mode. 
- runtime-rs: enable nerdctl to setup cni plugin - tdx: Add artefacts from the latest TDX tools release into main - runtime: support non-root for clh - gha: ci-on-push: Run k8s tests with dragonball - rustjail: Use CPUWeight with systemd and CgroupsV2 - gha: k8s-on-aks: {create,delete} AKS must be a coded-in step - docs: update the rust version from version.yaml - gha: k8s-on-aks: Set {create,delete}_aks as steps - gha: k8s-on-aks: Fix cluster name - gha: Also run k8s tests on AKS with dragonball - gha: Only push images to registry after merging a PR - gha: aks: Use D4s_v5 instance - tools: Avoid building the kernel twice - rustjail: Fix panic when cgroup manager fails - runtime: add filter metrics with specific names - gha: Use ghcr.io for the k8s CI - GHA \|Switch "kubernetes tests" from jenkins to GitHub actions - docs: Update CNM url in networking document - kata-ctl: add function to get platform protection. `f6e1b1152` agent: update tokio dependency `4cb83dc21` kata-ctl: update tokio dependency `df615ff25` runk: update tokio dependency `ca6892ddb` runtime-rs: update tokio dependency `ca1531fe9` runtime: Use static_sandbox_resource_mgmt=true for TEEs `fa832f470` gha: k8s: Make the tests more reliable `cbb9fe8b8` config: Use standard OVMF with SEV `724437efb` kata-deploy: add kata-qemu-sev runtimeclass `521dad2a4` Tests: skip CPU constraints test on SEV and SNP `72308ddb0` gha: ci-on-push: Don't skip tests for SEV `da0f92cef` gha: ci-on-push: Don't skip tests for SEV-SNP `12f43bea0` gha: tdx: Use the k3s overlay for kata-cleanup `1a3f8fc1a` deploy: fix shell script error `87cb98c01` osbuilder: Fix indentation in rootfs.sh `c5a59caca` ppc64le: switch virtiofsd from C to rust version `bfdf0144a` versions: Bump virtiofsd to 1.6.1 `dd7562522` runtime: pkg/sev: Add kbs utility package for SEV pre-attestation `05de7b260` runtime: Add sev package `3a9d3c72a` gpu: Rename the last bits from `gpu` to `nvidia-gpu` `4cde844f7` local-build: Fix kernel-nvidia-gpu target name 
`593840e07` kata-ctl: Allow INSTALL_PATH= to be specified `bdb75fb21` runtime: use enable_vcpus_pinning from toml `20cb87508` virtcontainers/qemu_test.go: Improve test coverage `b9a1db260` kata-deploy: Add http_proxy as part of the docker build `3e85bf5b1` resource-control: fix setting CPU affinities on Linux `5f3f844a1` runtime-rs: fix building instructions with respect to required Rust version `777c3dc8d` kata-deploy: Do not ship the kata tarball `50cc9c582` tests: Improve coverage for virtcontainers/pkg/compatoci/ for Kata 2.0 `136e2415d` static-build: Download firecracker instead of building it `3bf767cfc` static-build: Adjust ARCH for nydus `ac88d34e0` static-build: Use relased binary for CLH (aarch64) `73913c8eb` kata-manager: Fix '-o' syntax and logic error `2856d3f23` deploy: Fix arch in image tag `e8f81ee93` Revert "kata-deploy: Use readinessProbe to ensure everything is ready" `cfe63527c` release: Fix multi-arch publishing is not supported `197c33651` Dragonball: use LinuxBootConfigurator::write_bootparams to writes the boot parameters into guest memory. 
`4d17ea4a0` cache: Fix nvidia-snp caching version `a133fadbf` cache: Fix nvidia-gpu-tdx-experimental cache URL `b9990c201` cache: Fix nvidia-gpu version `c9bf7808b` cache: Update the KERNEL_FLAVOUR list to include nvidia-gpu `3665b4204` gpu: Rename `gpu` targets to `nvidia-gpu` `2c90cac75` local-build: fixup alphabetization `4da6eb588` kata-deploy: Add qemu-snp shim `14dd05375` kata-deploy: add kata-qemu-snp runtimeclass `0bb37bff7` config: Add SNP configuration `af7f2519b` versions: update SEV kernel description `dbcc3b5cc` local-build: fix default values for OVMF build `b8bbe6325` gha: build OVMF for tests and release `cf0ca265f` local-build: Add x86_64 OVMF target `db095ddeb` cache: add SNP flavor to comments `f4ee00576` gha: Build and ship QEMU for SNP `7a58a91fa` docs: update SNP guide `879333bfc` versions: update SNP QEMU version `38ce4a32a` local-build: add support to build QEMU for SEV-SNP `5f8008b69` kata-ctl: add unit test for kvm check `a085a6d7b` kata-ctl: add generic kvm check `772d4db26` gha: Build and ship SEV initrd `45fa36692` gha: Build and ship SEV OVMF `4770d3064` gha: Build and ship SEV kernel. 
`fb9c1fc36` runtime: Add qemu-sev config `813e4c576` runtimeClasses: add sev runtime class `af18806a8` static-build: Add caching support to sev ovmf `76ae7a3ab` packaging: adding caching capability for kernel `12c5ef902` packaging: add support to build OVMF for SEV `b87820ee8` packaging: add support to build initrd for sev `e1f3b871c` docs: Mark snap installation method as unmaintained `022a33de9` agent: Add context to errors when AgentConfig file is missing `b0e6a094b` packaging: Add sev kernel build capability `a4c0303d8` virtcontainers: Fixed static checks for improved test coverage for fc.go `8495f830b` cross-compile: Include documentation and configuration for cross-compile `13d7f39c7` gpu: Check for VFIO port assignments `6594a9329` tools: made log-parser-rs `03a8cd69c` virtcontainers: Improved test coverage for fc.go from 4.6% to 18.5% `9e2b7ff17` gha: sev: fix for kata-deploy error `5c9246db1` gha: Also run k8s tests on qemu-snp `c57a44436` gha: Add the ability to test qemu-snp `406419289` env: Utilize arch specific functionality to get cpu details `fb40c71a2` env: Check for root privileges `1016bc17b` config: Add api to fetch config from default config path `b908a780a` kata-env: Pass cmd option for file path `b1920198b` config: Workaround the way agent and hypervisor configs are fetched `f2b2621de` kata-env: Implement the kata-env command. `c849bdb0a` gha: Also run k8s tests on qemu-sev `6bf1fc605` virtcontainers/factory: Improved test coverage `0d49ceee0` gha: Fix snap creation workflow warnings `138ada049` gpu: Cold Plug VFIO toml setting `defb64334` runtime: remove overriding ARCH value by default for ppc64le `f7ad75cb1` gpu: Cold-plug extend the api.md `0fec2e698` gpu: Add cold-plug test `f2ebdd81c` utils: Get rid of spurious print statement left behind. 
`9a94f1f14` make: Export VERSION and COMMIT `2f81f48da` config: Add file under /opt as another location to look for the config `07f7d17db` config: Make the pipe_size field optional `68f635773` config: Make function to get the default conf file public `7565b3356` kata-ctl: Implement Display trait for GuestProtection enum `94a00f934` utils: Make certain constants in utils.rs public `572b338b3` gitignore: Ignore .swp and .swo editor backup files `376884b8a` cargo: Update version of clap to 4.1.13 `17daeb9dd` warning_fix: fix warnings when build with cargo-1.68.0 `521519d74` gha: Add the ability to test qemu-sev `205909fbe` runtime: Fix virtiofs fd leak `5226f15c8` gha: Fix Body Line Length action flagging empty body commit messages `0f45b0faa` virtcontainers/clh_test.go: improve unit test coverage `dded731db` gpu: Add OVMF setting for MMIO aperture `2a830177c` gpu: Add fwcfg helper function `131f056a1` gpu: Extract VFIO Functions to drivers `c8cf7ed3b` gpu: Add ColdPlug of VFIO devices with devManager `e2b5e7f73` gpu: Add Rawdevices to hypervisor `6107c32d7` gpu: Assign default value to cold-plug `377ebc2ad` gpu: Add configuration option for cold-plug VFIO `c18ceae10` gpu: Add new struct PCIePort `9c38204f1` virtcontainers/persist: Improved test coverage 65% to 87.5% `1c1ee8057` pkg/signals: Improved test coverage 60% to 100% `cc8ea3232` runtime-rs: support keep_abnormal in toml config `96e8470db` kata-manager: Fix containerd download `432d40744` kata-ctl: checks for kvm, kvm_intel modules loaded `b1730e4a6` gpu: Add new kernel build option to usage() `3e7b90226` osbuilder: Fix D-Bus enabling in the dracut case `53c749a9d` agent: Fix ut issue caused by fd double closed `2e3f19af9` agent: fix clippy warnings caused by protobuf3 `4849c56fa` agent: Fix unit test issue cuased by protobuf upgrade `0a582f781` trace-forwarder: remove unused crate protobuf `73253850e` kata-ctl: remove unused crate ttrpc `76d2e3054` agent-ctl: Bump ttrpc from 0.6.0 to 0.7.1 `eb3d20dcc` 
protocols: Add ut for Serde `59568c79d` protocols: add support for Serde `a6b4d92c8` runtime-rs: Bump ttrpc from 0.6.0 to 0.7.1 `ac7c63bc6` gpu: Add containerd shim for qemu-gpu `a0cc8a75f` gpu: Add a kube runtime class `a81fff706` gpu: Adding a GPU enabled configuration `8af6fc77c` agent: Bump ttrpc from 0.6.0 to 0.7.1 `009b42dbf` protocols: Fix unit test `392732e21` protocols: Bump ttrpc from 0.6.0 to 0.7.1 `f4f958d53` gpu: Do not pass-through PCI (Host) Bridges `825e76948` gpu: Add GPU support to default kernel without any TEE `e4ee07f7d` gpu: Add GPU TDX experimental kernel `a1272bcf1` gha: tdx: Fix typo overlay -> overlays `3fa0890e5` cache-components: Fix TDVF caching `80e3a2d40` cache-components: Fix TDX QEMU caching `87ea43cd4` gpu: Add configuration fragment `aca6ff728` gpu: Build and Ship an GPU enabled Kernel `dc662333d` runtime: Increase the dial_timeout `eb1762e81` osbuilder: Enable dbus in the dracut case `f478b9115` clh: tdx: Update timeouts for confidential guest `3b76abb36` kata-deploy: Ensure node is ready after CRI Engine restart `5ec9ae0f0` kata-deploy: Use readinessProbe to ensure everything is ready `ea386700f` kata-deploy: Update podOverhead for TDX `e31efc861` gha: tdx: Use the k3s overlay `542bb0f3f` gha: tdx: Set KUBECONFIG env at the job level `d7fdf19e9` gha: tdx: Delete kata-deploy after the tests finish `da35241a9` tests: k8s: Skip k8s-cpu-ns when testing TDX `db2cac34d` runtime: Don't create socket file in /run/kata `6d315719f` snap: fix docker start fail issue `e4b3b0887` gpu: Add proper CONFIG_LOCALVERSION depending on TEE `69ba2098f` runtime-rs: remove network entities and netns `b31f103d1` runtime-rs: enable nerdctl cni plugin `69d7a959c` gha: ci-on-push: Run tests on TDX `5a0727ecb` kata-deploy: Ship kata-qemu-tdx runtimeClass `98682805b` config: Add configuration for QEMU TDX `3e1580019` govmm: Directly pass the firmware using -bios with TDX `3c5ffb0c8` govmm: Set "sept-ve-disable=on" `ed145365e` runtime/qemu: Drop 
"kvm-type=tdx" `25b3cdd38` virtcontainers: Drop check for the `tdx` CPU flag `01bdacb4e` virtcontainers: Also check /sys/firmwares/tdx for TDX `9feec533c` cache: Add ability to cache OVMF `ce8d98251` gha: Build and ship the OVMF for TDX `39c3fab7b` local-build: Add support to build OVMF for TDX `054174d3e` versions: Bump OVMF for TDX `800fb49da` packaging: Add get_ovmf_image_name() helper `fbf03d7ac` cache: Document kernel-tdx-experimental `5d79e9696` cache: Add a space to ease the reading of the kernel flavours `6e4726e45` cache: Fix typos `fc22ed0a8` gha: Build and ship the Kernel for TDX `502844ced` local-build: Add support to build Kernel for TDX `b2585eecf` local-build: Avoid code duplication building the kernel `f33345c31` versions: Update Kernel TDX version `20ab2c242` versions: Move Kernel TDX to its own experimental entry `3d9ce3982` cache: Allow specifying the QEMU_FLAVOUR `33dc6c65a` gha: Build and ship QEMU for TDX `eceaae30a` local-build: Add support to build QEMU for TDX `f7b7c187e` static-build: Improve qemu-experimental build script `3018c9ad5` versions: Update QEMU TDX version `800ee5cd8` versions: Move QEMU TDX to its own experimental entry `1315bb45f` local-build: Add dragonball kernel to the `all` target `73e108136` local-build: Rename non vanilla kernel build functions `1d851b4be` local-build: Cosmetic changes in build targets `49ce685eb` gha: k8s-on-aks: Always delete the AKS cluster `e2a770df5` gha: ci-on-push: Run k8s tests with dragonball `d1f550bd1` docs: update the rust version from versions.yaml `f3595e48b` nydus_rootfs/prefetch_files: add prefetch_files for RAFS `3bfaafbf4` fix: oci hook `c1fbaae8d` rustjail: Use CPUWeight with systemd and CgroupsV2 `375187e04` versions: Upgrade to Cloud Hypervisor v31.0 `79f3047f0` gha: k8s-on-aks: {create,delete} AKS must be a coded-in step `2f35b4d4e` gha: ci-on-push: Only run on `main` branch `e7bd2545e` Revert "gha: ci-on-push: Depend on Commit Message Check" `0d96d4963` Revert "gha: ci-on-push: 
Adjust to using workflow_run" `c7ee45f7e` Revert "gha: ci-on-push: Adapt chained jobs to workflow_run" `5d4d72064` Revert "gha: k8s-on-aks: Fix cluster name" `13d857a56` gha: k8s-on-aks: Set {create,delete}_aks as steps `dc6569dbb` runtime-rs/virtio-fs: add support extra handler for cache mode. `85cc5bb53` gha: k8s-on-aks: Fix cluster name `1688e4f3f` gha: aks: Use D4s_v5 instance `108d80a86` gha: Add the ability to also test Dragonball `2550d4462` gha: build-kata-static-tarball: Only push to registry after merge `e81b8b8ee` local-build: build-and-upload-payload is not quay.io specific `13929fc61` gha: publish-kata-deploy-payload: Improve registry login `41026f003` gha: payload-after-push: Pass registry / repo as inputs `7855b4306` gha: ci-on-push: Adapt chained jobs to workflow_run `3a760a157` gha: ci-on-push: Adjust to using workflow_run `a159ffdba` gha: ci-on-push: Depend on Commit Message Check `8086c75f6` gha: Also run k8s tests on AKS with dragonball `fe86c08a6` tools: Avoid building the kernel twice `3215860a4` gha: Set ci-on-push to run on `pull_request_target` `d17dfe4cd` gha: Use ghcr.io for the k8s CI `b661e0cf3` rustjail: Add anyhow context for D-Bus connections `60c62c3b6` gha: Remove kata-deploy-test.yaml `43894e945` gha: Remove kata-deploy-push.yaml `cab9ca043` gha: Add a CI pipeline for Kata Containers `53b526b6b` gha: k8s: Add snippet to run k8s tests on aks clusters `c444c24bc` gha: aks: Add snippets to create / delete aks clusters `11e0099fb` tests: Move k8s tests to this repo `73be4bd3f` gha: Update actions for release.yaml `d38d7fbf1` gha: Remove code duplication from release.yaml `56331bd7b` gha: Split payload-after-push-*.yaml `a552a1953` docs: Update CNM url in networking document `7796e6ccc` rustjail: Fix minor grammatical error in function name `41fdda1d8` rustjail: Do not unwrap potential error with cgroup manager `a914283ce` kata-ctl: add function to get platform protection. 
`0f7351556` runtime: add filter metrics with specific names `cbe6ad903` runtime: support non-root for clh `d3bb25418` utils: Add function to check vhost-vsock Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-19 09:26:36 +02:00
Fabiano Fidêncio	0364620844	Merge pull request #6819 from fidencio/topic/use-static-sandbox-resource-mgmt-for-TEEs runtime: Use static_sandbox_resource_mgmt=true for TEEs	2023-05-18 22:38:31 +02:00
Fabiano Fidêncio	2ea8acaaa5	Merge pull request #6882 from bergwolf/github/tokio update tokio dependency	2023-05-18 20:35:16 +02:00
Krister Johansen	eff6ed2d5f	runtime: make debug console work with sandbox_cgroup_only If a hypervisor debug console is enabled and sandbox_cgroup_only is set, the hypervisor can fail to open /dev/ptmx, which prevents the sandbox from launching. This is caused by the absence of a device cgroup entry to allow access to /dev/ptmx. When sandbox_cgroup_only is not set, the hypervisor inherits the default unrestricted device cgroup, but with it enabled it runs into allow / deny list restrictions. Fix by adding an allowlist entry for /dev/ptmx when debug is enabled, sandbox_cgroup_only is true, and /dev/ptmx is not already in the list of devices. Fixes: #6870 Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>	2023-05-18 10:36:24 -07:00
Gabriela Cervantes	11a34a72e2	docs: Update container network model url This PR updates the container network model url that is part of the virtcontainers documentation. Fixes #6889 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-05-18 15:08:08 +00:00
Peng Tao	f6e1b1152c	agent: update tokio dependency To 1.28.1 to bring in the latest fixes. Fixes: #6881 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-05-18 09:36:06 +00:00
Shuaiyi Zhang	c477ac551f	dragonball: Convert VirtioNetDeviceMgr function to method Convert VirtioNetDeviceMgr::insert_device and VirtioNetDeviceMgr::update_device_ratelimiters to method. Fixes: #6880 Signed-off-by: Shuaiyi Zhang <zhang_syi@qq.com>	2023-05-18 16:57:01 +08:00
Shuaiyi Zhang	4659facb74	dragonball: Convert BlockDeviceMgr function to method Convert BlockDeviceMgr::insert_device, BlockDeviceMgr::remove_device and BlockDeviceMgr::update_device_ratelimiters to method. Fixes: #6880 Signed-off-by: Shuaiyi Zhang <zhang_syi@qq.com>	2023-05-18 16:56:49 +08:00
Peng Tao	4cb83dc219	kata-ctl: update tokio dependency Update to 1.28.1 To pick up the latest fixes. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-05-18 08:25:13 +00:00
Peng Tao	df615ff252	runk: update tokio dependency Update to 1.28.1 to pick up latest fixes. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-05-18 08:24:41 +00:00
Peng Tao	ca6892ddb1	runtime-rs: update tokio dependency Unify it to the latest 1.28.1 version. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-05-18 08:18:22 +00:00
Fabiano Fidêncio	3a4b924226	Merge pull request #6833 from rye-stripe/bugfix/vcpu-pinning resource-control: fix setting CPU affinities on Linux	2023-05-18 08:12:39 +02:00
Xuewei Niu	ee6deef09d	dragonball: Remove virtio-net and vsock devices gracefully This MR implements removing virtio-net and virtio-vsock devices gracefully when shutting down VMM. Fixes: #6684 Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-05-18 12:11:20 +08:00
Fabiano Fidêncio	e762f70920	Merge pull request #6838 from rye-stripe/bugfix/use-enable-vcpus-pinning-from-toml runtime: use enable_vcpus_pinning from toml	2023-05-17 21:30:44 +02:00
Fabiano Fidêncio	ca1531fe9d	runtime: Use static_sandbox_resource_mgmt=true for TEEs When this option is enabled the runtime will attempt to determine the appropriate sandbox size (memory, CPU) before booting the virtual machine. As TEEs do not support memory and CPU hotplug, this approach must be used. Fixes: #6818 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-17 19:21:52 +02:00
Fabiano Fidêncio	851b97fa51	Merge pull request #6866 from fidencio/topic/gha-improve-actions gha: k8s: Make the tests more reliable	2023-05-17 19:19:18 +02:00
Fabiano Fidêncio	8ce14e709a	Merge pull request #6810 from fitzthum/snp-enable gha: Enable SEV-SNP tests on main	2023-05-17 15:29:54 +02:00
Greg Kurz	206df04b99	Merge pull request #6858 from fidencio/topic/gha-tdx-fix-cleanup gha: tdx: Use the k3s overlay for kata-cleanup	2023-05-17 15:04:56 +02:00
Wainer Moschetta	259158f1c3	Merge pull request #6789 from dubek/add-sev-package runtime: Port sev package to main	2023-05-17 10:02:19 -03:00
Fabiano Fidêncio	fa832f4709	gha: k8s: Make the tests more reliable Whether we like it or not, every now and then we'll have to deal with flaky tests, and our tests using GHA are not exempt from that fact. With this simple commit, we're trying to improve the reliability of the tests on a few different fronts: * Giving enough time for the script used by kata-deploy to be executed * We've hit issues as the kata-deploy pod is considered "Ready" at the moment it starts running, not when it finishes the needed setup. We should also be looking at how to solve this on the kata-deploy side but, for now, let's ensure our tests do not break with the current kata-deploy behavior. * Merging the "Deploy kata-deploy" and "Run tests" steps * We've hit issues re-running tests and seeing even more failures than the ones we're trying to debug, as a step will simply be taken as succeeded as part of the re-run, in case it was successfully executed as part of the first run. This causes issues with the kata-deploy deployment, as the tests would start running before even having the node set up for running Kata Containers. Fixes: #6865 #6649 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-17 13:38:08 +02:00
Tobin Feldman-Fitzthum	cbb9fe8b81	config: Use standard OVMF with SEV The AmdSev firmware package should be used with measured direct boot. If the expected hashes are not injected into the firmware binary by the VMM, the guest will not boot. This is required for security. Currently the main branch does not have the extended shim support for SEV, which tells the VMM to inject the expected hashes. We ship the standard OVMF package to use with SNP, so let's switch SEV to that for now. This will need to be changed back when shim support for SEV(-ES) is added to main. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2023-05-17 11:36:04 +02:00
Tobin Feldman-Fitzthum	724437efb3	kata-deploy: add kata-qemu-sev runtimeclass In order to populate containerd config file with support for SEV, we need to add the qemu-sev shim to the kata-deploy script. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2023-05-17 11:36:02 +02:00
Tobin Feldman-Fitzthum	521dad2a47	Tests: skip CPU constraints test on SEV and SNP Currently Kata does not support memory / CPU hotplug for SEV or SEV-SNP so we need to skip tests that rely on it. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2023-05-17 11:35:13 +02:00
Tobin Feldman-Fitzthum	72308ddb07	gha: ci-on-push: Don't skip tests for SEV Now that SEV artifacts are built by GHA, remove conditional that skips tests when using qemu-sev. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2023-05-17 11:35:13 +02:00
Tobin Feldman-Fitzthum	da0f92cef8	gha: ci-on-push: Don't skip tests for SEV-SNP Now that we have SNP artifacts in place and they are built via gha, remove the condition that skips the tests for SNP. Fixes: #6809 Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2023-05-17 11:35:13 +02:00
fupan	2bda92face	netlink: Fix the issue of update_interface When updating an interface, there may be an existing interface whose name is the same as the requested new name, so the update would fail with an error saying the interface name already exists. Thus we should rename the existing interface to a temporary name and swap it with the previous interface name last. Fixes: #6842 Signed-off-by: fupan <fupan.lfp@antgroup.com>	2023-05-17 16:45:49 +08:00
Fabiano Fidêncio	12f43bea0f	gha: tdx: Use the k3s overlay for kata-cleanup As the TDX CI runs on k3s, we must ensure the cleanup, as already done for the deploy, used the k3s overlay. Fixes: #6857 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-17 09:50:29 +02:00
Fabiano Fidêncio	9630c13ac0	Merge pull request #6845 from fidencio/topic/yet-more-nvidia-gpu-naming-fixes gpu: Rename the last bits from `gpu` to `nvidia-gpu`	2023-05-17 09:05:12 +02:00
Steve Horsman	e4a458035c	Merge pull request #6852 from stevenhorsman/container-image-arch-consistency deploy: fix shell script error	2023-05-17 08:01:39 +01:00
Amulya Meka	3ccc29030d	Merge pull request #6780 from Amulyam24/rust-virtfs ppc64le: switch virtiofsd from C to rust version	2023-05-17 09:36:28 +05:30
GabyCT	e0e46de12d	Merge pull request #6849 from GabyCT/topic/fixtabs osbuilder: Fix indentation in rootfs.sh	2023-05-16 16:47:09 -06:00
stevenhorsman	1a3f8fc1a2	deploy: fix shell script error - Remove local introduced by bad copy-paste Fixes: #6814 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-05-16 19:30:32 +01:00
Salvador Fuentes	b76058c979	Merge pull request #6721 from nedsouza/virtcontainers-qemu-go-coverage virtcontainers/qemu_test.go: Improve coverage	2023-05-16 11:11:43 -06:00
Feng Wang	ebc8e8e2fd	Merge pull request #6773 from jepio/agent-config-error-context agent: Add context to errors that may occur when AgentConfig file is …	2023-05-16 09:21:34 -07:00
Gabriela Cervantes	87cb98c01d	osbuilder: Fix indentation in rootfs.sh This PR replaces single spaces with tabs in order to fix the indentation of the rootfs script. Fixes #6848 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-05-16 15:30:50 +00:00
James O. D. Hunt	a96fcfd5be	Merge pull request #6735 from nedsouza/258/tests-coverage-compatoci virtcontainers/pkg/compatoci/: Improved coverage for Kata 2.0	2023-05-16 15:36:35 +01:00
Amulyam24	c5a59caca1	ppc64le: switch virtiofsd from C to rust version We have been using the C version of virtiofsd on ppc64le. Now that the issues with rust virtiofsd have been fixed, let's switch to it. Fixes: #4259 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-05-16 14:46:19 +02:00
Amulyam24	bfdf0144aa	versions: Bump virtiofsd to 1.6.1 virtiofsd v1.6.1 has been released with the fixes required for running successfully on ppc64le. Fixes: #4259 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-05-16 14:46:16 +02:00
Dov Murik	dd7562522a	runtime: pkg/sev: Add kbs utility package for SEV pre-attestation Supports both online and offline modes of interaction with simple-kbs for SEV/SEV-ES confidential guests. Fixes: #6795 Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>	2023-05-16 15:27:32 +03:00
Dov Murik	05de7b2607	runtime: Add sev package The sev package provides utilities for launching AMD SEV and SEV-ES confidential guests. Fixes: #6795 Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>	2023-05-16 15:27:32 +03:00
Fabiano Fidêncio	3a9d3c72aa	gpu: Rename the last bits from `gpu` to `nvidia-gpu` Let's specifically name the `gpu` runtime class as `nvidia-gpu`. By doing this we keep the door open and ease the life of the next vendor adding GPU support for Kata Containers. Fixes: #6553 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-16 13:47:52 +02:00
Fabiano Fidêncio	4cde844f70	local-build: Fix kernel-nvidia-gpu target name It must have `-tarball` as part of its name. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-16 13:34:52 +02:00
Archana Shinde	8d10d157b3	Merge pull request #6823 from jodh-intel/utils-kata-manager-containerd-fix kata-manager: Fix '-o' syntax and logic error	2023-05-15 21:44:35 -07:00
Bin Liu	47a02dcc7f	Merge pull request #6767 from ngpatel6/Issue-5403 kata-ctl: Add the option to install kata-ctl to a user specified directory	2023-05-16 10:43:40 +08:00
Chao Wu	911d8a5a7f	Merge pull request #6804 from pmores/fix-rust-version-in-docs runtime-rs: fix building instructions to use correct required Rust ve…	2023-05-16 10:14:05 +08:00
Bin Liu	2cd2d02d1f	Merge pull request #6812 from ZhangShuaiyi/dev/write_bootparams Dragonball: use LinuxBootConfigurator::write_bootparams	2023-05-16 09:54:41 +08:00
GabyCT	3d8185863d	Merge pull request #6835 from GabyCT/topic/buildkataproxy kata-deploy: Add http_proxy as part of the docker build	2023-05-15 16:15:27 -06:00
Narendra Patel	593840e075	kata-ctl: Allow INSTALL_PATH= to be specified Update the kata-ctl install rule to allow it to be installed to a given directory The Makefile was updated to use an INSTALL_PATH variable to track where the kata-ctl binary should be installed. If the user doesn't specify anything, then it uses the default path that cargo uses. Otherwise, it will install it in the directory that the user specified. The README.md file was also updated to show how to use the new option. Fixes #5403 Co-authored-by: Cesar Tamayo <cesar.tamayo@intel.com> Co-authored-by: Kevin Mora Jimenez <kevin.mora.jimenez@intel.com> Co-authored-by: Narendra Patel <narendra.g.patel@intel.com> Co-authored-by: Ray Karrenbauer <ray.karrenbauer@intel.com> Co-authored-by: Srinath Duraisamy <srinath.duraisamy@intel.com> Signed-off-by: Narendra Patel <narendra.g.patel@intel.com>	2023-05-15 17:21:49 -04:00
Peteris Rudzusiks	bdb75fb21e	runtime: use enable_vcpus_pinning from toml Set the default value of runtime's EnableVCPUsPinning to value read from .toml. Fixes: #6836 Signed-off-by: Peteris Rudzusiks <rye@stripe.com>	2023-05-15 21:41:20 +02:00
Tamas K Lengyel	20cb875087	virtcontainers/qemu_test.go: Improve test coverage Rework TestQemuCreateVM routine to be a table driven test with various config variations passed to it. After CreateVM a handful of additional functions are exercised to improve code-coverage. Also add partial coverage for StartVM routine. Currently improving from 19.7% to 35.7% Credit PR to Hackathon Team3 Fixes: #267 Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>	2023-05-15 15:26:35 -04:00
Fabiano Fidêncio	da877a603d	Merge pull request #6829 from fidencio/topic/kata-deploy-remove-tarball-from-payload-image kata-deploy: Do not ship the kata tarball	2023-05-15 19:01:14 +02:00
Gabriela Cervantes	b9a1db2601	kata-deploy: Add http_proxy as part of the docker build Add http_proxy and https_proxy as part of the docker build arguments in order to build properly when we are behind a proxy. Fixes #6834 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-05-15 15:57:29 +00:00
Peteris Rudzusiks	3e85bf5b17	resource-control: fix setting CPU affinities on Linux With this fix the vCPU pinning feature chooses the correct physical cores to pin the vCPU threads on rather than always using core 0. Fixes #6831 Signed-off-by: Peteris Rudzusiks <rye@stripe.com>	2023-05-15 16:46:36 +02:00
Pavel Mores	5f3f844a1e	runtime-rs: fix building instructions with respect to required Rust version Fixes: #6803 Signed-off-by: Pavel Mores <pmores@redhat.com>	2023-05-15 16:30:41 +02:00
Fabiano Fidêncio	9e83795fca	Merge pull request #6825 from fidencio/topic/kata-deploy-build-improvements kata-deploy: Build improvements	2023-05-15 13:49:15 +02:00
Fabiano Fidêncio	802cd2f673	Merge pull request #6821 from stevenhorsman/container-image-arch-consistency deploy: Fix arch in image tag	2023-05-15 11:16:01 +02:00
Fabiano Fidêncio	815b4e8dac	Merge pull request #6816 from fidencio/topic/kata-deploy-fixes Revert "kata-deploy: Use readinessProbe to ensure everything is ready"	2023-05-15 10:24:58 +02:00
Fabiano Fidêncio	777c3dc8d2	kata-deploy: Do not ship the kata tarball There's absolutely no reason to ship the kata-static tarball as part of the payload image, as: * The tarball is already part of the release process * The payload image already has uncompressed content of the tarball * The tarball itself is not used anywhere by the kata-deploy scripts Fixes: #6828 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-15 09:22:39 +02:00
LiuWeijie	50cc9c582f	tests: Improve coverage for virtcontainers/pkg/compatoci/ for Kata 2.0 Add test cases for ParseConfigJson function and GetContainerSpec function Fixes: #258 Signed-off-by: LiuWeijie <weijie.liu@intel.com>	2023-05-15 11:58:17 +08:00
Fabiano Fidêncio	136e2415da	static-build: Download firecracker instead of building it There's no reason for us to build firecracker instead of simply downloading the official released tarball, as tarballs are provided for the architectures we want to use them. Fixes: #6770 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-12 22:05:33 +02:00
Fabiano Fidêncio	3bf767cfcd	static-build: Adjust ARCH for nydus When building from aarch64, just use "arm64" as that's what's used in the name of the released nydus tarballs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-12 22:05:33 +02:00
Fabiano Fidêncio	ac88d34e0c	static-build: Use released binary for CLH (aarch64) There's no need to build Cloud Hypervisor aarch64 as, for a few releases already, Cloud Hypervisor provides an official release binary for the architecture. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-12 22:05:01 +02:00
Archana Shinde	32b39ee347	Merge pull request #6763 from nedsouza/266/tests_coverage_virtcontainers_fc virtcontainers: Improved test coverage for fc.go from 4.6% to 18.5%	2023-05-12 11:53:27 -07:00
James O. D. Hunt	73913c8eb7	kata-manager: Fix '-o' syntax and logic error Fix the syntax and logic error that is only displayed if the user runs the script with `-o`. This option requests that "only" Kata Containers is installed and stops containerd from being installed. Fixes: #6822. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-05-12 16:44:24 +01:00
stevenhorsman	2856d3f23d	deploy: Fix arch in image tag `uname -m` produces `x86_64`, but container image convention is to use `amd64`, so update this in the tag Fixes: #6820 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-05-12 16:14:19 +01:00
Fabiano Fidêncio	42dce15b1f	Merge pull request #6450 from singhwang/main main \| release: Fix multi-arch publishing is not supported	2023-05-12 15:25:59 +02:00
Fabiano Fidêncio	e8f81ee93d	Revert "kata-deploy: Use readinessProbe to ensure everything is ready" This reverts commit `5ec9ae0f04`, for two main reasons: * The readinessProbe was misinterpreted by myself when working on the original PR * It's actually causing issues, as the pod ends up marked as not healthy.	2023-05-12 14:28:23 +02:00
SinghWang	cfe63527c5	release: Fix multi-arch publishing is not supported When release is published, kata-deploy payload and kata-static package can support multi-arch publishing. Fixes: #6449 Signed-off-by: SinghWang <wangxin_0611@126.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-12 13:36:44 +02:00
Shuaiyi Zhang	197c336516	Dragonball: use LinuxBootConfigurator::write_bootparams to write the boot parameters into guest memory. Fixes: #6813 Signed-off-by: Shuaiyi Zhang <zhang_syi@qq.com>	2023-05-12 16:07:44 +08:00
Fabiano Fidêncio	181017d1d8	Merge pull request #6811 from fidencio/topic/yet-more-fixes-for-nvidia-gpu-kernels cache: More fixes to nvidia-gpu kernels caching	2023-05-12 10:02:08 +02:00
Amulya Meka	76f975e5e6	Merge pull request #6742 from Amulyam24/agent-build runtime: remove overriding ARCH value by default for ppc64le	2023-05-12 12:34:50 +05:30
Archana Shinde	20ac3917ad	Merge pull request #6739 from byron-marohn/fix_5561 gha: Fix Body Line Length action flagging empty body commit messages	2023-05-11 15:17:07 -07:00
Archana Shinde	1ad442e656	Merge pull request #6748 from nedsouza/fix-snap gha: Fix snap creation workflow	2023-05-11 15:09:22 -07:00
Fabiano Fidêncio	4d17ea4a01	cache: Fix nvidia-snp caching version All the kernel-foo instances, such as "kernel-sev" or "kernel-snp", should be transformed into "kernel.foo" when looking at the versions.yaml file. This was already done for SEV, but missed on the SNP case. Fixes: #6777 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-11 21:26:58 +02:00
Fabiano Fidêncio	a133fadbfa	cache: Fix nvidia-gpu-tdx-experimental cache URL We were passing "kernel-nvidia-gpu-tdx", missing the "-experimental" part, leading to a non-valid URL. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-11 21:20:06 +02:00
Fabiano Fidêncio	a7dd6cbadd	Merge pull request #6807 from fidencio/topic/fix-nvidia-gpu-cache cache: Fix nvidia-gpu version	2023-05-11 17:40:41 +02:00
Fabiano Fidêncio	b9990c2017	cache: Fix nvidia-gpu version `c9bf7808b6` introduced the logic to properly get the version of nvidia-gpu kernels, but one important part was dropped during the rebase into main, which is actually getting the correct version of the kernel. Fixing this now, and using the old issue as reference. Fixes: #6777 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-11 13:55:14 +02:00
Fabiano Fidêncio	14939d00ad	Merge pull request #6778 from fidencio/topic/cache-gpu-related-kernels cache: Update the KERNEL_FLAVOUR list to include nvidia-gpu	2023-05-11 13:14:45 +02:00
Fabiano Fidêncio	c9bf7808b6	cache: Update the KERNEL_FLAVOUR list to include nvidia-gpu We need to make sure that, when caching a `-nvidia-gpu` kernel, we still look at the version of the base kernel used to build the nvidia-gpu drivers, as the ${vendor}-gpu kernels are based on already existing entries in the versions.yaml file and do not require a new entry to be added. Fixes: #6777 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-11 10:56:13 +02:00
Fabiano Fidêncio	3665b42045	gpu: Rename `gpu` targets to `nvidia-gpu` This will make it easier for other GPU vendors to add the needed bits in the future. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-05-11 10:55:55 +02:00
Fabiano Fidêncio	edfaae85cb	Merge pull request #6700 from fitzthum/snp-artifacts packaging: Add SEV-SNP artifacts to main	2023-05-11 10:47:10 +02:00
James O. D. Hunt	fe33015075	Merge pull request #6794 from jodh-intel/docs-mark-snap-as-unmaintained docs: Mark snap installation method as unmaintained	2023-05-11 09:14:25 +01:00
Fabiano Fidêncio	c937d0a5d4	Merge pull request #6591 from UnmeshDeodhar/add-sev-artifacts-to-main packaging: Add sev artifacts to main	2023-05-11 09:09:36 +02:00
Tobin Feldman-Fitzthum	2c90cac751	local-build: fixup alphabetization A few pieces of the local-build tooling are supposed to be alphabetized. Fixup a couple minor issues that have accumulated. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2023-05-10 21:23:38 +00:00
Tobin Feldman-Fitzthum	4da6eb588d	kata-deploy: Add qemu-snp shim Now that we have the SNP components in place, make sure that kata-deploy knows about the qemu-snp shim so that it will be added to containerd config. Fixes: #6575 Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2023-05-10 20:55:36 +00:00
Tobin Feldman-Fitzthum	14dd053758	kata-deploy: add kata-qemu-snp runtimeclass Since SEV-SNP has limited hotplug support, increase the pod overhead to account for fixed resource usage. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2023-05-10 20:55:36 +00:00
Tobin Feldman-Fitzthum	0bb37bff78	config: Add SNP configuration SNP requires many specific configurations, so let's make a new SNP configuration file that we can use with the kata-qemu-snp runtime class. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com> Signed-off-by: Alex Carter <Alex.Carter@ibm.com>	2023-05-10 20:55:36 +00:00
Chelsea Mafrica	13f9ba2298	Merge pull request #6379 from cmaf/kata-ctl-check-kvm-1 kata-ctl: add generic kvm check & unit test	2023-05-10 13:33:57 -07:00
Tobin Feldman-Fitzthum	af7f2519bf	versions: update SEV kernel description SNP and SEV will share a (guest) kernel. Update the description in versions.yaml to mention this. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2023-05-10 20:27:12 +00:00
Tobin Feldman-Fitzthum	dbcc3b5cc8	local-build: fix default values for OVMF build Existing value has wrong name and compression type leading to installation failure. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2023-05-10 20:27:12 +00:00
Tobin Feldman-Fitzthum	b8bbe6325f	gha: build OVMF for tests and release The x86_64 package of OVMF is required for deployments that don't use kernel hashes, which includes SEV-SNP in the short term. We should keep this in the bundle in the long term in case someone wants to disable kernel hashes. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2023-05-10 20:27:12 +00:00
Tobin Feldman-Fitzthum	cf0ca265f9	local-build: Add x86_64 OVMF target Add targets to build the "plain" x86_64 OVMF. This will be used by anyone who is using SEV or SNP without kernel hashes. The SNP QEMU does not yet support kernel hashes so the OvmfPkg will be used by default. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com> Signed-off-by: Alex Carter <Alex.Carter@ibm.com>	2023-05-10 20:24:51 +00:00
Tobin Feldman-Fitzthum	db095ddeb4	cache: add SNP flavor to comments Update comments to include new SNP QEMU option Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2023-05-10 20:19:56 +00:00
Tobin Feldman-Fitzthum	f4ee00576a	gha: Build and ship QEMU for SNP Now that we can build SNP QEMU, let's do that for tests and release. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2023-05-10 20:19:56 +00:00
Tobin Feldman-Fitzthum	7a58a91fa6	docs: update SNP guide Since we reshuffled versions.yaml, update the guide so that we can find the SNP QEMU info. Once runtime support is merged we should overhaul or remove this guide, but let's keep it for now. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2023-05-10 20:19:56 +00:00
Tobin Feldman-Fitzthum	879333bfc7	versions: update SNP QEMU version Refactor SNP QEMU entry in versions.yaml to match qemu-experimental and qemu-tdx-experimental. Also, update the version of QEMU to what we are using in CCv0. This is the non-UPM QEMU and it does not have kernel hashes support. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com> Signed-off-by: Alex Carter <Alex.Carter@ibm.com>	2023-05-10 20:19:56 +00:00
Tobin Feldman-Fitzthum	38ce4a32af	local-build: add support to build QEMU for SEV-SNP Add Make targets and helper functions to build the QEMU needed for SEV-SNP. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com> Signed-off-by: Alex Carter <Alex.Carter@ibm.com>	2023-05-10 20:19:56 +00:00
Chelsea Mafrica	5f8008b69c	kata-ctl: add unit test for kvm check Check that kvm test fails when run as non-root and when device specified is not /dev/kvm. Fixes #5338 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-05-10 10:29:20 -07:00
Chelsea Mafrica	a085a6d7b4	kata-ctl: add generic kvm check Add kvm check using ioctl macro to create a syscall that checks the kvm api version and if creation of a vm is successful. Fixes #5338 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-05-10 10:29:20 -07:00
Unmesh Deodhar	772d4db262	gha: Build and ship SEV initrd We have code that builds initrd for SEV. Thus, adding that to the test and release process. Fixes: #6572 Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>	2023-05-10 12:19:56 -05:00
Unmesh Deodhar	45fa366926	gha: Build and ship SEV OVMF SEV requires special OVMF to work. Thus, building that for test and release. Fixes: #6572 Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>	2023-05-10 12:19:56 -05:00
Unmesh Deodhar	4770d3064a	gha: Build and ship SEV kernel. SEV requires custom kernel arguments when building. Thus, adding it to the test and release process. Fixes: #6572 Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>	2023-05-10 12:19:56 -05:00
Unmesh Deodhar	fb9c1fc36e	runtime: Add qemu-sev config Adding config file that can be used with qemu-sev runtime class. Since SEV has limited hotplug support, increase the pod overhead to account for fixed resource usage. Fixes: #6572 Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>	2023-05-10 12:19:56 -05:00
Unmesh Deodhar	813e4c576f	runtimeClasses: add sev runtime class Adding kata-qemu-sev runtime class. Fixes: #6572 Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>	2023-05-10 12:19:56 -05:00
Unmesh Deodhar	af18806a8d	static-build: Add caching support to sev ovmf SEV requires special OVMF. Now that we have ability to build this custom OVMF, let's optimize it by caching so that we don't have to build it for every run. Fixes: sev: #6572 Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>	2023-05-10 12:19:55 -05:00
Unmesh Deodhar	76ae7a3abe	packaging: adding caching capability for kernel The SEV initrd build requires kernel modules. So, for SEV case, we need to cache kernel modules tarball in addition to kernel tarball. Fixes: #6572 Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>	2023-05-10 12:19:55 -05:00
Unmesh Deodhar	12c5ef9020	packaging: add support to build OVMF for SEV SEV requires special OVMF to work with kernel hashes. Thus, adding changes that builds this custom OVMF for SEV. Fixes: #6572 Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>	2023-05-10 12:19:55 -05:00
Unmesh Deodhar	b87820ee8c	packaging: add support to build initrd for sev We need special initrd for SEV. The work on SEV initrd is based on Ubuntu. Thus, adding another entry in versions.yaml This binary will have '-sev' suffix to distinguish it from the generic binary. Fixes: #6572 Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>	2023-05-10 12:19:55 -05:00
James O. D. Hunt	e1f3b871cd	docs: Mark snap installation method as unmaintained The snap package is no longer being maintained so update the docs to warn readers. We'll remove the snap installation docs in a few weeks. See: #6769. Fixes: #6793. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-05-10 18:02:46 +01:00
Jeremi Piotrowski	022a33de92	agent: Add context to errors when AgentConfig file is missing When the agent config file is missing, the panic message says "no such file or directory" but doesn't inform the user about which file was missing. Add context to the parsing (with filename) and to the from_config_file() calls (with information where the path is coming from). Fixes: #6771 Depends-on: github.com/kata-containers/tests#5627 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-05-10 08:43:16 +02:00
Fabiano Fidêncio	6881b9558b	Merge pull request #6512 from gabevenberg/log-parser-rs Log-parser-rs	2023-05-10 08:22:59 +02:00
Chao Wu	7218229af0	Merge pull request #6594 from Apokleos/warning_fix_1.68.0 warning_fix: fix warnings when build with cargo-1.68.0	2023-05-10 09:51:45 +08:00
Unmesh Deodhar	b0e6a094be	packaging: Add sev kernel build capability Adding code that builds sev kernel. Fixes: #6572 Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>	2023-05-09 13:47:22 -05:00
Tim Zhang	b0b5d7082e	Merge pull request #6753 from amshinde/add-cross-building-with-cross cross-compile: Include documentation and configuration for cross-compile	2023-05-09 16:31:40 +08:00
Feng Wang	4e0dce6802	Merge pull request #6738 from fengwang666/oss-fix-fd-leak runtime: Fix virtiofs fd leak	2023-05-08 10:52:36 -07:00
Eduardo Berrocal	a4c0303d89	virtcontainers: Fixed static checks for improved test coverage for fc.go Expanded tests on fc_test.go to cover more lines of code. Coverage went from 4.6% to 18.5%. Fixed very simple static check fail on line 202. Fixes: #266 Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>	2023-05-07 00:17:36 -07:00
Peng Tao	65670e6b0a	Merge pull request #6699 from zvonkok/cold-plug-vfio gpu: cold plug VFIO devices	2023-05-05 10:04:29 +08:00
Archana Shinde	b86d32aba9	Merge pull request #6728 from nedsouza/256/tests_coverage_pkg_signals pkg/signals: Improved test coverage 60% to 100%	2023-05-04 16:19:12 -07:00
Archana Shinde	9443c4aea7	Merge pull request #6729 from nedsouza/259/tests_coverage_virtcontainers_persist virtcontainers/persist: Improved test coverage 65% to 87.5%	2023-05-04 16:18:55 -07:00
Archana Shinde	09134c30de	Merge pull request #6737 from nedsouza/265/virtcontainers-clh-go-coverage virtcontainers/clh_test.go: improve unit test coverage	2023-05-04 16:15:43 -07:00
Archana Shinde	8495f830b7	cross-compile: Include documentation and configuration for cross-compile `cross` is an open source tool that provides zero-setup cross compile for rust binaries. Add documentation on this tool for compiling kata-ctl tool and Cross.toml file that provides required configuration for installing dependencies for various targets. This is pretty useful for a developer to make sure code compiles and passes checks for various architectures. Fixes: #6765 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-05-04 14:13:00 -07:00
Bin Liu	e57ac2ae18	Merge pull request #6749 from nedsouza/260/tests_coverage_virtcontainers_factory virtcontainers/factory: Improved test coverage	2023-05-04 10:54:40 +08:00
Zvonko Kaiser	13d7f39c71	gpu: Check for VFIO port assignments Bailing out early if the port is wrong, allowed port settings are no-port, root-port, switch-port Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-05-03 12:32:33 +00:00
Gabe Venberg	6594a9329d	tools: made log-parser-rs Eventual replacement of kata-log-parser, but for now replicates its functionality for the new runtime-rs syntax. Takes in log files, parses, sorts by timestamp, spits them out in json, csv, xml, toml, and a few others. Fixes #5350 Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>	2023-05-02 13:16:54 -05:00
Wainer Moschetta	f5ff975560	Merge pull request #6723 from ryansavino/gha-k8s-also-test-snp gha: Also run k8s tests on qemu-snp	2023-05-01 10:37:12 -03:00
Fabiano Fidêncio	b6e54676eb	Merge pull request #6759 from ryansavino/gha-sev-kata-deploy-fix gha: sev: fix for kata-deploy error	2023-05-01 11:42:16 +02:00
Eduardo Berrocal	03a8cd69c2	virtcontainers: Improved test coverage for fc.go from 4.6% to 18.5% Expanded tests on fc_test.go to cover more lines of code. Coverage went from 4.6% to 18.5%. Fixes: #266 Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>	2023-04-28 15:40:45 -07:00
Ryan Savino	9e2b7ff177	gha: sev: fix for kata-deploy error kubectl commands need a '-f' instead of a '-k' Fixes: #6758 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2023-04-28 14:54:36 -05:00
Ryan Savino	5c9246db19	gha: Also run k8s tests on qemu-snp Added the k8s tests for qemu-snp Fixes: #6722 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2023-04-28 14:43:53 -05:00
Ryan Savino	c57a44436c	gha: Add the ability to test qemu-snp With the changes proposed as part of this PR, a qemu-snp cluster will be created but no tests will be performed. GitHub Actions will only run the tests using the workflows that are part of the target branch, instead of using the ones coming from the PR. No way to work around this for now. After this commit is merged, the tests (not the yaml files for the actions) will be altered in order for the checkout action to help in this case. Fixes: #6722 Signed-off-by: Ryan Savino <ryan.savino@amd.com>	2023-04-28 13:07:13 -05:00
Wainer Moschetta	29785a43d7	Merge pull request #6712 from ryansavino/gha-k8s-also-test-sev gha: Also run k8s tests on qemu-sev	2023-04-28 14:22:03 -03:00
Archana Shinde	65c61785fc	Merge pull request #6660 from amshinde/kata-ctl-cmd Implement the "kata-ctl env" command	2023-04-28 01:33:28 -07:00
Archana Shinde	4064192896	env: Utilize arch specific functionality to get cpu details Have kata-env call architecture specific function to get cpu details instead of generic function to get cpu details that works only for certain architectures. The functionality for cpu details has been fully implemented for x86_64 and arm architectures, but needs to be implemented for s390 and powerpc. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-27 16:45:41 -07:00
Archana Shinde	fb40c71a21	env: Check for root privileges Check for root privileges early on. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-27 16:45:41 -07:00
Archana Shinde	1016bc17b7	config: Add api to fetch config from default config path Add api to fetch config from default config path and use that in kata-ctl tool. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-27 16:45:41 -07:00
Archana Shinde	b908a780a0	kata-env: Pass cmd option for file path Add ability to write the environment information to a file or stdout if file path is absent. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-27 16:45:41 -07:00
Archana Shinde	b1920198be	config: Workaround the way agent and hypervisor configs are fetched This is essentially a workaround for the issue: https://github.com/kata-containers/kata-containers/issues/5954 runtime-rs changes the Kata config format adding agent_name and hypervisor_name which are then used as keys to fetch the agent and hypervisor configs. This will not work for older configs. So use the first entry in the hashmaps to fetch the configs as a workaround while the config change issue is resolved. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-27 16:45:41 -07:00
Archana Shinde	f2b2621dec	kata-env: Implement the kata-env command. Command implements functionality to get user environment settings. Fixes: #5339 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-27 16:45:41 -07:00
Ryan Savino	c849bdb0a5	gha: Also run k8s tests on qemu-sev Added the k8s tests for qemu-sev Fixes: #6711 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2023-04-27 15:24:08 -05:00
Eduardo Berrocal	6bf1fc6051	virtcontainers/factory: Improved test coverage Expanded tests on factory_test.go to cover more lines of code. Coverage went from 34% to 41.5% in the case of user-mode run tests, and from 77.7% to 84% in the case of privileged-mode run tests. Fixes: #260 Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>	2023-04-27 13:08:35 -07:00
Tamas K Lengyel	0d49ceee0b	gha: Fix snap creation workflow warnings Fix recurring issues of failing to install dependencies due to stale apt cache. Uprev actions/checkout to v3 to resolve issue "Node.js 12 actions are deprecated." Fixes: #5659 Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>	2023-04-27 18:40:02 +00:00
Zvonko Kaiser	138ada049c	gpu: Cold Plug VFIO toml setting Added the cold_plug_vfio setting to the qemu-toml.in with some explanation Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-27 11:04:45 +00:00
Amulyam24	defb643346	runtime: remove overriding ARCH value by default for ppc64le Currently, ARCH value is being set to powerpc64le by default. powerpc64le is only right in context of rust and any operation which might use this variable for a different purpose would fail on ppc64le. Fixes: #6741 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-04-27 16:17:48 +05:30
Zvonko Kaiser	f7ad75cb12	gpu: Cold-plug extend the api.md Make the hypervisorconfig consistent in code and api.md Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-27 09:35:05 +00:00
Zvonko Kaiser	0fec2e6986	gpu: Add cold-plug test Cold plug setting is now correctly decoded in toml Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-27 09:30:24 +00:00
Archana Shinde	f2ebdd81c2	utils: Get rid of spurious print statement left behind. The print was used for debugging, get rid of it. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-26 22:12:30 -07:00
Archana Shinde	9a94f1f149	make: Export VERSION and COMMIT These will be consumed by kata-ctl, so export these so that they can be used to replace variables available to the rust binary. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-26 22:12:30 -07:00
Archana Shinde	2f81f48dae	config: Add file under /opt as another location to look for the config Most of kata installation tools use this path for installation, so add this to the paths to look for the configuration.toml file. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-26 22:12:30 -07:00
Archana Shinde	07f7d17db5	config: Make the pipe_size field optional Add the serde default attribute to the field so that parsing can continue if this field is not present. The agent assumes a default value for this, so it is not required by the user to provide a value here. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-26 22:12:30 -07:00
Archana Shinde	68f6357731	config: Make function to get the default conf file public This will be used by the kata-env command. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-26 22:12:30 -07:00
Archana Shinde	7565b33568	kata-ctl: Implement Display trait for GuestProtection enum Implement Display for enum to display in env output. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-26 22:12:30 -07:00
Archana Shinde	94a00f9346	utils: Make certain constants in utils.rs public These would be used outside of utils. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-26 22:12:30 -07:00
Archana Shinde	572b338b3b	gitignore: Ignore .swp and .swo editor backup files Ignore temporary files created by vim editor. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-26 22:12:30 -07:00
Archana Shinde	376884b8a4	cargo: Update version of clap to 4.1.13 This version includes macros related to using command options. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-26 22:12:30 -07:00
alex.lyn	17daeb9dd7	warning_fix: fix warnings when build with cargo-1.68.0 Fixes: #6593 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-04-27 10:29:50 +08:00
Ryan Savino	521519d745	gha: Add the ability to test qemu-sev With the changes proposed as part of this PR, a qemu-sev cluster will be created but no tests will be performed. GitHub Actions will only run the tests using the workflows that are part of the target branch, instead of using the ones coming from the PR. No way to work around this for now. After this commit is merged, the tests (not the yaml files for the actions) will be altered in order for the checkout action to help in this case. Fixes: #6711 Signed-off-by: Ryan Savino <ryan.savino@amd.com>	2023-04-26 17:56:28 -05:00
Feng Wang	205909fbed	runtime: Fix virtiofs fd leak The kata runtime invokes removeStaleVirtiofsShareMounts after a container is stopped to clean up the stale virtiofs file caches. Fixes: #6455 Signed-off-by: Feng Wang <fwang@confluent.io>	2023-04-26 15:53:39 -07:00
Byron Marohn	5226f15c84	gha: Fix Body Line Length action flagging empty body commit messages Change the Body Line Length workflow to not trigger when the commit message contains only a message without a body. Other workflows will flag the missing body sections, and it was confusing to have an error message that said 'Body line too long (max 150)' when this was not actually the case. Fixes: #5561 Co-authored-by: Jayant Singh <jayant.singh@intel.com> Co-authored-by: Luke Phillips <lucas.phillips@intel.com> Signed-off-by: Byron Marohn <byron.marohn@intel.com> Signed-off-by: Jayant Singh <jayant.singh@intel.com> Signed-off-by: Luke Phillips <lucas.phillips@intel.com> Signed-off-by: Kelby Madal-Hellmuth <kelby.madal-hellmuth@intel.com> Signed-off-by: Liz Lawrens <liz.lawrens@intel.com>	2023-04-26 17:29:16 -04:00
Tamas K Lengyel	0f45b0faa9	virtcontainers/clh_test.go: improve unit test coverage Credit PR to Hackathon Team3 Fixes: #265 Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>	2023-04-26 19:12:51 +00:00
Zvonko Kaiser	dded731db3	gpu: Add OVMF setting for MMIO aperture The default size of OVMF's aperture is too low to initialize PCIe devices with huge BARs Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-26 09:47:37 +00:00
Zvonko Kaiser	2a830177ca	gpu: Add fwcfg helper function Added driver util function for easier handling of VFIO devices outside of the VFIO module. At the sandbox level we may need to set options depending if we have a VFIO/PCIe device, like the fwCfg for confidential guests. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-26 09:47:37 +00:00
Zvonko Kaiser	131f056a12	gpu: Extract VFIO Functions to drivers Some functions may be used in other modules than only in the VFIO module, extract them and make them available to other layers like sandbox. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-26 09:47:37 +00:00
Zvonko Kaiser	c8cf7ed3bc	gpu: Add ColdPlug of VFIO devices with devManager If we have a VFIO device and cold-plug is enabled we mark each device as ColdPlug=true and let the VFIO module do the attaching. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-26 09:47:37 +00:00
Zvonko Kaiser	e2b5e7f73b	gpu: Add Rawdevices to hypervisor RawDevices are used to get PCIe device info early before the sandbox is started to make better PCIe topology decisions Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-26 09:47:37 +00:00
Zvonko Kaiser	6107c32d70	gpu: Assign default value to cold-plug Make sure the configuration is propagated to the right structs and the default value is assigned. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-26 09:47:37 +00:00
Zvonko Kaiser	377ebc2ad1	gpu: Add configuration option for cold-plug VFIO Users can set cold-plug="root-port" to cold plug a VFIO device in QEMU Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-26 09:47:37 +00:00
Zvonko Kaiser	c18ceae109	gpu: Add new struct PCIePort For the hypervisor to distinguish between PCIe components, adding a new enum that can be used for hot-plug and cold-plug of PCIe devices Fixes: #6687 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-26 09:47:37 +00:00
Bin Liu	509bc8b6c8	Merge pull request #6718 from openanolis/mengze/keep_abnormal runtime-rs: support keep_abnormal in toml config	2023-04-26 12:36:52 +08:00
Bin Liu	b6d880510a	Merge pull request #6595 from zvonkok/gpu-snp-tdx-kernel gpu: Build and Ship a GPU enabled Kernel	2023-04-26 12:33:51 +08:00
Eduardo Berrocal	9c38204f13	virtcontainers/persist: Improved test coverage 65% to 87.5% Expanded tests on manager_test.go to cover more lines of code. Fixes: #259 Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>	2023-04-25 23:53:46 +00:00
Eduardo Berrocal	1c1ee8057c	pkg/signals: Improved test coverage 60% to 100% Expanded tests on signals_test.go to cover more lines of code. 'go test' won't show 100% coverage (only 66.7%), because one test need to spawn a new process (since it is testing a function that calls os.Exit(1)). Fixes: #256 Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>	2023-04-25 23:34:13 +00:00
mengze	cc8ea3232e	runtime-rs: support keep_abnormal in toml config This patch adds keep_abnormal in runtime config. If keep_abnormal = true, it means that 1) if the runtime exits abnormally, the cleanup process will be skipped, and 2) the runtime will not exit even if the health check fails. This option is typically used to retain abnormal information for debugging and should NOT be enabled by default. Fixes: #6717 Signed-off-by: mengze <mengze@linux.alibaba.com> Signed-off-by: quanweiZhou <quanweiZhou@linux.alibaba.com>	2023-04-25 13:47:44 +08:00
David Esparza	7fdaab49bc	Merge pull request #6295 from dborquez/add_kernel_module_checks_kvm kata-ctl: checks for kvm, kvm_intel modules loaded	2023-04-24 13:33:18 -06:00
Greg Kurz	0ca6d3b726	Merge pull request #6681 from Vlad1mir-D/6677-fix-kata-agent-dbus-connection osbuilder: Fix D-Bus enabling in the dracut case	2023-04-24 17:31:13 +02:00
Bin Liu	3d8688f92e	Merge pull request #6620 from jongwu/docker_fail_start_snap snap: fix docker start fail issue	2023-04-24 10:53:16 +08:00
Archana Shinde	97291d88e9	Merge pull request #6696 from amshinde/kata-manager-containerd-fix kata-manager: Fix containerd download	2023-04-21 09:54:30 -07:00
Archana Shinde	96e8470dbe	kata-manager: Fix containerd download Newer containerd releases have an additional static package published. Because of this, download_url contains two urls causing curl to fail. To resolve this, pick the first url from the containerd releases to download containerd. Fixes: #6695 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-04-20 23:08:51 -07:00
David Esparza	432d407440	kata-ctl: checks for kvm, kvm_intel modules loaded Ensure that kvm and kvm_intel modules are loaded. Renames the get_cpu_info() function to read_file_contents() Fixes #5332 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-04-20 11:29:36 -06:00
Zvonko Kaiser	b1730e4a67	gpu: Add new kernel build option to usage() With each release make sure we ship a GPU enabled kernel Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-20 07:48:30 +00:00
Fupan Li	ceefd50bd0	Merge pull request #6680 from Tim-Zhang/fix-ut-bad-fd agent: Fix ut issue caused by fd double closed	2023-04-20 11:18:27 +08:00
Fupan Li	a7b4b69230	Merge pull request #6673 from Tim-Zhang/upgrade-ttrpc-protobuf Bump ttrpc to 0.7.2 and protobuf to 3.2.0	2023-04-20 10:13:43 +08:00
Fupan Li	a1568cd2f5	Merge pull request #6676 from zvonkok/gpu-runtime gpu: Add GPU enabled configuration and runtime	2023-04-19 13:01:49 +08:00
Vladimir	3e7b902265	osbuilder: Fix D-Bus enabling in the dracut case - D-Bus enabling now occurs only in setup_rootfs (instead of prepare_overlay and setup_rootfs) - Adjust permissions of / so dbus-broker will be able to traverse FS These changes enable kata-agent to successfully communicate with D-Bus. Fixes #6677 Signed-off-by: Vladimir <amigo.elite@gmail.com>	2023-04-18 23:17:34 +03:00
Tim Zhang	53c749a9de	agent: Fix ut issue caused by fd double closed Never ever try to close the same fd twice, even in a unit test. A file descriptor is a number which will be reused, so when you close the same number twice you may close another file descriptor the second time and then there will be an error 'Bad file descriptor (os error 9)' while the wrongly closed fd is being used. Fixes: #6679 Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-04-18 23:19:10 +08:00
Hyounggyu Choi	5c032c64ac	Merge pull request #6664 from zvonkok/vfio-fix gpu: Do not pass-through PCI (Host) Bridges	2023-04-18 19:50:15 +09:00
Tim Zhang	2e3f19af92	agent: fix clippy warnings caused by protobuf3 Fix warnings introduced by protobuf upgrade. Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-04-17 20:15:49 +08:00
Tim Zhang	4849c56faa	agent: Fix unit test issue caused by protobuf upgrade Fixes: #6646 Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-04-17 19:49:21 +08:00
Tim Zhang	0a582f7815	trace-forwarder: remove unused crate protobuf Remove unused crate protobuf. Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-04-17 19:49:21 +08:00
Tim Zhang	73253850e6	kata-ctl: remove unused crate ttrpc Remove unused crate ttrpc. Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-04-17 19:49:21 +08:00
Tim Zhang	76d2e30547	agent-ctl: Bump ttrpc from 0.6.0 to 0.7.1 Fixes: #6646 Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-04-17 19:49:21 +08:00
Tim Zhang	eb3d20dccb	protocols: Add ut for Serde Fixes: #6646 Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-04-17 19:49:21 +08:00
Tim Zhang	59568c79dd	protocols: add support for Serde rust-protobuf@3 does not support Serde natively anymore. So we need to do it by ourselves. Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-04-17 19:49:21 +08:00
Tim Zhang	a6b4d92c84	runtime-rs: Bump ttrpc from 0.6.0 to 0.7.1 Fixes: #6646 Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-04-17 19:49:20 +08:00
Zvonko Kaiser	ac7c63bc66	gpu: Add containerd shim for qemu-gpu Last but not least add the containerd shim configuration pointing to the correct configuration-<shim>.toml Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-17 10:45:04 +00:00
Zvonko Kaiser	a0cc8a75f2	gpu: Add a kube runtime class With the added configuration add the corresponding kube runtime class. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-17 10:42:04 +00:00
Zvonko Kaiser	a81fff706f	gpu: Adding a GPU enabled configuration We need to set hotplug on pci root port and enable at least one root port. Also set the guest-hooks-dir to the correct path Fixes: #6675 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-17 10:40:09 +00:00
Tim Zhang	8af6fc77cd	agent: Bump ttrpc from 0.6.0 to 0.7.1 Fixes: #6646 Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-04-17 18:31:41 +08:00
Tim Zhang	009b42dbff	protocols: Fix unit test Fixes: #6646 Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-04-17 18:31:41 +08:00
Tim Zhang	392732e213	protocols: Bump ttrpc from 0.6.0 to 0.7.1 Fixes: #6646 Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-04-17 18:31:35 +08:00
Zvonko Kaiser	f4f958d53c	gpu: Do not pass-through PCI (Host) Bridges On some systems a GPU is in an IOMMU group with a PCI Bridge and PCI Host Bridge. Per default no PCI Bridge needs to be passed-through. When scanning the IOMMU group, ignore devices with a 0x60 class ID prefix. Fixes: #6663 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-17 10:08:23 +00:00
Zvonko Kaiser	825e769483	gpu: Add GPU support to default kernel without any TEE With each release make sure we ship a GPU enabled kernel Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-17 09:58:58 +00:00
Zvonko Kaiser	e4ee07f7d4	gpu: Add GPU TDX experimental kernel With each release make sure we ship a GPU and TEE enabled kernel This adds tdx-experimental kernel support Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-17 09:58:52 +00:00
Fabiano Fidêncio	243cb2e3af	Merge pull request #6670 from fidencio/topic/fix-caching-of-tdvf-and-tdx-qemu cache-components: Fix caching of TDVF and QEMU for TDX	2023-04-16 09:04:04 +02:00
Fabiano Fidêncio	a1272bcf1d	gha: tdx: Fix typo overlay -> overlays The beauty of GHA not allowing us to easily test changes in the yaml files as part of the PR has hit us again. :-/ The correct path for the k3s deployment is tools/packaging/kata-deploy/kata-deploy/overlays/k3s instead of tools/packaging/kata-deploy/kata-deploy/overlay/k3s. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-15 15:00:06 +02:00
Fabiano Fidêncio	3fa0890e5e	cache-components: Fix TDVF caching TDVF caching is not working as the tarball name is incorrect. The result expected is kata-static-tdvf.tar.xz, but it's looking for kata-static-tdx.tar.xz. This happens as a logic to convert tdx -> tdvf has been added as part of the building scripts, but I missed doing this as part of the caching scripts. Fixes: #6669 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-15 14:12:29 +02:00
Fabiano Fidêncio	80e3a2d408	cache-components: Fix TDX QEMU caching TDX QEMU caching is not working as expected, as we're checking for its version looking at "assets.hypervisor.${QEMU_FLAVOUR}.version", which is correct for standard QEMU. However, for TDX QEMU we should be checking for "assets.hypervisor.${QEMU_FLAVOUR}.tag" Fixes: #6668 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-15 14:12:26 +02:00
Fabiano Fidêncio	fffe2c6082	Merge pull request #6648 from fidencio/topic/gha-tdx-improvements-and-fixes gha: tdx: Ensure kata-deploy is removed after the tests run	2023-04-15 00:21:31 +02:00
Bo Chen	a819ce145f	Merge pull request #6633 from likebreath/0406/clh_v31.0 versions: Upgrade to Cloud Hypervisor v31.0	2023-04-14 13:52:19 -07:00
Zvonko Kaiser	87ea43cd4e	gpu: Add configuration fragment Adding configuration fragment for the kernel, depending on the TEE kernel update the LOCALVERSION Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-14 07:52:51 +00:00
Zvonko Kaiser	aca6ff7289	gpu: Build and Ship a GPU enabled Kernel With each release make sure we ship a GPU and TEE enabled kernel Fixes: #6553 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-14 07:52:42 +00:00
Fabiano Fidêncio	dc662333df	runtime: Increase the dial_timeout When testing on AKS, we've been hitting the dial_timeout every now and then. Let's increase it to 45 seconds (instead of 30) for all the VMMs, and to 60 seconds in case of TEEs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-13 22:42:52 +02:00
Greg Kurz	897c0bc67e	Merge pull request #6658 from gkurz/osbuilder-dracut-dbus osbuilder: Enable dbus in the dracut case	2023-04-13 19:03:15 +02:00
Greg Kurz	eb1762e813	osbuilder: Enable dbus in the dracut case The agent now offloads cgroup configuration to systemd when possible. This requires to enable D-Bus in order to communicate with systemd. Fixes #6657 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-04-13 14:16:50 +02:00
Greg Kurz	f9a94f8fc5	Merge pull request #6623 from UiPath/fix-no-space-device runtime: Don't create socket file in /run/kata	2023-04-13 10:36:20 +02:00
Fabiano Fidêncio	f478b9115e	clh: tdx: Update timeouts for confidential guest Booting up TDX takes more time than booting up a normal VM. Those values are being already used as part of the CCv0 branch, and we're just bringing them to the `main` branch as well. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-13 10:18:07 +02:00
Fabiano Fidêncio	3b76abb366	kata-deploy: Ensure node is ready after CRI Engine restart Let's ensure the node is ready after the CRI Engine restart, otherwise we may proceed and scripts may simply fail if they try to deploy a pod while the CRI Engine is not yet restarted (and, consequently, the node is not Ready). Related: #6649 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-13 10:18:07 +02:00
Fabiano Fidêncio	5ec9ae0f04	kata-deploy: Use readinessProbe to ensure everything is ready readinessProbe will help us to only have the kata-deploy pod marked as Ready when it finishes all the needed configurations in the node. Related: #6649 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-13 10:18:07 +02:00
Fabiano Fidêncio	ea386700fe	kata-deploy: Update podOverhead for TDX As TEEs cannot hotplug memory / CPU, we must consider the default values for those as part of the podOverhead. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-13 10:18:07 +02:00
Fabiano Fidêncio	e31efc861c	gha: tdx: Use the k3s overlay As the TDX machine is using k3s, let's make sure we're deploying kata-deploy using the k3s overlay. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-13 10:18:07 +02:00
Fabiano Fidêncio	542bb0f3f3	gha: tdx: Set KUBECONFIG env at the job level By doing this we avoid having to set it up on every step. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-13 10:18:07 +02:00
Fabiano Fidêncio	d7fdf19e9b	gha: tdx: Delete kata-deploy after the tests finish We must ensure that no kata-deploy is left behind after the tests finish, otherwise it may interfere with the next run. Fixes: #6647 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-13 10:18:07 +02:00
Fabiano Fidêncio	da35241a91	tests: k8s: Skip k8s-cpu-ns when testing TDX TEEs do not support CPU / memory hotplug, thus this test must be skipped. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-13 10:18:07 +02:00
Alexandru Matei	db2cac34d8	runtime: Don't create socket file in /run/kata The socket file for shim management is created in /run/kata and it isn't deleted after the container is stopped. After running and stopping thousands of containers /run folder will run out of space. Fixes #6622 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com> Co-authored-by: Greg Kurz <groug@kaod.org>	2023-04-13 10:21:29 +03:00
Jianyong Wu	6d315719f0	snap: fix docker start fail issue In Arm baseline CI, docker fails to start with error: "no sockets found via socket activation: make sure the service was started by systemd". I found a solution in [1] to fix it. [1] https://forums.docker.com/t/failed-to-load-listeners-no-sockets-found-via-socket-activation-make-sure-the-service-was-started-by-systemd/62505 Fixes: #6619 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-04-13 09:35:40 +08:00
Zhongtao Hu	328793bb27	Merge pull request #6585 from Apokleos/nydus_prefetch_files nydus_rootfs/prefetch_files: add prefetch_files for RAFS	2023-04-12 19:58:36 +08:00
Zvonko Kaiser	e4b3b08871	gpu: Add proper CONFIG_LOCALVERSION depending on TEE If conf_guest is set we need to update the CONFIG_LOCALVERSION to match the suffix created in install_kata -nvidia-gpu-{snp\|tdx}, the linux headers will be named the very same if built with make deb-pkg for TDX or SNP. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-12 11:30:59 +00:00
Zhongtao Hu	fef531f565	Merge pull request #6618 from Apokleos/virtiofs_extra_cache_mode runtime-rs/virtio-fs: add support extra handler for cache mode.	2023-04-12 14:40:05 +08:00
Bin Liu	9327bb0912	Merge pull request #6639 from openanolis/nerdctl runtime-rs: enable nerdctl to setup cni plugin	2023-04-12 12:04:37 +08:00
Zhongtao Hu	69ba2098f8	runtime-rs: remove network entities and netns remove network entities and netns Fixes:#4693 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-04-12 10:21:06 +08:00
Zhongtao Hu	b31f103d12	runtime-rs: enable nerdctl cni plugin 1. when we use nerdctl to setup network for kata, no netns is created by nerdctl, kata need to create netns by its own 2. after start VM, nerdctl will call cni plugin via oci hook, we need to rescan the netns after the interfaces have been created, and hotplug the network device into the VM Fixes:#4693 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-04-12 10:21:04 +08:00
Fabiano Fidêncio	3b3656d96d	Merge pull request #6522 from fidencio/topic/add-tdx-artefacts-from-2023ww01-to-main tdx: Add artefacts from the latest TDX tools release into main	2023-04-11 20:43:02 +02:00
Fabiano Fidêncio	50ce33b02d	Merge pull request #6205 from fengwang666/non-root-clh runtime: support non-root for clh	2023-04-11 19:34:00 +02:00
Fabiano Fidêncio	4751adbea1	Merge pull request #6610 from fidencio/topic/gha-run-dragonball-k8s-tests gha: ci-on-push: Run k8s tests with dragonball	2023-04-11 18:16:14 +02:00
Fabiano Fidêncio	69d7a959c8	gha: ci-on-push: Run tests on TDX Now that we've added a TDX capable external runner, let's make sure we also run the basic tests using TDX. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 16:10:35 +02:00
Fabiano Fidêncio	5a0727ecb4	kata-deploy: Ship kata-qemu-tdx runtimeClass Let's make sure we configure containerd for the kata-qemu-tdx handler and ship the kata-qemu-tdx runtime class for kubernetes. Fixes: #6537 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 16:10:35 +02:00
Fabiano Fidêncio	98682805be	config: Add configuration for QEMU TDX As the QEMU configuration for TDX differs quite a lot from the normal QEMU configuration, let's add a new configuration file for the QEMU TDX. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 16:10:35 +02:00
Fabiano Fidêncio	3e15800199	govmm: Directly pass the firmware using -bios with TDX Since TDX doesn't support readonly memslot, TDVF cannot be mapped as pflash device and it actually works as RAM. "-bios" option is chosen to load TDVF. OVMF is the opensource firmware that implements the TDVF support. Thus the command line to specify and load TDVF is ``-bios OVMF.fd`` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	3c5ffb0c85	govmm: Set "sept-ve-disable=on" This is needed since 22ww49. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	ed145365ec	runtime/qemu: Drop "kvm-type=tdx" This is not supported since 22ww49. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	25b3cdd38c	virtcontainers: Drop check for the `tdx` CPU flag In the recent kernels provided by Intel the `tdx` CPU flag is not present anymore. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	01bdacb4e4	virtcontainers: Also check /sys/firmwares/tdx for TDX Let's make sure we also check /sys/firmwares/tdx for TDX guest protection, as the location may depend on whether TDX Seam is being used or not. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	9feec533ce	cache: Add ability to cache OVMF Let's add the ability to cache OVMF, which right now we're only building and shipping it for TDX. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	ce8d982512	gha: Build and ship the OVMF for TDX Let's build the OVMF with TDX support as part of our tests, and let's ship it as part of our releases. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	39c3fab7b1	local-build: Add support to build OVMF for TDX Let's add the needed targets and modifications to be able to build OVMF for TDX as part of the local-build scripts. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	054174d3e6	versions: Bump OVMF for TDX Let's update the OVMF for TDX version to what's the latest tested release of the Intel TDX tools with Kata Containers. This change requires a newer version of `nasm` than the one provided by the container used to build the project. This change will also be needed for SEV-SNP and was originally done by Alex Carter (thanks!). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Alex Carter <Alex.Carter@ibm.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	800fb49da1	packaging: Add get_ovmf_image_name() helper As we'll be using this from different places in the near future, let's create a helper function as part of the libs.sh. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	fbf03d7aca	cache: Document kernel-tdx-experimental Let's make users of cache_components_main.sh aware that they can also cache the kernel-tdx-experimental builds. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	5d79e96966	cache: Add a space to ease the reading of the kernel flavours Right now it's quite hard to read those, let's improve it a little bit. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	6e4726e454	cache: Fix typos Let's just fix a few simple typos: * kernek -> kernel * experimetnal -> experimental Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	fc22ed0a8a	gha: Build and ship the Kernel for TDX Let's build the kernel with TDX support as part of our tests, and let's ship it as part of our releases. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	502844ced9	local-build: Add support to build Kernel for TDX Let's add the needed targets and modifications to be able to build kernel-tdx-experimental as part of the local-build scripts. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	b2585eecff	local-build: Avoid code duplication building the kernel Let's create a `install_kernel_helper()` function, as it was already done for QEMU, and rely on that when calling `install_kernel` and `install_kernel_dragonball_experimental`. This helps us to reduce the code duplication by a fair amount. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	f33345c311	versions: Update Kernel TDX version Let's update the Kernel TDX version to what's the latest tested release of the Intel TDX tools with Kata Containers. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	20ab2c2420	versions: Move Kernel TDX to its own experimental entry Although we've been providing users a way to build kernel with TDX support, this must be moved to its own experimental entry instead of how it currently is. The reason for that is because the patches are not yet merged into kernel, and this is still an experimental build of the project. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	3d9ce3982b	cache: Allow specifying the QEMU_FLAVOUR Let's do what we already did when caching the kernel, and allow passing a FLAVOUR of the project to build. By doing this we can re-use the same function used to cache QEMU to also cache any kind of experimental QEMU that we may happen to have. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	33dc6c65aa	gha: Build and ship QEMU for TDX Let's build QEMU TDX as part of our tests, and let's ship it as part of our releases. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	eceaae30a5	local-build: Add support to build QEMU for TDX Let's add the needed targets and modifications to be able to build qemu-tdx-experimental as part of the local-build scripts. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:23:42 +02:00
Fabiano Fidêncio	f7b7c187ec	static-build: Improve qemu-experimental build script Let's make sure the `qemu_suffix` and `qemu_tarball_name` can be specified. With this we make it really easy to reuse this script for any additional flavour of an experimental QEMU that ends up having to be built (specifically looking at the ones for Confidential Containers here). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:17:04 +02:00
Fabiano Fidêncio	3018c9ad51	versions: Update QEMU TDX version Let's update the QEMU TDX version to what's the latest tested release of the Intel TDX tools with Kata Containers. In order to do such update, we had to relax the checks on the QEMU version for some of the configuration options, as those were removed right after the window was open for the 7.1.0 development (thus the 7.0.50 check). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:17:04 +02:00
Fabiano Fidêncio	800ee5cd88	versions: Move QEMU TDX to its own experimental entry Although we've been providing users a way to build QEMU with TDX support, this must be moved to its own experimental entry instead of how it currently is. The reason for that is because the patches are not yet merged into QEMU, and this is still an experimental build of the project. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:17:04 +02:00
Fabiano Fidêncio	1315bb45f9	local-build: Add dragonball kernel to the `all` target As the dragonball kernel is shipped as part of our releases, it must be added to the `all` target. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:17:04 +02:00
Fabiano Fidêncio	73e108136a	local-build: Rename non vanilla kernel build functions In order to make it easier to read, let's just rename the install_dragonball_experimental_kernel and install_experimental_kernel to install_kernel_dragonball_experimental and install_kernel_experimental, respectively. This allows us to quickly get to those functions when looking for `install_kernel`. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:17:04 +02:00
Fabiano Fidêncio	1d851b4be3	local-build: Cosmetic changes in build targets This is a simple cosmetic change, adding a space between the function call and the `;;`. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 15:17:04 +02:00
Fabiano Fidêncio	49ce685ebf	gha: k8s-on-aks: Always delete the AKS cluster Regardless of the tests succeeding or failing, the AKS cluster must be deleted. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 13:40:40 +02:00
Fabiano Fidêncio	e2a770df55	gha: ci-on-push: Run k8s tests with dragonball Now that the infra for running dragonball tests has been enabled, let's actually make sure to have them running on each PR. The tests skipped are: * `k8s-cpu-ns.bats`, as CPU resize doesn't seem to be yet properly supported on runtime-rs * https://github.com/kata-containers/kata-containers/issues/6621 Fixes: #6605 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-11 11:47:47 +02:00
Fabiano Fidêncio	aee6174a53	Merge pull request #6637 from gkurz/cpu-shares-to-weight rustjail: Use CPUWeight with systemd and CgroupsV2	2023-04-11 10:55:48 +02:00
GabyCT	dc74133e74	Merge pull request #6631 from fidencio/topic/gha-create-delete-aks-cannot-be-workflows gha: k8s-on-aks: {create,delete} AKS must be a coded-in step	2023-04-10 14:05:24 -06:00
Zhongtao Hu	8cdec5707e	Merge pull request #6540 from houstar/main docs: update the rust version from version.yaml	2023-04-10 16:53:21 +08:00
Qingyuan Hou	d1f550bd1e	docs: update the rust version from versions.yaml Fixes: #6539 Signed-off-by: Qingyuan Hou <lenohou@gmail.com>	2023-04-10 03:34:15 +00:00
alex.lyn	f3595e48b0	nydus_rootfs/prefetch_files: add prefetch_files for RAFS A sandbox annotation is used to specify the prefetch_files.list path for the container image being used, and the runtime will pass it to the hypervisor to search for the corresponding prefetch file. The format looks like: "io.katacontainers.config.hypervisor.prefetch_files.list" = /path/to/<uid>/xyz.com/fedora:36/prefetch_file.list Fixes: #6582 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-04-10 10:05:52 +08:00
Zhongtao Hu	3bfaafbf44	fix: oci hook 1. when do the deserialization for the oci hook, we should use camel case for createRuntime 2. we should pass the dir of bundle path instead of the path of config.json Fixes:#4693 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-04-10 09:53:43 +08:00
Greg Kurz	c1fbaae8d6	rustjail: Use CPUWeight with systemd and CgroupsV2 The CPU shares property belongs to CgroupsV1. CgroupsV2 uses CPU weight instead. The correct value is computed in the latter case but it is passed to systemd using the legacy property. Systemd rejects the request and the agent exits with the following error : Value specified in CPUShares is out of range: unknown Replace the "shares" wording with "weight" in the CgroupsV2 code to avoid confusion. Use the "CPUWeight" property since this is what systemd expects in this case. Fixes #6636 References: https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#CPUWeight=weight https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#systemd%20252 https://github.com/containers/crun/blob/main/crun.1.md#cpu-controller Signed-off-by: Greg Kurz <groug@kaod.org>	2023-04-07 17:57:26 +02:00
Bo Chen	375187e045	versions: Upgrade to Cloud Hypervisor v31.0 Details of this release can be found in our new roadmap project as iteration v31.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #6632 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-06 14:35:26 -07:00
Fabiano Fidêncio	79f3047f06	gha: k8s-on-aks: {create,delete} AKS must be a coded-in step I should have seen this coming, but currently the "create" and "delete" AKS workflows cannot be imported and used as a job's step, resulting in an error trying to find the corresponding action.yaml file for those. Fixes: #6630 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-06 22:56:08 +02:00
Fabiano Fidêncio	ee5dda012b	Merge pull request #6629 from fidencio/topic/gha-refactor-run-k8s-tests-on-aks gha: k8s-on-aks: Set {create,delete}_aks as steps	2023-04-06 22:02:34 +02:00
Fabiano Fidêncio	2f35b4d4e5	gha: ci-on-push: Only run on `main` branch Let's ensure we're only running this workflow when PRs are opened against the main branch. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-06 19:11:24 +02:00
Fabiano Fidêncio	e7bd2545ef	Revert "gha: ci-on-push: Depend on Commit Message Check" This reverts commit `a159ffdba7`. Unfortunately we have to revert the PRs related to the switch done to using `workflow_run` instead of `pull_request_target`. The reason for that being that we can only mark jobs as required if they are targetting PRs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-06 19:11:14 +02:00
Fabiano Fidêncio	0d96d49633	Revert "gha: ci-on-push: Adjust to using workflow_run" This reverts commit `3a760a157a`. Unfortunately we have to revert the PRs related to the switch done to using `workflow_run` instead of `pull_request_target`. The reason for that being that we can only mark jobs as required if they are targetting PRs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-06 19:11:06 +02:00
Fabiano Fidêncio	c7ee45f7e5	Revert "gha: ci-on-push: Adapt chained jobs to workflow_run" This reverts commit `7855b43062`. Unfortunately we have to revert the PRs related to the switch done to using `workflow_run` instead of `pull_request_target`. The reason for that being that we can only mark jobs as required if they are targetting PRs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-06 19:09:54 +02:00
Fabiano Fidêncio	5d4d720647	Revert "gha: k8s-on-aks: Fix cluster name" This reverts commit `85cc5bb534`. Unfortunately we have to revert the PRs related to the switch done to using `workflow_run` instead of `pull_request_target`. The reason for that being that we can only mark jobs as required if they are targetting PRs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-06 19:07:04 +02:00
Fabiano Fidêncio	13d857a56d	gha: k8s-on-aks: Set {create,delete}_aks as steps We've been currently using {create,delete}_aks as jobs. However, it means that if the tests fail we'll end up deleting the AKS cluster (as expected), but not having a way to recreate the cluster without re-running all jobs, which is a waste of resources. Fixes: #6628 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-06 16:54:15 +02:00
Fabiano Fidêncio	abaf881f4a	Merge pull request #6612 from fidencio/topic/gha-k8s-on-aks-fix-cluster-name gha: k8s-on-aks: Fix cluster name	2023-04-06 10:48:38 +02:00
alex.lyn	dc6569dbbc	runtime-rs/virtio-fs: add support extra handler for cache mode. Add support for virtiofsd when users specify virtio_fs_extra_args with "-o cache auto, ...". Fixes: #6615 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-04-06 16:31:02 +08:00
Fabiano Fidêncio	85cc5bb534	gha: k8s-on-aks: Fix cluster name This was missed from the last series, as GHA will use the "target branch" yaml file to start the workflow. Basically we changed the name of the cluster created to stop relying on the PR number, as that's not easily accessible on `workflow_run`. Fixes: #6611 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-06 08:50:07 +02:00
Fabiano Fidêncio	68cb5689f5	Merge pull request #6584 from fidencio/topic/gha-k8s-also-test-dragonball gha: Also run k8s tests on AKS with dragonball	2023-04-05 22:50:14 +02:00
Fabiano Fidêncio	ae488cc09f	Merge pull request #6596 from fidencio/topic/gha-only-push-to-registry-when-merging-content gha: Only push images to registry after merging a PR	2023-04-05 22:07:13 +02:00
Fabiano Fidêncio	2c38e17ef0	Merge pull request #6607 from fidencio/topic/gha-switch-to-using-a-D4_v5-instance gha: aks: Use D4s_v5 instance	2023-04-05 22:06:40 +02:00
Archana Shinde	6af52cef3a	Merge pull request #6590 from zvonkok/build-kernel-fix tools: Avoid building the kernel twice	2023-04-05 11:45:59 -07:00
Greg Kurz	a3e3b0591f	Merge pull request #6562 from c3d/issue/6561-unwrap-panic rustjail: Fix panic when cgroup manager fails	2023-04-05 16:58:13 +02:00
James O. D. Hunt	cbe6f04194	Merge pull request #6501 from shippomx/dev_metrics runtime: add filter metrics with specific names	2023-04-05 15:15:09 +01:00
Fabiano Fidêncio	1688e4f3f0	gha: aks: Use D4s_v5 instance It's been pointed out that D4s_v5 instances are more powerful than the D4s_v3 ones, and have the very same price. With this in mind, let's switch to the newer machines. Fixes: #6606 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-05 16:02:17 +02:00
Fabiano Fidêncio	108d80a86d	gha: Add the ability to also test Dragonball With the changes proposed as part of this PR, an AKS cluster will be created but no tests will be performed. The reason we have to do this is because GitHub Actions will only run the tests using the workflows that are part of the target branch, instead of the using the ones coming from the PR, and we didn't find yet a way to work this around. Once this commit is in, we'll actually change the tests themselves (not the yaml files for the actions), as those will be the ones we want as the checkout action helps us on this case. Fixes: #6583 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-05 15:53:03 +02:00
Fabiano Fidêncio	2550d4462d	gha: build-kata-static-tarball: Only push to registry after merge `56331bd7bc` overlooked the fact that we mistakenly tried to push the build containers to the registry for a PR, rather than doing so only when the code is merged. As the workflow is now shared between different actions, let's introduce an input variable to specify which are the cases we actually need to perform a push to the registry. Fixes: #6592 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-05 13:57:26 +02:00
Fabiano Fidêncio	e81b8b8ee5	local-build: build-and-upload-payload is not quay.io specific Let's just print "to the registry" instead of printing "to quay.io", as the registry used is not tied to quay.io. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-05 12:54:44 +02:00
Fabiano Fidêncio	13929fc610	gha: publish-kata-deploy-payload: Improve registry login Let's only try to login to the registry that's being passed as an input argument. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-05 12:54:44 +02:00
Fabiano Fidêncio	41026f003e	gha: payload-after-push: Pass registry / repo as inputs We made registry / repo mandatory, but we only adapted that to the amd64 job. Let's fix it now and make sure this is also passed to the arm64 and s390x jobs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-05 12:54:44 +02:00
Fabiano Fidêncio	7855b43062	gha: ci-on-push: Adapt chained jobs to workflow_run As we're using the `workflow_run` event, the checkout action would pull the current target branch instead of the PR one. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-05 12:54:44 +02:00
Fabiano Fidêncio	3a760a157a	gha: ci-on-push: Adjust to using workflow_run The way previously used to get the PR's commit sha can only be used with `pull_request*` kind of events. Let's adapt it to the `workflow_run` now that we're using it. With this change we ended up dropping the PR number from the tarball suffix, as that's not straightforward to get and, to be honest, not a unique differentiator that would justify the effort. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-05 12:54:44 +02:00
Fabiano Fidêncio	a159ffdba7	gha: ci-on-push: Depend on Commit Message Check Let's make this workflow dependent of the commit message check, and only start it if the commit message check one passes. As a side effect, this allows us to run this specific workflow using secrets, without having to rely on `pull_request_target`. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-05 12:54:40 +02:00
Fabiano Fidêncio	8086c75f61	gha: Also run k8s tests on AKS with dragonball As already done for Cloud Hypervisor and QEMU, let's make sure we can run the AKS tests using dragonball. Fixes: #6583 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-04 10:58:47 +02:00
Fabiano Fidêncio	1c6d7cb0f7	Merge pull request #6589 from fidencio/topic/gha-k8s-use-ghcr-instead-of-quay gha: Use ghcr.io for the k8s CI	2023-04-04 10:48:16 +02:00
Zvonko Kaiser	fe86c08a63	tools: Avoid building the kernel twice Two different kernel build targets (build,install) have both instructions to build the kernel, hence it was executed twice. Install should only do install and build should only do build. Fixes: #6588 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-04-04 05:44:44 +00:00
Fabiano Fidêncio	3215860a47	gha: Set ci-on-push to run on `pull_request_target` This is less secure than running the PR on `pull_request`, and will require using an additional `ok-to-test` label to make sure someone deliberately ran the actions coming from a forked repo. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-03 20:50:36 +02:00
Fabiano Fidêncio	d17dfe4cdd	gha: Use ghcr.io for the k8s CI Let's switch to using the `ghcr.io` registry for the k8s CI, as this will save us some troubles on running the CI with PRs coming from forked repos. Fixes: #6587 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-04-03 15:52:33 +02:00
Fabiano Fidêncio	e1f972fb1d	Merge pull request #6568 from kata-containers/topic/add-k8s-tests-as-part-of-gha GHA | Switch "kubernetes tests" from jenkins to GitHub actions	2023-04-03 14:25:35 +02:00
Christophe de Dinechin	b661e0cf3f	rustjail: Add anyhow context for D-Bus connections In cases where the D-Bus connection fails, add a little additional context about the origin of the error. Fixes: 6561 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com> Suggested-by: Archana Shinde <archana.m.shinde@intel.com> Spell-checked-by: Greg Kurz <gkurz@redhat.com>	2023-04-03 14:09:34 +02:00
Fabiano Fidêncio	60c62c3b69	gha: Remove kata-deploy-test.yaml This workflow becomes redundant as we're already testing kubernetes using kata-deploy, and also testing it on AKS. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-31 21:55:41 +02:00
Fabiano Fidêncio	43894e9459	gha: Remove kata-deploy-push.yaml This becomes redundant now that its steps are covered as part of the `ci-on-push.yaml`. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-31 21:55:41 +02:00
Fabiano Fidêncio	cab9ca0436	gha: Add a CI pipeline for Kata Containers This is the very first step to replacing the Jenkins CI, and I've decided to start with an x86_64 approach only (although easily expansible for other arches as soon as they're ready to switch), and to start running our kubernetes tests (now running on AKS). Fixes: #6541 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-31 21:55:41 +02:00
Fabiano Fidêncio	53b526b6bd	gha: k8s: Add snippet to run k8s tests on aks clusters This will be shortly used as part of a newly created GitHub action which will replace our Jenkins CI. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-31 21:55:41 +02:00
Fabiano Fidêncio	c444c24bc5	gha: aks: Add snippets to create / delete aks clusters Those will be shortly used as part of a newly added GitHub action for testing k8s tests on Azure. They've been created using the secrets we already have exposed as part of our GitHub, and they follow a similar way to authenticate to Azure / create an AKS cluster as done in the `/test-kata-deploy` action. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-31 21:55:41 +02:00
Fabiano Fidêncio	11e0099fb5	tests: Move k8s tests to this repo The first part of simplifying things to have all our tests using GitHub actions is moving the k8s tests to this repo, as those will be the first vict^W targets to be migrated to GitHub actions. Those tests have been slightly adapted, mainly related to what they load / import, so they are more self-contained and do not require us bringing a lot of scripts from the tests repo here. A few scripts were also dropped along the way, as we no longer plan to deploy kubernetes as part of every single run, but rather assume there will always be k8s running whenever we land to run those tests. It's important to mention that a few tests were not added here: * k8s-block-volume: * k8s-file-volume: * k8s-volume: * k8s-ro-volume: These tests depend on some sort of volume being created on the kubernetes node where the test will run, and this won't fly as the tests will run from a GitHub runner, targetting a different machine where kubernetes will be running. * https://github.com/kata-containers/kata-containers/issues/6566 * k8s-hugepages: This test depends a whole lot on the host where it lands and right now we cannot assume anything about that anymore, as the tests will run from a GitHub runner, targetting a different machine where kubernetes will be running. * https://github.com/kata-containers/kata-containers/issues/6567 * k8s-expose-ip: This is simply hanging when running on AKS and has to be debugged in order to figure out the root cause of that, and then adapted to also work on AKS. * https://github.com/kata-containers/kata-containers/issues/6578 Till those issues are solved, we'll keep running a jenkins job with those tests to avoid any possible regression. Last but not least, I've decided to not keep the history when bringing those tests here, otherwise we'd end up polluting a lot the history of this repo, without any clear benefit on doing so. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-31 21:55:41 +02:00
David Esparza	5d89d08fc4	Merge pull request #6564 from GabyCT/topic/updateneturl docs: Update CNM url in networking document	2023-03-31 09:58:55 -06:00
Fabiano Fidêncio	73be4bd3f9	gha: Update actions for release.yaml checkout@v2 should not be used anymore, please, see: https://github.blog/changelog/2022-09-22-github-actions-all-actions-will-begin-running-on-node16-instead-of-node12/ Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-31 13:24:26 +02:00
Fabiano Fidêncio	d38d7fbf1a	gha: Remove code duplication from release.yaml We can easily re-use the newly added build-kata-static-tarball-*.yaml as part of the release.yaml file. By doing this we consolidate on how we build the components across our actions. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-31 13:24:26 +02:00
Fabiano Fidêncio	56331bd7bc	gha: Split payload-after-push-.yaml Let's split those actions into two different ones: * Build the kata-static tarball * Publish the kata-deploy payload We're doing this as, later in this series, we'll start taking advantage of both pieces. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-31 13:24:26 +02:00
Gabriela Cervantes	a552a1953a	docs: Update CNM url in networking document This PR updates the url for the Container Network Model in the network document. Fixes #6563 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-30 16:20:33 +00:00
Christophe de Dinechin	7796e6ccc6	rustjail: Fix minor grammatical error in function name Rename `unit_exist` function to `unit_exists` to match English grammar rule. Fixes: #6561 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2023-03-30 16:13:37 +02:00
Christophe de Dinechin	41fdda1d84	rustjail: Do not unwrap potential error with cgroup manager There can be an error while connecting to the cgroups manager, for example an `ENOENT` if a file is not found. Make sure that this is reported through the proper channels instead of causing a `panic()` that does not provide much information. Fixes: #6561 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com> Reported-by: Greg Kurz <gkurz@redhat.com>	2023-03-30 16:09:13 +02:00
Archana Shinde	07e49c63e1	Merge pull request #6257 from amshinde/kata-ctl-env kata-ctl: add function to get platform protection.	2023-03-29 11:55:07 -07:00
Archana Shinde	a914283ce0	kata-ctl: add function to get platform protection. This function checks for tdx, sev or snp protection on x86 platform. Fixes: #1000 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-03-28 15:40:25 -07:00
Fabiano Fidêncio	245ed2cecf	Merge pull request #6536 from gkurz/3.2.0-alpha0-branch-bump # Kata Containers 3.2.0-alpha0	2023-03-28 16:05:10 +02:00
Wainer Moschetta	d0f79e66b9	Merge pull request #6513 from fidencio/topic/use-kata-deploy-local-build-as-part-of-the-snap-stuff snap: Build the artefacts using kata-deploy	2023-03-28 09:59:31 -03:00
Miao Xia	0f73515561	runtime: add filter metrics with specific names The kata monitor metrics API returns a huge response when there are many containers or sandboxes, which makes it harder to focus on the metrics we need. Fixes: #6500 Signed-off-by: Miao Xia <xia.miao1@zte.com.cn>	2023-03-28 14:56:13 +08:00
Greg Kurz	4a246309ee	release: Kata Containers 3.2.0-alpha0 - nydus: upgrad to v2.2.0 - osbuilder: Add support for CBL-Mariner - kata-deploy: Fix bash semantics error - make only_kata work without -f - runtime-rs: ch: Implement confidential guest handling - qemu/arm64: disable image nvdimm once no firmware offered - static checks workflow improvements - A couple of kata-deploy fixes - agent: Bring in VFIO-AP device handling again - bugfix: set hostname in CreateSandboxRequest - packaging / kata-deploy builds: Add the ability to cache and consume cached components - versions: Update firecracker version - dependency: update cgroups-rs - Built-in Sandbox: add more unit tests for dragonball. Part 6 - runtime: add support for Hyper-V - runtime-rs: update load_config comment - Add support for ephemeral mounts to occupy entire sandbox's memory - runtime-rs: fix default kernel location and add more default config paths - Implement direct-volume commands handler for shim-mgmt - bugfix: modify tty_win info in runtime when handling ResizePtyRequest - bugfix: add get_ns_path API for Hypervisor - runtime-rs: add the missing default trait - packaging: Simplify get_last_modification() - utils: Make kata-manager.sh runs checks - dragonball: support pmu on aarch64 - docs: fix typo in key filename in AWS installation guide - backport rustjail systemd cgroup fix #6331 to 3.1 - main \| kata-deploy: Fix kata deploy arm64 image build error - workflows: Yet more fixes for publishing the kata-deploy payload after every PR merged - rustjail: fix cgroup handling in agent-init mode - runtime/Makefile: Fix install-containerd-shim-v2 dependency - fix wrong notes for func GetSandboxesStoragePathRust() - fix(runtime-rs): add exited state to ensure cleanup - runtime-rs: add oci hook support - utils: Remove kata-manager.sh cgroups v2 check - workflows: Fixes for the `payload-after-push` action - Dragonball: update dependencies - workflows: Do not install docker - workflows: Publish kata-deploy 
payload after a merge - src: Fixed typo mod.rs - actions: Use `git-diff` to get changes in kernel dir - agent: don't set permission of existing directory in copy_file - runtime: use filepath.Clean() to clean the mount path - Upgrade to Cloud Hypervisor v30.0 - feat(runtime): make static resource management consistent with 2.0 - osbuilder: Include minimal set of device nodes in ubuntu initrd - kata-ctl/exec: add new command exec to enter guest VM. - kernel: Add CONFIG_SEV_GUEST to SEV kernel config - runtime-rs: Improve Cloud Hypervisor config handling - virtiofsd: update to a valid path on ppc64le - runtime-rs: cleanup kata host share path - osbuilder: fix default build target in makefile - devguide: Add link to the contribution guidelines - kata-deploy: Ensure go binaries can run on Ubuntu 20.04 - dragonball: config_manager: preserve device when update - Revert "workflows: Push the builder image to quay.io" - Remove all remaining unsafe impl - kata-deploy: Fix building the kata static firecracker arm64 package occurred an error - shim-v2: Bump Ubuntu container image to 22.04 - packaging: Cache the container used to build the kata-deploy artefacts - utils: always check some dependencies. - versions: Use ubuntu as the default distro for the rootfs-image - github-action: Replace deprecated command with environment file - docs: Change the order of release step - runtime-rs: remove unnecessary Send/Sync trait implement - runtime-rs: Don't build on Power, don't break on Power. 
- runtime-rs: handle sys_dir bind volume - sandbox: set the dns for the sandbox - packaging/shim-v2: Only change the config if the file exists - runtime-rs: Add basic CH implementation - release: Revert kata-deploy changes after 3.1.0-rc0 release `8b008fc743` kata-deploy: fix bash semantics error `74ec38cf02` osbuilder: Add support for CBL-Mariner `ac58588682` runtime-rs: ch: Generate Cloud Hypervisor config for confidential guests `96555186b3` runtime-rs: ch: Honour debug setting `e3c2d727ba` runtime-rs: ch: clippy fix `ece5edc641` qemu/arm64: disable image nvdimm if no firmware offered `dd23f452ab` utils: renamed only_kata to skip_containerd `59c81ed2bb` utils: informed pre-check about only_kata `4f0887ce42` kata-deploy: fix install failing to chmod runtime-rs/bin/* `09c4828ac3` workflows: add missing artifacts on payload-after-push `fbf891fdff` packaging: Adapt `get_last_modification()` `82a04dbce1` local-build: Use cached VirtioFS when possible `3b99004897` local-build: Use cached shim v2 when possible `1b8c5474da` local-build: Use cached RootFS when possible `09ce4ab893` local-build: Use cached QEMU when possible `1e1c843b8b` local-build: Use cached Nydus when possible `64832ab65b` local-build: Use cached Kernel when possible `04fb52f6c9` local-build: Use cached Firecracker when possible `8a40f6f234` local-build: Use cached Cloud Hypervisor when possible `194d5dc8a6` tools: Add support for caching VirtioFS artefacts `a34272cf20` tools: Add support for caching shim v2 artefacts `7898db5f79` tools: Add support for caching RootFS artefacts `e90891059b` tools: Add support for caching QEMU artefacts `7aed8f8c80` tools: Add support for caching Nydus artefacts `cb4cbe2958` tools: Add support for caching Kernel artefacts `762f9f4c3e` tools: Add support for caching Firecracker artefacts `6b1b424fc7` tools: Add support for caching Cloud Hypervisor artefacts `08fe49f708` versions: Adjust kernel names to match kata-deploy build targets `99505c0f4f` versions: Update 
firecracker version `f4938c0d90` bugfix: set hostname `96baa83895` agent: Bring in VFIO-AP device handling again `f666f8e2df` agent: Add VFIO-AP device handling `b546eca26f` runtime: Generalize VFIO devices `4c527d00c7` agent: Rename VFIO handling to VFIO PCI handling `db89c88f4f` agent: Use cfg-if for s390x CCW `68a586e52c` agent: Use a constant for CCW root bus path `a8b55bf874` dependency: update cgroups-rs `97cdba97ea` runtime-rs: update load_config comment `974a5c22f0` runtime: add support for Hyper-V `40f4eef535` build: Use the correct kernel name `a6c67a161e` runtime: add support for ephemeral mounts to occupy entire sandbox memory `844bf053b2` runtime-rs: add the missing default trait `e7bca62c32` bugfix: modify tty_win info in runtime when handling ResizePtyRequest `30e235f0a1` runtime-rs: impl volume-resize trait for sandbox `e029988bc2` bugfix: add get_ns_path API for Hypervisor `42b8867148` runtime-rs: impl volume-stats trait for sandbox `462d4a1af2` workflows: static-checks: Free disk space before running checks `e68186d9af` workflows: static-checks: Set GOPATH only once `439ff9d4c4` tools/osbuilder/tests: Remove TRAVIS variable `43ce3f7588` packaging: Simplify get_last_modification() `33c5c49719` packaging: Move repo_root_dir to lib.sh `16e2c3cc55` agent: implement update_ephemeral_mounts api `3896c7a22b` protocol: add updateEphemeralMounts proto `23488312f5` agent: always use cgroupfs when running as init `8546387348` agent: determine value of use_systemd_cgroup before LinuxContainer::new() `736aae47a4` rustjail: print type of cgroup manager `dbae281924` workflows: Properly set the kata-tarball architecture `76b4591e2b` tools: Adjust the build-and-upload-payload.sh script `cd2aaeda2a` kata-deploy: Switch to using an ubuntu image `2d43e13102` docs: fix typo in AWS installation guide `760f78137d` dragonball: support pmu on aarch64 `9bc7bef3d6` kata-deploy: Fix path to the Dockerfile `78ba363f8e` kata-deploy: Use different images for s390x and aarch64 
`6267909501` kata-deploy: Allow passing BASE_IMAGE_{NAME,TAG} `3443f558a6` nydus: upgrad nydus to v2.2.0 `395645e1ce` runtime: hybrid-mode cause error in the latest nydusd `f8e44172f6` utils: Make kata-manager.sh runs checks `f31c79d210` workflows: static-checks: Remove TRAVIS_XXX variables `8030e469b2` fix(runtime-rs): add exited state to ensure cleanup `7d292d7fc3` workflows: Fix the path of imported workflows `e07162e79d` workflows: Fix action name `dd2713521e` Dragonball: update dependencies `bd1ed26c8d` workflows: Publish kata-deploy payload after a merge `fea7e8816f` runtime-rs: Fixed typo mod.rs `a9e2fc8678` runtime/Makefile: Fix install-containerd-shim-v2 dependency `b6880c60d3` logging: Correct the code notes `12cfad4858` runtime-rs: modify the transfer to oci::Hooks `828d467222` workflows: Do not install docker `4b8a5a1a3d` utils: Remove kata-manager.sh cgroups v2 check `2c4428ee02` runtime-rs: move pre-start hooks to sandbox_start `e80c9f7b74` runtime-rs: add StartContainer hook `977f281c5c` runtime-rs: add CreateContainer hook support `875f2db528` runtime-rs: add oci hook support `ecac3a9e10` docs: add design doc for Hooks `3ac6f29e95` runtime: clh: Re-generate the client code `262daaa2ef` versions: Upgrade to Cloud Hypervisor v30.0 `192df84588` agent: always use cgroupfs when running as init `b0691806f1` agent: determine value of use_systemd_cgroup before LinuxContainer::new() `dc86d6dac3` runtime: use filepath.Clean() to clean the mount path `c4ef5fd325` agent: don't set permission of existing directory `3483272bbd` runtime-rs: ch: Enable initrd usage `fbee6c820e` runtime-rs: Improve Cloud Hypervisor config handling `1bff1ca30a` kernel: Add CONFIG_SEV_GUEST to SEV kernel config Adding kernel config to sev case since it is needed for SNP and SNP will use the SEV kernel. 
Incrementing kernel config version to reflect changes `ad8968c8d9` rustjail: print type of cgroup manager `b4a1527aa6` kata-deploy: Fix static shim-v2 build on arm64 `2c4f8077fd` Revert "shim-v2: Bump Ubuntu container image to 22.04" `afaccf924d` Revert "workflows: Push the builder image to quay.io" `4c39c4ef9f` devguide: Add link to the contribution guidelines `76e926453a` osbuilder: Include minimal set of device nodes in ubuntu initrd `697ec8e578` kata-deploy: Fix kata static firecracker arm64 package build error `ced3c99895` dragonball: config_manager: preserve device when update `da8a6417aa` runtime-rs: remove all remaining unsafe impl `0301194851` dragonball: use crossbeam_channel in VmmService instead of mpsc::channel `9d78bf9086` shim-v2: Bump Ubuntu container image to 22.04 `3cfce5a709` utils: improved unsupported distro message. `919d19f415` feat(runtime): make static resource management consistent with 2.0 `b835c40bbd` workflows: Push the builder image to quay.io `781ed2986a` packaging: Allow passing a container builder to the scripts `45668fae15` packaging: Use existing image to build td-shim `e8c6bfbdeb` packaging: Use existing image to build td-shim `3fa24f7acc` packaging: Add infra to push the OVMF builder image `f076fa4c77` packaging: Use existing image to build OVMF `c7f515172d` packaging: Add infra to push the QEMU builder image `fb7b86b8e0` packaging: Use existing image to build QEMU `d0181bb262` packaging: Add infra to push the virtiofsd builder image `7c93428a18` packaging: Use existing image to build virtiofsd `8c227e2471` virtiofsd: Pass the expected toolchain to the build container `7ee00d8e57` packaging: Add infra to push the shim-v2 builder image `24767d82aa` packaging: Use existing image to build the shim-v2 `e84af6a620` virtiofsd: update to a valid path on ppc64le `6c3c771a52` packaging: Add infra to push the kernel builder image `b9b23112bf` packaging: Use existing image to build the kernel `869827d77f` packaging: Add push_to_registry() 
`e69a6f5749` packaging: Add get_last_modification() `6c05e5c67a` packaging: Add and export BUILDER_REGISTRY `1047840cf8` utils: always check some dependencies. `95e3364493` runtime-rs: remove unnecessary Send/Sync trait implement `a96ba99239` actions: Use `git-diff` to get changes in kernel dir `619ef54452` docs: Change the order of release step `a161d11920` versions: Use ubuntu as the default distro for the rootfs-image `be40683bc5` runtime-rs: Add a generic powerpc64le-options.mk `47c058599a` packaging/shim-v2: Install the target depending on the arch/libc `b582c0db86` kata-ctl/exec: add new command exec to enter guest VM. `07802a19dc` runtime-rs: handle sys_dir bind volume `04e930073c` sandbox: set the dns for the sandbox `32ebe1895b` agent: fix the issue of creating the dns file `44aaec9020` github-action: Replace deprecated command with environment file `a68c5004f8` packaging/shim-v2: Only change the config if the file exists `ee76b398b3` release: Revert kata-deploy changes after 3.1.0-rc0 release `bbc733d6c8` docs: runtime-rs: Add CH status details `37b594c0d2` runtime-rs: Add basic CH implementation `545151829d` kata-types: Add Cloud Hypervisor (CH) definitions `2dd2421ad0` runtime-rs: cleanup kata host share path `0a21ad78b1` osbuilder: fix default build target in makefile `9a01d4e446` dragonball: add more unit test for virtio-blk device. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-03-28 08:40:06 +02:00
Bin Liu	75987aae72	Merge pull request #6408 from jongwu/nydus_rm_hybrid nydus: upgrad to v2.2.0	2023-03-28 11:07:56 +08:00
Fabiano Fidêncio	4a95375dc8	Merge pull request #6465 from dallasd1/mariner-rootfs osbuilder: Add support for CBL-Mariner	2023-03-27 22:18:31 +02:00
Fabiano Fidêncio	43dd4440f4	snap: Build the artefacts using kata-deploy Our CI and release process are currently taking advantage of the kata-deploy local build scripts to build the artefacts. Having snap doing the same is the next logical step, and it will also help to reduce, by a lot, the CI time as we only build the components that a PR is touching (otherwise we just pull the cached component). Fixes: #6514 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-27 17:34:43 +02:00
Fabiano Fidêncio	293119df78	Merge pull request #6515 from xyz-li/main kata-deploy: Fix bash semantics error	2023-03-24 13:18:10 +01:00
Chelsea Mafrica	bbc699ddd8	Merge pull request #6419 from gabevenberg/containerd-pre-check make only_kata work without -f	2023-03-23 10:02:32 -07:00
xyz-li	8b008fc743	kata-deploy: fix bash semantics error The argument of return must be numeric. Fixes: #6521 Signed-off-by: xyz-li <hui0787411@163.com>	2023-03-23 22:47:54 +08:00
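For context, a minimal sketch of the kind of error this commit message describes: bash's `return` builtin accepts only a numeric exit status, so a function that needs to hand back a string has to print it instead. The function names and runtime names below are hypothetical illustrations, not code from kata-deploy.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from kata-deploy): signal success or failure
# through a numeric exit status, the only thing `return` accepts.
is_supported_runtime() {
    local runtime="$1"
    case "${runtime}" in
        containerd|crio) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: pass a string back on stdout rather than via `return`.
runtime_label() {
    if is_supported_runtime "$1"; then
        echo "supported"
    else
        echo "unsupported"
    fi
}

runtime_label containerd
runtime_label docker
```

Callers can branch on the exit status with `if is_supported_runtime ...` and capture strings with `$(runtime_label ...)`.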
James O. D. Hunt	da676872b1	Merge pull request #6439 from jodh-intel/runtime-rs-ch-confidential-guest runtime-rs: ch: Implement confidential guest handling	2023-03-23 13:01:47 +00:00
Dallas Delaney	74ec38cf02	osbuilder: Add support for CBL-Mariner Add osbuilder support to build a rootfs and image based on the CBL-Mariner Linux distro Fixes: #6462 Signed-off-by: Dallas Delaney <dadelan@microsoft.com>	2023-03-22 11:45:32 -07:00
James O. D. Hunt	ac58588682	runtime-rs: ch: Generate Cloud Hypervisor config for confidential guests This change provides a preliminary implementation for the Cloud Hypervisor (CH) feature ([currently disabled](https://github.com/kata-containers/kata-containers/pull/6201)) to allow it to generate the CH configuration for handling confidential guests. This change also introduces concrete errors using the `thiserror` crate (see `src/runtime-rs/crates/hypervisor/ch-config/src/errors.rs`) and a lot of unit tests for the conversion code that generates the CH configuration from the generic Hypervisor configuration. Fixes: #6430. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-03-22 14:38:38 +00:00
James O. D. Hunt	96555186b3	runtime-rs: ch: Honour debug setting Enable Cloud Hypervisor debug based on the specified configuration rather than hard-coding debug to be disabled. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-03-22 14:38:38 +00:00
James O. D. Hunt	e3c2d727ba	runtime-rs: ch: clippy fix Simplify the code to keep rust's `clippy` happy. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-03-22 14:38:38 +00:00
James O. D. Hunt	f06f72b5e9	Merge pull request #6467 from jongwu/qemu-uefi-path qemu/arm64: disable image nvdimm once no firmware offered	2023-03-22 08:43:01 +00:00
Steve Horsman	adaabd141a	Merge pull request #6406 from jepio/jepio/static-checks-workflow-improvements static checks workflow improvements	2023-03-20 17:12:54 +00:00
Wainer Moschetta	20da7f3ec8	Merge pull request #6495 from wainersm/fix-kata-deploy-ci A couple of kata-deploy fixes	2023-03-20 13:48:02 -03:00
Fabiano Fidêncio	2fe0733dcb	Merge pull request #4582 from BbolroC/vfio-ap agent: Bring in VFIO-AP device handling again	2023-03-20 11:43:13 +01:00
Jianyong Wu	ece5edc641	qemu/arm64: disable image nvdimm if no firmware offered For now, image nvdimm on qemu/arm64 depends on UEFI/ACPI, so if there is no firmware offered, it should be disabled. Fixes: #6468 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-03-20 18:03:05 +08:00
Zhongtao Hu	1e8005ff88	Merge pull request #6477 from openanolis/runtime-rs-hostname bugfix: set hostname in CreateSandboxRequest	2023-03-20 12:43:29 +08:00
Gabe Venberg	dd23f452ab	utils: renamed only_kata to skip_containerd Renamed for greater clarity as to what that flag does. Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>	2023-03-17 16:09:45 -05:00
Gabe Venberg	59c81ed2bb	utils: informed pre-check about only_kata Pass the only_kata variable through to pre_check so that, when only_kata is set, the install is not aborted if containerd is already installed. Fixes: #6385 Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>	2023-03-17 15:58:57 -05:00
Fabiano Fidêncio	96252db787	Merge pull request #6481 from fidencio/topic/cache-artefacts packaging / kata-deploy builds: Add the ability to cache and consume cached components	2023-03-17 20:54:42 +01:00
Wainer dos Santos Moschetta	4f0887ce42	kata-deploy: fix install failing to chmod runtime-rs/bin/* The kata-deploy install method tried to `chmod +x /opt/kata/runtime-rs/bin/*`, but /opt/kata/runtime-rs/bin/ does not always exist. For example, the s390x payload does not build the kernel-dragonball-experimental artifacts. So let's ensure the directory exists before issuing the command. Fixes #6494 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-03-17 16:09:21 -03:00
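The guard described in this commit message can be sketched as follows. This is an illustration under assumptions, not the kata-deploy script itself: `make_bins_executable` is a hypothetical helper, and the demonstration uses a temporary directory instead of `/opt/kata/runtime-rs/bin/`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mark every entry in a directory as executable, but
# do nothing (and succeed) when the directory is missing or empty.
make_bins_executable() {
    local bin_dir="$1"
    local f
    if [ ! -d "${bin_dir}" ]; then
        return 0
    fi
    for f in "${bin_dir}"/*; do
        # With no matches the glob stays literal, so check before chmod.
        if [ -e "${f}" ]; then
            chmod +x "${f}"
        fi
    done
    return 0
}

# Demonstration in a scratch directory (assumed layout, for illustration).
demo_dir="$(mktemp -d)"
touch "${demo_dir}/example-shim"
make_bins_executable "${demo_dir}"
make_bins_executable "${demo_dir}/missing-subdir"
rm -rf "${demo_dir}"
```

Returning 0 for a missing directory keeps the install step from failing on payloads that do not ship those binaries.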
Wainer dos Santos Moschetta	09c4828ac3	workflows: add missing artifacts on payload-after-push The kata-deploy-ci payloads for amd64 and arm64 were missing the shim-v2 and kernel-dragonball-experimental artifacts. Fixes #6493 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-03-17 15:31:21 -03:00
Fabiano Fidêncio	fbf891fdff	packaging: Adapt `get_last_modification()` The function is returning "" when called from the script used to cache the artefacts and one difference noted between this version and the already working one from the CCv0 is that we make sure to `pushd ${repo_root_dir}` in the CCv0 version. Let's give it a try here and see if it solves the issue. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-17 16:27:34 +01:00
Fabiano Fidêncio	82a04dbce1	local-build: Use cached VirtioFS when possible As we've added the support for caching components, let's use them whenever those are available. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 16:27:34 +01:00
Fabiano Fidêncio	3b99004897	local-build: Use cached shim v2 when possible As we've added the support for caching components, let's use them whenever those are available. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 16:27:34 +01:00
Fabiano Fidêncio	1b8c5474da	local-build: Use cached RootFS when possible As we've added the support for caching components, let's use them whenever those are available. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 16:27:34 +01:00
Fabiano Fidêncio	09ce4ab893	local-build: Use cached QEMU when possible As we've added the support for caching components, let's use them whenever those are available. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 16:27:34 +01:00
Fabiano Fidêncio	1e1c843b8b	local-build: Use cached Nydus when possible As we've added the support for caching components, let's use them whenever those are available. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 16:27:34 +01:00
Fabiano Fidêncio	64832ab65b	local-build: Use cached Kernel when possible As we've added the support for caching components, let's use them whenever those are available. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 16:27:34 +01:00
Fabiano Fidêncio	04fb52f6c9	local-build: Use cached Firecracker when possible As we've added the support for caching components, let's use them whenever those are available. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 16:27:34 +01:00
Fabiano Fidêncio	8a40f6f234	local-build: Use cached Cloud Hypervisor when possible As we've added the support for caching components, let's use them whenever those are available. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 16:27:34 +01:00
Fabiano Fidêncio	194d5dc8a6	tools: Add support for caching VirtioFS artefacts Let's add support for caching VirtioFS artefacts that are generated using the kata-deploy local-build scripts. Right now those are not used, but we'll switch to using them very soon as part of upcoming changes of how we build the components we test in our CI. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 11:43:01 +01:00
Fabiano Fidêncio	a34272cf20	tools: Add support for caching shim v2 artefacts Let's add support for caching shim v2 artefacts that are generated using the kata-deploy local-build scripts. Right now those are not used, but we'll switch to using them very soon as part of upcoming changes of how we build the components we test in our CI. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 11:43:01 +01:00
Fabiano Fidêncio	7898db5f79	tools: Add support for caching RootFS artefacts Let's add support for caching RootFS artefacts that are generated using the kata-deploy local-build scripts. Right now those are not used, but we'll switch to using them very soon as part of upcoming changes of how we build the components we test in our CI. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 11:43:01 +01:00
Fabiano Fidêncio	e90891059b	tools: Add support for caching QEMU artefacts Let's add support for caching QEMU artefacts that are generated using the kata-deploy local-build scripts. Right now those are not used, but we'll switch to using them very soon as part of upcoming changes of how we build the components we test in our CI. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 11:43:01 +01:00
Fabiano Fidêncio	7aed8f8c80	tools: Add support for caching Nydus artefacts Let's add support for caching Nydus artefacts that are generated using the kata-deploy local-build scripts. Right now those are not used, but we'll switch to using them very soon as part of upcoming changes of how we build the components we test in our CI. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 11:43:01 +01:00
Fabiano Fidêncio	cb4cbe2958	tools: Add support for caching Kernel artefacts Let's add support for caching Kernel artefacts that are generated using the kata-deploy local-build scripts. Right now those are not used, but we'll switch to using them very soon as part of upcoming changes of how we build the components we test in our CI. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 11:43:01 +01:00
Fabiano Fidêncio	762f9f4c3e	tools: Add support for caching Firecracker artefacts Let's add support for caching Firecracker artefacts that are generated using the kata-deploy local-build scripts. Right now those are not used, but we'll switch to using them very soon as part of upcoming changes of how we build the components we test in our CI. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 11:28:56 +01:00
Fabiano Fidêncio	6b1b424fc7	tools: Add support for caching Cloud Hypervisor artefacts Let's add support for caching Cloud Hypervisor artefacts that are generated using the kata-deploy local-build scripts. Right now those are not used, but we'll switch to using them very soon as part of upcoming changes of how we build the components we test in our CI. Fixes: #6480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-17 11:28:56 +01:00
Fabiano Fidêncio	08fe49f708	versions: Adjust kernel names to match kata-deploy build targets Let's adjust the kernel names in versions.yaml so those can match the names used as part of the kata-deploy local build scripts. Right now this doesn't bring any benefit nor drawback, but it'll make our life easier later on in this same series. Depends-on: github.com/kata-containers/tests#5534 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-17 11:28:56 +01:00
Fabiano Fidêncio	d281d1b90a	Merge pull request #6483 from GabyCT/topic/updatefcv versions: Update firecracker version	2023-03-17 10:37:22 +01:00
Gabriela Cervantes	99505c0f4f	versions: Update firecracker version This PR updates the firecracker version being used in kata containers versions.yaml. The changes in version 1.3.1 are: Added: - Introduced T2CL (Intel) and T2A (AMD) CPU templates to provide instruction set feature parity between Intel and AMD CPUs when using these templates. - Added Graviton3 support (c7g instance type). Changed: - Improved error message when invalid network backend provided. - Improved TCP throughput by between 5% and 15% (depending on CPU) by using scatter-gather I/O in the net device's TX path. - Upgraded Rust toolchain from 1.64.0 to 1.66.0. - Made seccompiler output bit-reproducible. Fixed: - Fixed feature flags in T2 CPU template on Intel Ice Lake. Fixes #6482 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-03-16 17:34:33 +00:00
Yushuo	f4938c0d90	bugfix: set hostname Setting hostname according to the spec. Fixes: #6247 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2023-03-16 17:16:06 +08:00
Hyounggyu Choi	96baa83895	agent: Bring in VFIO-AP device handling again This PR is a continuing work for (kata-containers#3679). This generalizes the previous VFIO device handling which only focuses on PCI to include AP (IBM Z specific). Fixes: kata-containers#3678 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-03-16 18:14:12 +09:00
Greg Kurz	e6e719699f	Merge pull request #6471 from etrunko/main dependency: update cgroups-rs	2023-03-16 08:01:07 +01:00
QuanweiZhou	56c63a9b1c	Merge pull request #6186 from wllenyj/dragonball-ut-6 Built-in Sandbox: add more unit tests for dragonball. Part 6	2023-03-16 11:02:05 +08:00
Jakob Naucke	f666f8e2df	agent: Add VFIO-AP device handling Initial VFIO-AP support (#578) was simple, but somewhat hacky; a different code path would be chosen for performing the hotplug, and agent-side device handling was bound to knowing the assigned queue numbers (APQNs) through some other means; plus the code for awaiting them was written for the Go agent and never released. This code also artificially increased the hotplug timeout to wait for the (relatively expensive, thus limited to 5 seconds at the quickest) AP rescan, which is impractical for e.g. common k8s timeouts. Since then, the general handling logic was improved (#1190), but it assumed PCI in several places. In the runtime, introduce and parse AP devices. Annotate them as such when passing to the agent, and include information about the associated APQNs. The agent awaits the passed APQNs through uevents and triggers a rescan directly. Fixes: #3678 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2023-03-16 10:07:48 +09:00
Jakob Naucke	b546eca26f	runtime: Generalize VFIO devices Generalize VFIO devices to allow for adding AP in the next patch. The logic for VFIOPciDeviceMediatedType() has been changed and IsAPVFIOMediatedDevice() has been removed. The rationale for the removal is: - VFIODeviceMediatedType is divided into 2 subtypes for AP and PCI - Logic of checking a subtype of mediated device is included in GetVFIODeviceType() - VFIOPciDeviceMediatedType() can simply fulfill the device addition based on a type categorized by GetVFIODeviceType() Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2023-03-16 10:06:37 +09:00
Jakob Naucke	4c527d00c7	agent: Rename VFIO handling to VFIO PCI handling e.g., split_vfio_option is PCI-specific and should instead be named split_vfio_pci_option. This mutually affects the runtime, most notably how the labels are named for the agent. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2023-03-16 07:43:39 +09:00
Jakob Naucke	db89c88f4f	agent: Use cfg-if for s390x CCW Uses fewer lines in upcoming VFIO-AP support. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2023-03-16 07:43:39 +09:00
Jakob Naucke	68a586e52c	agent: Use a constant for CCW root bus path The code used a function, as PCI does, but this is not necessary. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2023-03-16 07:43:39 +09:00
Fabiano Fidêncio	814d07af58	Merge pull request #6463 from sprt/sprt/mshv-compat runtime: add support for Hyper-V	2023-03-15 18:03:25 +01:00
Eduardo Lima (Etrunko)	a8b55bf874	dependency: update cgroups-rs Huge pages failure with cgroups v2. https://github.com/kata-containers/cgroups-rs/issues/112 Fixes: #6470 Signed-off-by: Eduardo Lima (Etrunko) <etrunko@redhat.com>	2023-03-15 12:21:12 -03:00
Chao Wu	530b2a7685	Merge pull request #6458 from openanolis/chao/update_comments runtime-rs: update load_config comment	2023-03-15 19:32:07 +08:00
Chao Wu	97cdba97ea	runtime-rs: update load_config comment Since the shimv2 create task option is already implemented, we need to update the corresponding comments. The ordering is also updated to match the code. Fixes: #3961 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-03-15 14:44:47 +08:00
Eric Ernst	dc42f0a33b	Merge pull request #6411 from wlan0/empty-dir Add support for ephemeral mounts to occupy entire sandbox's memory	2023-03-13 20:07:27 -07:00
Henry Beberman	974a5c22f0	runtime: add support for Hyper-V This adds /dev/mshv to the list of sandbox devices so that VMMs can create Hyper-V VMs. In our testing, this also doesn't error out in case /dev/mshv isn't present. Fixes #6454. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-03-13 17:13:51 -07:00
Fabiano Fidêncio	ab0bd7a1ee	Merge pull request #6292 from fidencio/topic/runtime-rs-small-fixes runtime-rs: fix default kernel location and add more default config paths	2023-03-13 16:53:30 +01:00
Fabiano Fidêncio	40f4eef535	build: Use the correct kernel name When calling `MAKE_KERNEL_NAME` we're considering the default kernel name will be `vmlinux.container` or `vmlinuz.container`, which is not the case as the runtime-rs, when used with dragonball, relies on the `vmlinu[zx]-dragonball-experimental.container` kernel. Other hypervisors will have to introduce a similar `MAKE_KERNEL_NAME_${HYPERVISOR}` to adapt this to the kernel they want to use, similarly to what's already done for the go runtime. By doing this we also ensure that no changes in the configuration file will be required to run runtime-rs, with dragonball, as part of our CI or as part of kata-deploy. Fixes: #6290 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-13 13:47:20 +01:00
James O. D. Hunt	ae9be1d94b	Merge pull request #5840 from tzY15368/feat-runtimers-direct-vol Implement direct-volume commands handler for shim-mgmt	2023-03-13 07:58:40 +00:00
Chelsea Mafrica	4b877b0a3e	Merge pull request #6426 from openanolis/runtime-rs-resize-pty bugfix: modify tty_win info in runtime when handling ResizePtyRequest	2023-03-10 14:08:41 -08:00
Sidhartha Mani	a6c67a161e	runtime: add support for ephemeral mounts to occupy entire sandbox memory On hotplug of memory as containers are started, remount all ephemeral mounts with size option set to the total sandbox memory Fixes: #6417 Signed-off-by: Sidhartha Mani <sidhartha_mani@apple.com>	2023-03-10 13:36:02 -08:00
James O. D. Hunt	99a4eaa898	Merge pull request #6443 from openanolis/runtime-rs-get-netns bugfix: add get_ns_path API for Hypervisor	2023-03-10 20:16:22 +00:00
Fabiano Fidêncio	44bc222ca4	Merge pull request #5578 from Richardhongyu/main runtime-rs: add the missing default trait	2023-03-10 18:01:43 +01:00
Li Hongyu	844bf053b2	runtime-rs: add the missing default trait Some structs in the runtime-rs don't implement Default trait. This commit adds the missing Default. Fixes: #5463 Signed-off-by: Li Hongyu <lihongyu1999@bupt.edu.cn>	2023-03-10 08:19:56 +00:00
Yushuo	e7bca62c32	bugfix: modify tty_win info in runtime when handling ResizePtyRequest Currently, we only create the new exec process in the runtime, which causes an error when the following requests need to be handled: - Task: exec process - Task: resize process pty - ... The agent does not call do_exec_process when we handle ExecProcess, so no process information can be found in the guest when we handle ResizeProcessPty, and an error is reported. In this commit, the handling is changed as follows: * Modify the process tty_win information in the runtime * If the exec process is not running, just return; the actual pty resize happens at start_process Fixes: #6248 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2023-03-10 14:33:51 +08:00
Tingzhou Yuan	30e235f0a1	runtime-rs: impl volume-resize trait for sandbox Implements resize-volume handlers in shim-mgmt, trait for sandbox and add RPC calls to agent. Note the actual rpc handler for the resize request is currently not implemented, refer to issue #3694. Fixes #5369 Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>	2023-03-10 01:27:06 -05:00
Yushuo	e029988bc2	bugfix: add get_ns_path API for Hypervisor For external hypervisors (qemu, cloud-hypervisor, ...), the namespace in which they launch the VM is different from that of the internal hypervisor (dragonball). When we run the CreateContainer hook, we rely on the netns path, so we add a get_ns_path API. Fixes: #6442 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2023-03-10 13:57:00 +08:00
Tingzhou Yuan	42b8867148	runtime-rs: impl volume-stats trait for sandbox Implements get-volume-stats trait for sandbox, handler for shim-mgmt and add RPC calls to agent. Also added type conversions in trans.rs Fixes #5369 Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>	2023-03-10 00:48:02 -05:00
Jeremi Piotrowski	462d4a1af2	workflows: static-checks: Free disk space before running checks We've been seeing the 'sudo make test' job occasionally run out of space in /tmp, which is part of the root filesystem. Removing dotnet and `AGENT_TOOLSDIRECTORY` frees around 10GB of space and in my tests the job still has 13GB of space left after running. Fixes: #6401 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-03-09 13:30:09 +01:00
Jeremi Piotrowski	e68186d9af	workflows: static-checks: Set GOPATH only once {{ runner.workspace }}/kata-containers and {{ github.workspace }} resolve to the same value, but they're being used multiple times in the workflow. Remove multiple definitions and define the GOPATH var at job level once. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-03-09 13:30:09 +01:00
Jeremi Piotrowski	439ff9d4c4	tools/osbuilder/tests: Remove TRAVIS variable The last remaining user of the TRAVIS variable in this repo is tools/osbuilder/tests and it is only used to skip spinning up VMs. Travis didn't support virtualization and the same is true for github actions hosted runners. Replace the variable with KVM_MISSING and determine availability of /dev/kvm at runtime. TRAVIS is also used by '.ci/setup.sh' in kata-containers/tests to reduce the set of dependencies that gets installed, but this is also in the process of being removed. Fixes: #3544 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-03-09 13:29:49 +01:00
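A minimal sketch of deriving `KVM_MISSING` from the presence of `/dev/kvm` at runtime, as this commit message describes. The helper name `kvm_missing`, its optional path argument, and the `"true"`/`"false"` values are assumptions made for illustration; the actual osbuilder test scripts may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "true" when the KVM device node is absent and
# "false" when it is present. The optional path argument exists only so the
# check can be exercised on machines without /dev/kvm.
kvm_missing() {
    local kvm_dev="${1:-/dev/kvm}"
    if [ -e "${kvm_dev}" ]; then
        echo "false"
    else
        echo "true"
    fi
}

KVM_MISSING="$(kvm_missing)"
export KVM_MISSING

if [ "${KVM_MISSING}" = "true" ]; then
    echo "KVM not available: skipping tests that boot a VM"
fi
```

Checking the device node rather than a CI-specific variable means the same script behaves correctly on any runner, hosted or local.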
Christophe de Dinechin	7566a7eae4	Merge pull request #6432 from fidencio/topic/simplify-get-last-modification packaging: Simplify get_last_modification()	2023-03-09 10:57:58 +01:00
Fabiano Fidêncio	43ce3f7588	packaging: Simplify get_last_modification() There's no need to pass repo_root_dir to get_last_modification() as the variable used everywhere is exported from that very same file. Fixes: #6431 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-08 21:22:03 +01:00
Fabiano Fidêncio	33c5c49719	packaging: Move repo_root_dir to lib.sh This is used in several parts of the code, and can have a single declaration as part of the `lib.sh` file, which is already imported by all the places where it's used. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-08 21:10:53 +01:00
James O. D. Hunt	614d1817ce	Merge pull request #6410 from tg5788re/kata-manager-use-runtime-checks utils: Make kata-manager.sh runs checks	2023-03-08 09:55:03 +00:00
Chao Wu	fef268a7de	Merge pull request #6413 from xuejun-xj/xuejun/pmu dragonball: support pmu on aarch64	2023-03-08 14:24:31 +08:00
Steve Horsman	cc1821fb8b	Merge pull request #6409 from Sig00rd/patch-1 docs: fix typo in key filename in AWS installation guide	2023-03-07 15:19:46 +00:00
Fabiano Fidêncio	861552c305	Merge pull request #6414 from jepio/jepio/backport-3.1-rustjail-systemd-cgroup-fix-6331 backport rustjail systemd cgroup fix #6331 to 3.1	2023-03-07 12:51:08 +01:00
Sidhartha Mani	16e2c3cc55	agent: implement update_ephemeral_mounts api - implement update_ephemeral_mounts rpc - for each mountpoint passed in, remount it with new options Signed-off-by: Sidhartha Mani <sidhartha_mani@apple.com>	2023-03-06 13:44:14 -08:00
Sidhartha Mani	3896c7a22b	protocol: add updateEphemeralMounts proto - adds a new rpc call to the agent service named `updateEphemeralMounts` - this call takes a list of grpc.Storage objects Signed-off-by: Sidhartha Mani <sidhartha_mani@apple.com>	2023-03-06 13:43:47 -08:00
Jeremi Piotrowski	23488312f5	agent: always use cgroupfs when running as init The logic to decide which cgroup driver is used is currently based on the cgroup path that the host provides. This requires host and guest to use the same cgroup driver. If the guest uses kata-agent as init, then systemd can't be used as the cgroup driver. If the host requests a systemd cgroup, this currently results in a rustjail panic: thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: I/O error: No such file or directory (os error 2) Caused by: No such file or directory (os error 2)', rustjail/src/cgroups/systemd/manager.rs:44:51 stack backtrace: 0: 0x7ff0fe77a793 - std::backtrace_rs::backtrace::libunwind::trace::h8c197fa9a679d134 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5 1: 0x7ff0fe77a793 - std::backtrace_rs::backtrace::trace_unsynchronized::h9ee19d58b6d5934a at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5 2: 0x7ff0fe77a793 - std::sys_common::backtrace::_print_fmt::h4badc450600fc417 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:65:5 3: 0x7ff0fe77a793 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::had334ddb529a2169 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:44:22 4: 0x7ff0fdce815e - core::fmt::write::h1aa7694f03e44db2 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/fmt/mod.rs:1209:17 5: 0x7ff0fe74e0c4 - std::io::Write::write_fmt::h61b2bdc565be41b5 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/io/mod.rs:1682:15 6: 0x7ff0fe77cd3f - std::sys_common::backtrace::_print::h4ec69798b72ff254 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:47:5 7: 0x7ff0fe77cd3f - std::sys_common::backtrace::print::h0e6c02048dec3c77 
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:34:9 8: 0x7ff0fe77c93f - std::panicking::default_hook::{{closure}}::hcdb7e705dc37ea6e at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:267:22 9: 0x7ff0fe77d9b8 - std::panicking::default_hook::he03a933a0f01790f at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:286:9 10: 0x7ff0fe77d9b8 - std::panicking::rust_panic_with_hook::he26b680bfd953008 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:688:13 11: 0x7ff0fe77d482 - std::panicking::begin_panic_handler::{{closure}}::h559120d2dd1c6180 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:579:13 12: 0x7ff0fe77d3ec - std::sys_common::backtrace::__rust_end_short_backtrace::h36db621fc93b005a at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:137:18 13: 0x7ff0fe77d3c1 - rust_begin_unwind at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5 14: 0x7ff0fda52ee2 - core::panicking::panic_fmt::he7679b415d25c5f4 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14 15: 0x7ff0fda53182 - core::result::unwrap_failed::hb71caff146724b6b at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:1791:5 16: 0x7ff0fe5bd738 - <rustjail::cgroups::systemd::manager::Manager as rustjail::cgroups::Manager>::apply::hd46958d9d807d2ca 17: 0x7ff0fe606d80 - <rustjail::container::LinuxContainer as rustjail::container::BaseContainer>::start::{{closure}}::h1de806d91fcb878f 18: 0x7ff0fe604a76 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h1749c148adcc235f 19: 0x7ff0fdc0c992 - kata_agent::rpc::AgentService::do_create_container::{{closure}}::{{closure}}::hc1b87a15dfdf2f64 20: 0x7ff0fdb80ae4 - <core::future::from_generator::GenFuture<T> as 
core::future::future::Future>::poll::h846a8c9e4fb67707 21: 0x7ff0fe3bb816 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h53de16ff66ed3972 22: 0x7ff0fdb519cb - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h1cbece980286c0f4 23: 0x7ff0fdf4019c - <tokio::future::poll_fn::PollFn<F> as core::future::future::Future>::poll::hc8e72d155feb8d1f 24: 0x7ff0fdfa5fd8 - tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut::h0a407ffe2559449a 25: 0x7ff0fdf033a1 - tokio::runtime::task::raw::poll::h1045d9f1db9742de 26: 0x7ff0fe7a8ce2 - tokio::runtime::scheduler::multi_thread::worker::Context::run_task::h4924ae3464af7fbd 27: 0x7ff0fe7afb85 - tokio::runtime::task::raw::poll::h5c843be39646b833 28: 0x7ff0fe7a05ee - std::sys_common::backtrace::__rust_begin_short_backtrace::ha7777c55b98a9bd1 29: 0x7ff0fe7a9bdb - core::ops::function::FnOnce::call_once{{vtable.shim}}::h27ec83c953360cdd 30: 0x7ff0fe7801d5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hed812350c5aef7a8 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9 31: 0x7ff0fe7801d5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hc7df8e435a658960 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9 32: 0x7ff0fe7801d5 - std::sys::unix::thread::Thread::new::thread_start::h575491a8a17dbb33 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys/unix/thread.rs:108:17 Forward the value of "init_mode" to AgentService, so that we can force cgroupfs when systemd is unavailable. Fixes: #5779 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-03-06 20:34:21 +01:00
Jeremi Piotrowski	8546387348	agent: determine value of use_systemd_cgroup before LinuxContainer::new() Right now LinuxContainer::new() gets passed a CreateOpts struct, but then modifies the use_systemd_cgroup field inside that struct. Pull the cgroups path parsing logic into do_create_container, so that CreateOpts can be immutable in LinuxContainer::new. This is just moving things around, there should be no functional changes. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-03-06 20:34:21 +01:00
Jeremi Piotrowski	736aae47a4	rustjail: print type of cgroup manager Since the cgroup manager is wrapped in a dyn now, the print in LinuxContainer::new has been useless and just says "CgroupManager". Extend the Debug trait for 'dyn Manager' to print the type of the cgroup manager so that it's easier to debug issues. Fixes: #5779 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-03-06 20:34:21 +01:00
Fabiano Fidêncio	0749657c73	Merge pull request #6359 from singhwang/main main | kata-deploy: Fix kata deploy arm64 image build error	2023-03-06 16:48:03 +01:00
Fabiano Fidêncio	dbae281924	workflows: Properly set the kata-tarball architecture Let's make sure the kata-tarball architecture uploaded / downloaded / used is exactly the same one that we need as part of the architecture we're using to generate the image. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-06 13:18:51 +01:00
Fabiano Fidêncio	76b4591e2b	tools: Adjust the build-and-upload-payload.sh script Now that we've switched the base container image to using Ubuntu instead of CentOS, we don't need any kind of extra logic to correctly build the image for different architectures, as Ubuntu is a multi-arch image that supports all the architectures we're targeting. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-06 13:18:51 +01:00
SinghWang	cd2aaeda2a	kata-deploy: Switch to using an ubuntu image Let's make sure we use a multi-arch image for building kata-deploy. A few changes were also added in order to get systemd working inside the kata-deploy image, due to the switch from CentOS to Ubuntu. Fixes: #6358 Signed-off-by: SinghWang <wangxin_0611@126.com>	2023-03-06 13:18:51 +01:00
Szymon Fugas	2d43e13102	docs: fix typo in AWS installation guide Fixes referring to previously created key file with .pen extension instead of .pem. Fixes: #6412 Signed-off-by: Sig00rd <sfugas@virtuslab.com>	2023-03-06 13:18:08 +01:00
xuejun-xj	760f78137d	dragonball: support pmu on aarch64 This commit adds support for pmu virtualization on aarch64. The initialization of pmu is in the following order: 1. Receive pmu parameter(vpmu_feature) from runtime-rs to determine the VpmuFeatureLevel. 2. Judge whether to initialize pmu devices and add pmu device node into fdt on aarch64, according to VpmuFeatureLevel. Fixes: #6168 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>	2023-03-06 18:55:13 +08:00
Fabiano Fidêncio	93a40cb35e	Merge pull request #6402 from fidencio/topic/yet-more-fixes-for-the-publish-kata-deploy-payload-work workflows: Yet more fixes for publishing the kata-deploy payload after every PR merged	2023-03-06 10:43:32 +01:00
Fabiano Fidêncio	df35f8f885	Merge pull request #6331 from jepio/jepio/fix-agent-init-cgroups rustjail: fix cgroup handling in agent-init mode	2023-03-05 20:29:40 +01:00
Fabiano Fidêncio	98d611623f	Merge pull request #6361 from etrunko/main runtime/Makefile: Fix install-containerd-shim-v2 dependency	2023-03-04 13:47:11 +01:00
Fabiano Fidêncio	9bc7bef3d6	kata-deploy: Fix path to the Dockerfile As part of `bd1ed26c8d`, we've pointed to the Dockerfile that's used in the CC branch, which is wrong. For what we're doing on main, we should be pointing to the one under the `kata-deploy` folder, and not the one under the non-existent `kata-deploy-cc` one. Fixes: #6343 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-04 12:18:38 +01:00
Fabiano Fidêncio	78ba363f8e	kata-deploy: Use different images for s390x and aarch64 As the image provided as part of registry.centos.org is not a multi-arch one, at least not for CentOS 7, we need to expand the script used to build the image to pass images that are known to work for s390x (ClefOS) and aarch64 (CentOS, but coming from dockerhub). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-04 12:18:32 +01:00
Fabiano Fidêncio	6267909501	kata-deploy: Allow passing BASE_IMAGE_{NAME,TAG} Let's break the IMAGE build parameter into BASE_IMAGE_NAME and BASE_IMAGE_TAG, as it makes it easier to replace the default CentOS image by something else. Spoiler alert, the default CentOS image is not multi-arch, and we do want to support at least aarch64 and s390x in the near term future. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-04 12:16:41 +01:00
Jianyong Wu	3443f558a6	nydus: upgrade nydus to v2.2.0 Use the latest nydus, which may let nydus work on arm64. Fixes: #6407 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-03-04 12:58:48 +08:00
Jianyong Wu	395645e1ce	runtime: hybrid-mode causes error in the latest nydusd When updating nydusd to 2.2, the argument "--hybrid-mode" causes the following error: thread 'main' panicked at 'ArgAction::SetTrue / ArgAction::SetFalse is defaulted' We may need to remove it in order to upgrade nydusd. Fixes: #6407 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-03-04 12:58:48 +08:00
tg5788re	f8e44172f6	utils: Make kata-manager.sh runs checks Updated the `kata-manager.sh` script to make it run all the checks on the host system before attempting to create a container. If any checks fail, they will indicate to the user what the problem is in a clearer manner than those reported by the container manager. Fixes: #6281. Signed-off-by: tg5788re <jfokugas@gmail.com>	2023-03-03 09:56:12 -06:00
Chelsea Mafrica	ebe916b372	Merge pull request #6355 from yanggangtony/fix-wrong-notes fix wrong notes for func GetSandboxesStoragePathRust()	2023-03-03 07:55:54 -08:00
Jeremi Piotrowski	f31c79d210	workflows: static-checks: Remove TRAVIS_XXX variables These variables are unused since we don't use travis CI. This also allows removing two steps: - 'Setup GOPATH' only printed variables - 'Setup travis reference' modified some shell local variables that don't have any influence on the rest of the steps The TRAVIS var is still used by tools/osbuilder/tests to determine if virtualization is available. Fixes: #3544 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-03-03 11:38:34 +01:00
Zhongtao Hu	60bb9d114a	Merge pull request #6399 from yipengyin/fix-cleanup fix(runtime-rs): add exited state to ensure cleanup	2023-03-03 17:41:16 +08:00
Chao Wu	6fc4c8b099	Merge pull request #5788 from openanolis/runtime-rs-ocihook runtime-rs: add oci hook support	2023-03-03 01:06:21 +08:00
James O. D. Hunt	4a7a859592	Merge pull request #6377 from pembek01/remove-cgroupsv2-check utils: Remove kata-manager.sh cgroups v2 check	2023-03-02 17:00:46 +00:00
Fabiano Fidêncio	b20d5289cb	Merge pull request #6400 from fidencio/topic/fixes-for-generating-the-kata-deploy-payload workflows: Fixes for the `payload-after-push` action	2023-03-02 14:20:24 +01:00
Yipeng Yin	8030e469b2	fix(runtime-rs): add exited state to ensure cleanup Set process status to exited at end of io wait, which indicates only that the process has exited, while stopping the process has not finished yet. Otherwise, cleanup_container will be skipped. Fixes: #6393 Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>	2023-03-02 18:14:20 +08:00
Fabiano Fidêncio	7d292d7fc3	workflows: Fix the path of imported workflows In `payload-after-push.yaml` we ended up mentioning cc-*.yaml workflows, which are non existent in the main branch. Let's adapt the name to the correct ones. Fixes: #6343 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-02 10:18:10 +01:00
Fabiano Fidêncio	e07162e79d	workflows: Fix action name We have a few actions in the `payload-after-push.*.yaml` that are referring to Confidential Containers, but they should be referring to Kata Containers instead. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-02 10:17:18 +01:00
Chao Wu	572c385774	Merge pull request #6269 from openanolis/chao/update_dragonball_version Dragonball: update dependencies	2023-03-02 17:15:39 +08:00
Fabiano Fidêncio	7286f8f706	Merge pull request #6391 from fidencio/topic/do-not-install-docker-as-part-of-the-actions workflows: Do not install docker	2023-03-02 10:12:15 +01:00
Fabiano Fidêncio	7201279647	Merge pull request #6344 from fidencio/topic/generate-a-kata-deploy-payload-on-each-PR-merged workflows: Publish kata-deploy payload after a merge	2023-03-02 09:02:34 +01:00
Chao Wu	dd2713521e	Dragonball: update dependencies Since rust-vmm and dragonball-sandbox have introduced several updates such as vPMU support for aarch64, we also need to update Dragonball dependencies to include those changes. Update: virtio-queue to v0.6.0 kvm-ioctls to v0.12.0 dbs-upcall to v0.2.0 dbs-virtio-devices to v0.2.0 kvm-bindings to v0.6.0 Also, several aarch64 features are updated because of dependency changes: 1. update vcpu hotplug API. 2. update vpmu related API. 3. adjust unit test cases for aarch64 Dragonball. fixes: #6268 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-03-02 14:53:04 +08:00
Chao Wu	2934ab4a3c	Merge pull request #6380 from Christopher-C-Robinson/#6256-typo-fix src: Fixed typo mod.rs	2023-03-02 14:31:33 +08:00
Fabiano Fidêncio	bd1ed26c8d	workflows: Publish kata-deploy payload after a merge For the architectures we know that `make kata-tarball` works as expected, let's start publishing the kata-deploy payload after each merge. This will help to: * Easily test the content of current `main` or `stable-` branch * Easily bisect issues * Start providing some sort of CI/CD content pipeline for those who need that This is a forward-port work from the `CCv0` and groups together patches that I've worked on, with the work that Choi did in order to support different architectures. Fixes: #6343 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-02 02:19:10 +01:00
Domesticcadiz	fea7e8816f	runtime-rs: Fixed typo mod.rs Fixed the typo in comment in the delete method located in mod.rs file. Fixes: #6256. Signed-off-by: Domesticcadiz <christopher.cadiz.robinson@gmail.com>	2023-03-01 18:03:41 -06:00
Archana Shinde	65fa19fe92	Merge pull request #6305 from amshinde/update-action-kernel-check actions: Use `git-diff` to get changes in kernel dir	2023-03-01 13:46:50 -08:00
Eduardo Lima (Etrunko)	a9e2fc8678	runtime/Makefile: Fix install-containerd-shim-v2 dependency $ make install make: *** No rule to make target 'containerd-shim-kata-v2', needed by 'install-containerd-shim-v2'. Stop. Spotted when building kata-runtime with a different name for SHIMV2_OUTPUT. For instance, trying to keep different runtime binaries installed at the same time, one from master and another from, let's say, the CCv0 branch, with the following small change applied. diff --git a/src/runtime/Makefile b/src/runtime/Makefile index 95efaff78..2bab9eb75 100644 --- a/src/runtime/Makefile +++ b/src/runtime/Makefile @@ -231,7 +231,7 @@ SED = sed CLI_DIR = cmd SHIMV2 = containerd-shim-kata-v2 -SHIMV2_OUTPUT = $(CURDIR)/$(SHIMV2) +SHIMV2_OUTPUT = $(CURDIR)/$(SHIMV2)-ccv0 SHIMV2_DIR = $(CLI_DIR)/$(SHIMV2) MONITOR = kata-monitor Fixes: #6398 Signed-off-by: Eduardo Lima (Etrunko) <etrunko@redhat.com>	2023-03-01 15:57:30 -03:00
yanggang	b6880c60d3	logging: Correct the code notes Fix wrong notes for func GetSandboxesStoragePathRust() Fixes: #6394 Signed-off-by: yanggang <gang.yang@daocloud.io>	2023-03-01 19:20:25 +08:00
Yushuo	12cfad4858	runtime-rs: modify the transfer to oci::Hooks In this commit, we have done: * modify the transfer process from grpc::Hooks to oci::Hooks, so the code can be cleaner * add more tests for create_runtime, create_container, start_container hooks Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2023-03-01 10:35:10 +08:00
Fabiano Fidêncio	828d467222	workflows: Do not install docker The latest ubuntu runners already have docker installed and trying to install it manually will cause the following issue: ``` Run curl -fsSL https://test.docker.com/ -o test-docker.sh Warning: the "docker" command appears to already exist on this system. If you already have Docker installed, this script can cause trouble, which is why we're displaying this warning and provide the opportunity to cancel the installation. If you installed the current Docker package using this script and are using it again to update Docker, you can safely ignore this message. You may press Ctrl+C now to abort this script. + sleep 20 + sudo -E sh -c apt-get update -qq >/dev/null E: The repository 'https://packages.microsoft.com/ubuntu/22.04/prod jammy Release' is no longer signed. ``` Fixes: #6390 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-28 23:53:28 +01:00
Alec Pemberton	4b8a5a1a3d	utils: Remove kata-manager.sh cgroups v2 check Removed the part in the `kata-manager.sh` script that checks if the host system only runs cgroups v2. Fixes: #6259. Signed-off-by: Alec Pemberton <pembek1901@gmail.com>	2023-02-28 11:23:51 -06:00
Steve Horsman	785310fe18	Merge pull request #6368 from yoheiueda/dir-perm agent: don't set permission of existing directory in copy_file	2023-02-28 14:48:10 +00:00
Chelsea Mafrica	703589c279	Merge pull request #6369 from XDTG/6082/Fix-path-check-bypassed runtime: use filepath.Clean() to clean the mount path	2023-02-27 17:24:50 -08:00
Bo Chen	ba9227184e	Merge pull request #6376 from likebreath/0224/clh_v30.0 Upgrade to Cloud Hypervisor v30.0	2023-02-27 11:48:52 -08:00
Yushuo	2c4428ee02	runtime-rs: move pre-start hooks to sandbox_start In some cases, network endpoints will be configured through Prestart Hook. So network endpoints may need to be added (hotplugged) after vm is started and also Prestart Hook is executed. We move pre-start hook functions' execution to sandbox_start to allow hooks running between vm_start and netns_scan easily, so that the lifecycle API can be cleaner. Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2023-02-27 21:56:43 +08:00
Yushuo	e80c9f7b74	runtime-rs: add StartContainer hook StartContainer will be executed in the guest container namespace in Kata. The Hook Path of this kind of hook is also in the guest container namespace. StartContainer is executed after the start operation is called, and it should be executed before the user-specific command is executed. Fixes: #5787 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2023-02-27 21:56:43 +08:00
Yushuo	977f281c5c	runtime-rs: add CreateContainer hook support CreateContainer hook is one kind of OCI hook. In kata, it will be executed after VM is started, before container is created, and after CreateRuntime is executed. The hook path of CreateContainer hook is in host runtime namespace, but it will be executed in host vmm namespace. Fixes: #5787 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2023-02-27 21:56:43 +08:00
Yushuo	875f2db528	runtime-rs: add oci hook support According to the runtime OCI Spec, there can be some hook operations in the lifecycle of the container. In these hook operations, the runtime can execute some commands. There are different points in time in the container lifecycle and different hook types can be executed. In this commit, we are now supporting 4 types of hooks (same as in runtime-go): Prestart hook, CreateRuntime hook, Poststart hook and Poststop hook. Fixes: #5787 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2023-02-27 21:56:43 +08:00
Yushuo	ecac3a9e10	docs: add design doc for Hooks Fixes: #5787 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2023-02-27 21:56:43 +08:00
Bin Liu	e90989b16b	Merge pull request #6314 from openanolis/static_doc feat(runtime): make static resource management consistent with 2.0	2023-02-27 16:43:27 +08:00
Bo Chen	3ac6f29e95	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v30.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Fixes: #6375 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-02-24 10:20:29 -08:00
Bo Chen	262daaa2ef	versions: Upgrade to Cloud Hypervisor v30.0 Details of this release can be found in our new roadmap project as iteration v30.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #6375 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-02-24 10:19:46 -08:00
Jeremi Piotrowski	192df84588	agent: always use cgroupfs when running as init The logic to decide which cgroup driver is used is currently based on the cgroup path that the host provides. This requires host and guest to use the same cgroup driver. If the guest uses kata-agent as init, then systemd can't be used as the cgroup driver. If the host requests a systemd cgroup, this currently results in a rustjail panic: thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: I/O error: No such file or directory (os error 2) Caused by: No such file or directory (os error 2)', rustjail/src/cgroups/systemd/manager.rs:44:51 stack backtrace: 0: 0x7ff0fe77a793 - std::backtrace_rs::backtrace::libunwind::trace::h8c197fa9a679d134 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5 1: 0x7ff0fe77a793 - std::backtrace_rs::backtrace::trace_unsynchronized::h9ee19d58b6d5934a at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5 2: 0x7ff0fe77a793 - std::sys_common::backtrace::_print_fmt::h4badc450600fc417 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:65:5 3: 0x7ff0fe77a793 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::had334ddb529a2169 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:44:22 4: 0x7ff0fdce815e - core::fmt::write::h1aa7694f03e44db2 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/fmt/mod.rs:1209:17 5: 0x7ff0fe74e0c4 - std::io::Write::write_fmt::h61b2bdc565be41b5 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/io/mod.rs:1682:15 6: 0x7ff0fe77cd3f - std::sys_common::backtrace::_print::h4ec69798b72ff254 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:47:5 7: 0x7ff0fe77cd3f - std::sys_common::backtrace::print::h0e6c02048dec3c77 
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:34:9 8: 0x7ff0fe77c93f - std::panicking::default_hook::{{closure}}::hcdb7e705dc37ea6e at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:267:22 9: 0x7ff0fe77d9b8 - std::panicking::default_hook::he03a933a0f01790f at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:286:9 10: 0x7ff0fe77d9b8 - std::panicking::rust_panic_with_hook::he26b680bfd953008 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:688:13 11: 0x7ff0fe77d482 - std::panicking::begin_panic_handler::{{closure}}::h559120d2dd1c6180 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:579:13 12: 0x7ff0fe77d3ec - std::sys_common::backtrace::__rust_end_short_backtrace::h36db621fc93b005a at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:137:18 13: 0x7ff0fe77d3c1 - rust_begin_unwind at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5 14: 0x7ff0fda52ee2 - core::panicking::panic_fmt::he7679b415d25c5f4 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14 15: 0x7ff0fda53182 - core::result::unwrap_failed::hb71caff146724b6b at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:1791:5 16: 0x7ff0fe5bd738 - <rustjail::cgroups::systemd::manager::Manager as rustjail::cgroups::Manager>::apply::hd46958d9d807d2ca 17: 0x7ff0fe606d80 - <rustjail::container::LinuxContainer as rustjail::container::BaseContainer>::start::{{closure}}::h1de806d91fcb878f 18: 0x7ff0fe604a76 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h1749c148adcc235f 19: 0x7ff0fdc0c992 - kata_agent::rpc::AgentService::do_create_container::{{closure}}::{{closure}}::hc1b87a15dfdf2f64 20: 0x7ff0fdb80ae4 - <core::future::from_generator::GenFuture<T> as 
core::future::future::Future>::poll::h846a8c9e4fb67707 21: 0x7ff0fe3bb816 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h53de16ff66ed3972 22: 0x7ff0fdb519cb - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h1cbece980286c0f4 23: 0x7ff0fdf4019c - <tokio::future::poll_fn::PollFn<F> as core::future::future::Future>::poll::hc8e72d155feb8d1f 24: 0x7ff0fdfa5fd8 - tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut::h0a407ffe2559449a 25: 0x7ff0fdf033a1 - tokio::runtime::task::raw::poll::h1045d9f1db9742de 26: 0x7ff0fe7a8ce2 - tokio::runtime::scheduler::multi_thread::worker::Context::run_task::h4924ae3464af7fbd 27: 0x7ff0fe7afb85 - tokio::runtime::task::raw::poll::h5c843be39646b833 28: 0x7ff0fe7a05ee - std::sys_common::backtrace::__rust_begin_short_backtrace::ha7777c55b98a9bd1 29: 0x7ff0fe7a9bdb - core::ops::function::FnOnce::call_once{{vtable.shim}}::h27ec83c953360cdd 30: 0x7ff0fe7801d5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hed812350c5aef7a8 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9 31: 0x7ff0fe7801d5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hc7df8e435a658960 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9 32: 0x7ff0fe7801d5 - std::sys::unix::thread::Thread::new::thread_start::h575491a8a17dbb33 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys/unix/thread.rs:108:17 Forward the value of "init_mode" to AgentService, so that we can force cgroupfs when systemd is unavailable. Fixes: #5779 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-02-24 14:02:11 +01:00
Jeremi Piotrowski	b0691806f1	agent: determine value of use_systemd_cgroup before LinuxContainer::new() Right now LinuxContainer::new() gets passed a CreateOpts struct, but then modifies the use_systemd_cgroup field inside that struct. Pull the cgroups path parsing logic into do_create_container, so that CreateOpts can be immutable in LinuxContainer::new. This is just moving things around, there should be no functional changes. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-02-24 13:46:37 +01:00
XDTG	dc86d6dac3	runtime: use filepath.Clean() to clean the mount path Fix the path check bypass issue introduced by #6082: use filepath.Clean() to clean the path before the check. Fixes: #6082 Signed-off-by: XDTG <click1799@163.com>	2023-02-24 15:48:09 +08:00
Yohei Ueda	c4ef5fd325	agent: don't set permission of existing directory This patch fixes the issue that do_copy_file changes the directory permission of the parent directory of a target file, even when the parent directory already exists. Fixes #6367 Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>	2023-02-24 16:43:59 +09:00
Feng Wang	cbe6ad9034	runtime: support non-root for clh This change enables running the cloud-hypervisor VMM as a non-root user when the rootless flag is set to true in the configuration. Fixes: #2567 Signed-off-by: Feng Wang <fwang@confluent.io>	2023-02-22 13:57:09 -08:00
Fabiano Fidêncio	44a780f262	Merge pull request #6262 from jepio/jepio/initrd-dev-nodes osbuilder: Include minimal set of device nodes in ubuntu initrd	2023-02-22 20:34:13 +01:00
GabyCT	a0b1f81867	Merge pull request #5958 from Apokleos/kata-ctl-exec kata-ctl/exec: add new command exec to enter guest VM.	2023-02-22 12:07:44 -06:00
Fabiano Fidêncio	109071855d	Merge pull request #6124 from Alex-Carter01/snp-kernel-config kernel: Add CONFIG_SEV_GUEST to SEV kernel config	2023-02-22 18:42:35 +01:00
David Esparza	5e2fe5f932	Merge pull request #6332 from jodh-intel/runtime-rs-ch-config-convert runtime-rs: Improve Cloud Hypervisor config handling	2023-02-22 10:15:50 -06:00
GabyCT	5c6e56931f	Merge pull request #6312 from Amulyam24/virtiofsd-fix virtiofsd: update to a valid path on ppc64le	2023-02-22 08:57:51 -06:00
James O. D. Hunt	3483272bbd	runtime-rs: ch: Enable initrd usage Allow an initrd/initramfs image to be used with Cloud Hypervisor, which is handled differently to the default rootfs image type. Fixes: #6335. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-02-22 10:55:01 +00:00
James O. D. Hunt	fbee6c820e	runtime-rs: Improve Cloud Hypervisor config handling Replace `cloud_hypervisor_vm_create_cfg()` with a set of `TryFrom` trait implementations in the new CH specific `convert.rs` to allow the generic `Hypervisor` configuration to be converted into the CH specific `VmConfig` type. Note that device configuration is not currently handled in `convert.rs` (it's handled in `inner_device.rs`). This change removes the old hard-coded CH specific configuration. Fixes: #6203. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-02-22 10:48:05 +00:00
Chao Wu	578f2e7c2e	Merge pull request #6080 from openanolis/rem runtime-rs: cleanup kata host share path	2023-02-22 17:45:24 +08:00
GabyCT	7aff118c82	Merge pull request #6236 from jepio/jepio/osbuilder-fix-default-make-target osbuilder: fix default build target in makefile	2023-02-21 17:00:21 -06:00
Alex Carter	1bff1ca30a	kernel: Add CONFIG_SEV_GUEST to SEV kernel config Adding kernel config to sev case since it is needed for SNP and SNP will use the SEV kernel. Incrementing kernel config version to reflect changes Fixes: #6123 Signed-off-by: Alex Carter <Alex.Carter@ibm.com>	2023-02-21 16:48:45 +00:00
GabyCT	fc5c62a5a1	Merge pull request #6330 from c3d/issue/6329-contribution-link-in-devguide devguide: Add link to the contribution guidelines	2023-02-21 09:17:20 -06:00
Fabiano Fidêncio	ab5b45f615	Merge pull request #6340 from fidencio/topic/ensure-go-binaries-can-still-run-on-ubuntu-2004 kata-deploy: Ensure go binaries can run on Ubuntu 20.04	2023-02-21 13:52:18 +01:00
Zhongtao Hu	4f20cb7ced	Merge pull request #6325 from HerlinCoder/herlincoder/config-manager dragonball: config_manager: preserve device when update	2023-02-21 17:51:41 +08:00
Jeremi Piotrowski	ad8968c8d9	rustjail: print type of cgroup manager Since the cgroup manager is wrapped in a dyn now, the print in LinuxContainer::new has been useless and just says "CgroupManager". Extend the Debug trait for 'dyn Manager' to print the type of the cgroup manager so that it's easier to debug issues. Fixes: #5779 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-02-21 10:07:03 +01:00
SinghWang	b4a1527aa6	kata-deploy: Fix static shim-v2 build on arm64 Following Jong Wu's suggestion, let's link /usr/bin/musl-gcc to /usr/bin/aarch64-linux-musl-gcc. Fixes: #6320 Signed-off-by: SinghWang <wangxin_0611@126.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-21 10:00:28 +01:00
Fabiano Fidêncio	2c4f8077fd	Revert "shim-v2: Bump Ubuntu container image to 22.04" This reverts commit `9d78bf9086`. Golang binaries are built statically by default, unless linking against CGO, which we do. In this case we dynamically link against glibc, causing us troubles when running a binary built with Ubuntu 22.04 on Ubuntu 20.04 (which will still be supported for the next few years ...) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-21 10:00:28 +01:00
Fabiano Fidêncio	73d0ca0bd5	Merge pull request #6334 from fidencio/topic/fix-push-to-registry-behaviour Revert "workflows: Push the builder image to quay.io"	2023-02-21 10:00:13 +01:00
Bin Liu	5c16e98d4f	Merge pull request #6322 from Tim-Zhang/remove-remain-unsafe-impl Remove all remaining unsafe impl	2023-02-21 14:08:05 +08:00
Fabiano Fidêncio	afaccf924d	Revert "workflows: Push the builder image to quay.io" This reverts commit `b835c40bbd`. Right now I'm reverting this one as this should only run after commits get pushed to our repo, not on every PR.	2023-02-20 18:37:28 +01:00
Fabiano Fidêncio	b1fd4b093b	Merge pull request #6319 from singhwang/main kata-deploy: Fix building the kata static firecracker arm64 package occurred an error	2023-02-20 18:04:31 +01:00
Christophe de Dinechin	4c39c4ef9f	devguide: Add link to the contribution guidelines New developers are often confused by some of our requirements, notably porting labels. While our CONTRIBUTING.md file points to the solution, the developer's guide does not. Add a link there. Fixes: #6329 Signed-off-by: Christophe de Dinechin <christophe@dinechin.org>	2023-02-20 15:27:19 +01:00
Fabiano Fidêncio	a3b615919e	Merge pull request #6323 from fidencio/topic/fix-make-shim-v2-tarball-on-aarch64 shim-v2: Bump Ubuntu container image to 22.04	2023-02-20 14:57:34 +01:00
Jeremi Piotrowski	76e926453a	osbuilder: Include minimal set of device nodes in ubuntu initrd When starting an initrd the kernel expects to find /dev/console in the initrd, so that it can connect it as stdin/stdout/stderr to the /init process. If the device node is missing the kernel will complain that it was unable to open an initial console. If kata-agent is the initrd init process, it will also result in log messages not being logged to console and thus not forwarded to host syslog. Add a set of standard device nodes for completeness, so that console logging works. To do that we install the makedev package which provides a MAKEDEV helper that knows the major/minor numbers. Unfortunately the debian package tries to create devnodes from postinst, which can be suppressed if systemd-detect-virt is present. That's why we create a small dummy script that matches what systemd-detect-virt would output (anything is enough to suppress mknod). Fixes: #6261 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-02-20 11:15:56 +01:00
Fabiano Fidêncio	6a0ac2b3a5	Merge pull request #6310 from kata-containers/topic/cache-artefacts-container-builder packaging: Cache the container used to build the kata-deploy artefacts	2023-02-20 11:02:53 +01:00
James O. D. Hunt	0dea57c452	Merge pull request #6309 from gabevenberg/always-check-deps utils: always check some dependencies.	2023-02-20 08:31:56 +00:00
SinghWang	697ec8e578	kata-deploy: Fix kata static firecracker arm64 package build error When building the kata static arm64 package, the stages of firecracker report errors. Fixes: #6318 Signed-off-by: SinghWang <wangxin_0611@126.com>	2023-02-20 16:10:18 +08:00
Helin Guo	ced3c99895	dragonball: config_manager: preserve device when update DeviceConfigInfo contains config and device, so when we want to do update we could simply update config part of the info, and device would not be changed during update. Fixes: #6324 Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>	2023-02-20 14:34:09 +08:00
Tim Zhang	da8a6417aa	runtime-rs: remove all remaining unsafe impl Fixes: #6307 Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-02-20 14:29:59 +08:00
Tim Zhang	0301194851	dragonball: use crossbeam_channel in VmmService instead of mpsc::channel crossbeam_channel has more features and better performance than mpsc::channel, and Rust itself replaced its channel implementation with crossbeam_channel in version 1.67. Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-02-20 14:29:57 +08:00
Fabiano Fidêncio	9d78bf9086	shim-v2: Bump Ubuntu container image to 22.04 Let's bump the base container image to use the 22.04 version of Ubuntu, as it does bring up-to-date package dependencies that we need to statically build the runtime-rs on aarch64. Fixes: #6320 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-20 07:14:09 +01:00
Fabiano Fidêncio	299fc35c37	Merge pull request #6304 from fidencio/topic/switch-the-default-x86_64-rootfs-image-to-ubuntu versions: Use ubuntu as the default distro for the rootfs-image	2023-02-17 19:29:10 +01:00
Gabe Venberg	3cfce5a709	utils: improved unsupported distro message. Previously, when installing on an unknown distro, the script would tell the user that their distro was unsupported. Changed the error message to prompt the user to install dependencies manually, then retry. Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>	2023-02-17 09:06:26 -06:00
Bin Liu	f44dae75c9	Merge pull request #6267 from jongwooo/github-action/replace-deprecated-command-with-environment-file github-action: Replace deprecated command with environment file	2023-02-17 22:54:12 +08:00
Fabiano Fidêncio	6a29088b81	Merge pull request #6298 from amshinde/update-release-doc docs: Change the order of release step	2023-02-17 15:46:12 +01:00
Ji-Xinyou	919d19f415	feat(runtime): make static resource management consistent with 2.0 * add doc in the configuration * make entry consistent with 2.0 Fixes: #6313 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2023-02-17 21:36:56 +08:00
Bin Liu	b7fe29f033	Merge pull request #6308 from Tim-Zhang/remove-unnecessary-send-and-sync runtime-rs: remove unnecessary Send/Sync trait implement	2023-02-17 19:53:54 +08:00
Fabiano Fidêncio	b835c40bbd	workflows: Push the builder image to quay.io Let's push the builder images to a registry, so we can take advantage of those on each step of our building process. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 12:06:48 +01:00
Fabiano Fidêncio	781ed2986a	packaging: Allow passing a container builder to the scripts This, combined with the effort of caching builder images and only performing the build itself inside the builder images, is the very first step for reproducible builds for the project. Reproducible builds are quite important when we talk about Confidential Containers, as users may want to verify the content used / provided by the CSPs, and this is the first step towards that direction. Fixes: #5517 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 12:06:48 +01:00
Fabiano Fidêncio	45668fae15	packaging: Use existing image to build td-shim Let's first try to pull a pre-existing image, instead of building our own, to be used as a builder image for the td-shim. This will save us some CI time. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 12:06:48 +01:00
Fabiano Fidêncio	e8c6bfbdeb	packaging: Use existing image to build td-shim Let's first try to pull a pre-existing image, instead of building our own, to be used as a builder image for the td-shim. This will save us some CI time. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 12:06:48 +01:00
Fabiano Fidêncio	3fa24f7acc	packaging: Add infra to push the OVMF builder image Let's add the needed infra for building and pushing the OVMF builder image to the Kata Containers' quay.io registry. Fixes: #5477 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 12:06:48 +01:00
Fabiano Fidêncio	f076fa4c77	packaging: Use existing image to build OVMF Let's first try to pull a pre-existing image, instead of building our own, to be used as a builder image for OVMF. This will save us some CI time. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 12:06:48 +01:00
Fabiano Fidêncio	c7f515172d	packaging: Add infra to push the QEMU builder image Let's add the needed infra for only building and pushing the QEMU builder image to the Kata Containers' quay.io registry. Fixes: #5481 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 12:06:48 +01:00
Fabiano Fidêncio	fb7b86b8e0	packaging: Use existing image to build QEMU Let's first try to pull a pre-existing image, instead of building our own, to be used as a builder image for QEMU. This will save us some CI time. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 12:06:48 +01:00
Fabiano Fidêncio	d0181bb262	packaging: Add infra to push the virtiofsd builder image Let's add the needed infra for only building and pushing the virtiofsd builder image to the Kata Containers' quay.io registry. Fixes: #5480 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 12:06:48 +01:00
Fabiano Fidêncio	7c93428a18	packaging: Use existing image to build virtiofsd Let's first try to pull a pre-existing image, instead of building our own, to be used as a builder image for the virtiofsd. This will save us some CI time. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 12:06:48 +01:00
Fabiano Fidêncio	8c227e2471	virtiofsd: Pass the expected toolchain to the build container Let's ensure we're building virtiofsd with a specific toolchain that's known to not cause any issues, instead of always using the latest one. On each bump of the virtiofsd, we'll make sure to adjust this according to what's been used by the virtiofsd community. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 12:06:48 +01:00
Fabiano Fidêncio	7ee00d8e57	packaging: Add infra to push the shim-v2 builder image Let's add the needed infra for only building and pushing the shim-v2 builder image to the Kata Containers' quay.io registry. Fixes: #5478 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 12:06:47 +01:00
Fabiano Fidêncio	24767d82aa	packaging: Use existing image to build the shim-v2 Let's try to pull a pre-existing image, instead of building our own, to be used as a builder for the shim-v2. This will save us some CI time. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 12:06:24 +01:00
Amulyam24	e84af6a620	virtiofsd: update to a valid path on ppc64le Currently the symbolic link for virtiofsd which is used as a valid path is not updated on every CI run. Fix it by using the actual path of installation. Fixes: #6311 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-02-17 16:22:39 +05:30
Fabiano Fidêncio	6c3c771a52	packaging: Add infra to push the kernel builder image Let's add the needed infra for only building and pushing the kernel builder image to the Kata Containers' quay.io registry. Fixes: #5476 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 11:30:28 +01:00
Fabiano Fidêncio	b9b23112bf	packaging: Use existing image to build the kernel Let's first try to pull a pre-existing image, instead of building our own, to be used as a builder image for the kernel. This will save us some CI time. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 11:30:28 +01:00
Fabiano Fidêncio	869827d77f	packaging: Add push_to_registry() This function will push a specific tag to a registry, whenever the PUSH_TO_REGISTRY environment variable is set, otherwise it's a no-op. This will be used in the future to avoid replicating that logic in every builder used by the kata-deploy scripts. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 11:30:21 +01:00
Fabiano Fidêncio	e69a6f5749	packaging: Add get_last_modification() Let's add a function to get the hash of the last commit modifying a specific file. This will help to avoid writing `git rev-list ...` into every single build script used by the kata-deploy. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 10:39:33 +01:00
Fabiano Fidêncio	6c05e5c67a	packaging: Add and export BUILDER_REGISTRY BUILDER_REGISTRY, which points to quay.io/kata-containers/builder, will be used for storing the builder images used to build the artefacts via the kata-deploy scripts. The plan is to tag, whenever it's possible and makes sense, images like: * ${BUILDER_REGISTRY}:${component}-${unique_identifier} Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-17 10:39:33 +01:00
Fabiano Fidêncio	bd9af5569f	Merge pull request #6296 from fidencio/topic/dont-build-runtime-rs-for-ppc64le-2nd-try runtime-rs: Don't build on Power, don't break on Power.	2023-02-17 10:08:39 +01:00
Gabe Venberg	1047840cf8	utils: always check some dependencies. Every dependency in check_deps is used inside the script (apart from git, which may be a historical artifact), and therefore should be checked even when the -f option is passed to the script. Simply changed at what point check_deps is called in order to always run it. Fixes #6302. Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>	2023-02-16 23:00:19 -06:00
Tim Zhang	95e3364493	runtime-rs: remove unnecessary Send/Sync trait implement Send and Sync are automatically derived traits, if a type is composed entirely of Send or Sync types, then it is Send or Sync. Almost all primitives are Send and Sync, so we don't need to implement them manually most of the time. Fixes: #6307 Signed-off-by: Tim Zhang <tim@hyper.sh>	2023-02-17 11:51:13 +08:00
Archana Shinde	a96ba99239	actions: Use `git-diff` to get changes in kernel dir Use `git-diff` instead of legacy `git-whatchanged` to get differences in the packaging/kernel directory. This also fixes a bug by grepping for the kernel directory in the output of the git command. Fixes: #6210 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-02-16 17:33:41 -08:00
Archana Shinde	619ef54452	docs: Change the order of release step When a new stable branch is created, it is necessary to change the references in the tests repo from main to the new stable branch. However this step needs to be performed after the repos have been tagged as the `tags_repos.sh` script is the one that creates the new branch. Clarify this in the documentation and move the step to change branch references in test repo after repos have been tagged. Fixes: #1824 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-02-16 12:12:21 -08:00
Fabiano Fidêncio	a161d11920	versions: Use ubuntu as the default distro for the rootfs-image Currently ubuntu is already the default distro for all the architectures but x86_64, which uses clearlinux. However, our CI does not test the clearlinux image we ship. Taking a look at our CI code [0], we've been using ubuntu as base for the tests for a few years already, if not forever. The minimum we can do is to switch to distributing ubuntu, as the tested rootfs-image, and then decide later on whether we should switch back to clearlinux (once we switch our CI to using that, and make sure all tests will be green), or if we move to slimmer distro, such as alpine. [0]: `0a39dd1a01/.ci/install_kata_image.sh (L44)` Fixes: #6303 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-16 20:30:40 +01:00
Fabiano Fidêncio	be40683bc5	runtime-rs: Add a generic powerpc64le-options.mk There's a check in the runtime-rs Makefile that basically checks whether the `arch/$arch-options.mk` exists or not and, if it doesn't, the build is just aborted. With this in mind, let's create a generic powerpc64le-options.mk file and not bail when building for this architecture. Fixes: #6142 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-16 16:29:24 +01:00
Fabiano Fidêncio	47c058599a	packaging/shim-v2: Install the target depending on the arch/libc In the `install_go_rust.sh` file we're adding a x86_64-unknown-linux-musl target unconditionally. That should be, instead, based in the ARCH of the host and the appropriate LIBC to be used with that host. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-16 16:29:24 +01:00
Fabiano Fidêncio	c1602c848a	Merge pull request #6300 from openanolis/footloose runtime-rs: handle sys_dir bind volume	2023-02-16 12:53:15 +01:00
alex.lyn	b582c0db86	kata-ctl/exec: add new command exec to enter guest VM. This patchset helps users enter the guest VM through the debug console socket. To do so, users need to configure one of the following: (1) Set debug_console_enabled = true, which uses the default vport 1026. (2) Or add agent.debug_console agent.debug_console_vport=<PORT> to kernel_params, in which case the vport is the <PORT> you set. Usage: $ kata-ctl exec -h kata-ctl-exec Enter into guest VM by debug console USAGE: kata-ctl exec [OPTIONS] <SANDBOX_ID> ARGS: <SANDBOX_ID> pod sandbox ID Fixes: #5340 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-02-16 17:05:53 +08:00
Yushuo	07802a19dc	runtime-rs: handle sys_dir bind volume For some cases, users will mount system directories as bind volume. We should not bind mount these kind of directories in the host as it does not make sense. Fixes: #6299 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2023-02-16 15:45:33 +08:00
Bin Liu	629a31ec6e	Merge pull request #6287 from lifupan/main sandbox: set the dns for the sandbox	2023-02-16 15:00:01 +08:00
Fabiano Fidêncio	f5b28736ce	Merge pull request #6294 from fidencio/topic/only-change-configs-if-the-config-files-exist packaging/shim-v2: Only change the config if the file exists	2023-02-16 07:13:28 +01:00
Fupan Li	04e930073c	sandbox: set the dns for the sandbox The rust agent already supports setting the guest dns server in the start sandbox request, so add the dns on the runtime side. Fixes: #6286 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2023-02-16 11:25:02 +08:00
Fupan Li	32ebe1895b	agent: fix the issue of creating the dns file We should make sure the parent directory of the dns source file exists; otherwise, creating the file directly would fail. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2023-02-16 11:24:54 +08:00
Peng Tao	139ad8e95f	Merge pull request #6201 from jodh-intel/runtime-rs-add-cloud-hypervisor runtime-rs: Add basic CH implementation	2023-02-16 11:23:04 +08:00
Archana Shinde	eba2bb275d	Merge pull request #6284 from amshinde/revert-kata-deploy-changes-after-3.1.0-rc0-release release: Revert kata-deploy changes after 3.1.0-rc0 release	2023-02-15 14:50:12 -08:00
Archana Shinde	4a35d5fa6e	Merge pull request #6283 from amshinde/3.1.0-rc0-branch-bump # Kata Containers 3.1.0-rc0	2023-02-15 13:00:43 -08:00
Chelsea Mafrica	f9db0c5a86	Merge pull request #6285 from cmaf/assisted-pr-4216 Assisted PR \| docs: Update how-to-use-kata-containers-with-firecracker.md	2023-02-15 09:40:01 -08:00
jongwooo	44aaec9020	github-action: Replace deprecated command with environment file In workflow, `set-output` command is deprecated and will be disabled soon. This commit replaces the deprecated `set-output` command with putting a value in the environment file `$GITHUB_OUTPUT`. Fixes #6266 Signed-off-by: jongwooo <jongwooo.han@gmail.com>	2023-02-16 01:41:03 +09:00
Hyounggyu Choi	a68c5004f8	packaging/shim-v2: Only change the config if the file exists Let's not try to sed a file that doesn't exist, which may be the case depending on the architecture we're building the shim-v2 for. This is a partial-forward port of `f24c47ea47`. Fixes: #6293 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-02-15 17:00:53 +01:00
Willem Dendauw	9304889330	docs: Update how-to-use-kata-containers-with-firecracker.md Removed the `` around containerd, because when you execute this as a script it runs the containerd command within the script, which it should not do. Fixes #4217 Signed-off-by: Willem Dendauw <willem.dendauw@hotmail.com>	2023-02-14 15:53:26 -08:00
Archana Shinde	ee76b398b3	release: Revert kata-deploy changes after 3.1.0-rc0 release As 3.1.0-rc0 has been released, let's switch the kata-deploy / kata-cleanup tags back to "latest", and re-add the kata-deploy-stable and the kata-cleanup-stable files. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-02-14 15:47:51 -08:00
Archana Shinde	5988199ada	release: Kata Containers 3.1.0-rc0 - kata-deploy: Install protobuf-compiler explicitly in shim-v2 Dockerfile - runtime: tracing: Fix missing ctx return - runtime: add reconnect timeout for vhost user block - SEV: Update ReducedPhysBits - shim-v2/build.sh: Only build runtime-rs for the supported arches - kata-ctl: Expand unit tests for CPU check - runtime: support cgroup v2 metrics marshal guest metrics - Typo: change tabs in comment to spaces - rootfs: support EROFS filesystem - versions: Update runc version - runtime: Improve documentation of appendFDs - Minor cleanups in make file - main \| docs: Fix missing critical steps in how-to-hotplug-memory-arm64.md - Action check kernel config version - clh: Enforce API timeout only for vm.boot request - virtiofsd: change cache mod to const - runtime-rs: ignor "no such process" error when delete cgroup for a thread to let it go - kernel: Add console kernel config for s390 - runtime: remove not used shim configurations - improvement: Fix naming conventions for span name and log subsystem - Dragonball: add cpu resize ability - arm64/CI: fix unit test failure on arm64 - CI: Make docker version stick to v20.10 in ubuntu:20.04 for s390x\|ppc64le - virtiofsd: fix the build on ppc64le - runtime:fix stat uds path - cni: Update cni plugins version to 1.2.0 - Built-in Sandbox: add more unit tests for dragonball. Part 5 - runtime: Drop QEMU log file support - docs: Add documentation for building agent with seccomp support. 
- Add kernel-dragonball-experimental to kata-deploy, kata-deploy-test, and the release - runtime-rs: add missing config section for share-fs - runtime: Add hmp for qemu - upcall: add document for upcall - runtime: Start QEMU undaemonized and get logs - docs: Update url link in QAT documentation - versions: update cni plugins version - versions: Upgrade to Cloud Hypervisor v29.0 - runtime: Use consts in `kata-runtime check` - versions: Bump QEMU to v7.2.0 - agent: Eliminate unnecessary metrics - runtime:all APIs are hang in the service.mu - Utility functions for kata-env - versions: Update conmon version - runtime: paas enablevhostuserstore annotation to hypervisor config - runk: Upgrade liboci-cli to v0.0.4 - runtime: use system pagesize for hugepage test - dependency: update cgroups-rs - runtime: Use git rev-parse for the kata-monitor tag - virtcontainers: split out linux-specific bits for mount, factory - Add darwin skeletons - vendor: revendor netlink to get latest - Address issues with the initial vCPU pinning functionality - virtcontainers: Fix misspelling in error message - runtime: add test generated file to .gitignore - runtime: fix up disable_netns handling - docs: add hint of probing loop module - tools: add --locked option for cargo install - runtime-rs: add Single Container support - virtcontainers: tests: Ensure Linux specific tests are just run on Linux - Change cache mode from none to never - tools: Fix indentation for setup aks script - virtcontainers: fs_share: Add Darwin skeleton - virtcontainers: Add a Virtualization.framework skeleton - kata-ctl: remove get_kata_version_by_url function - kata-ctl: fix build error on s390x - virtcontainers: Introduce hypervisor_darwin - runtime: Define Darwin handled signals list - nydus: net-ns handling needs to be only executed on Linux hosts - clh: Ensure it works with Docker / Moby - agent: refactor guest hooks - fix moby prestart hook handling - schedcore: Make buildable on !linux - Built-in Sandbox: add 
more unit tests for dragonball. Part 4 - runtime-rs: cleanup the run dir of hypervisor when shut down - Feat: implementation of kata-ctl direct-volume operations - Runtime: Clarify mutability of global var - kata-runtime: add rust runtime path for kata-runtime exec - versions: Upgrade to Cloud Hypervisor v28.1 - runtime-rs: add dbs-upcall feature - runtime/Makefile: Get some bits happy on darwin - docs: remove old and misleading instructions for minikube - packaging: fix indents in build-kernel.sh - kernel: adding kmod to do docker env - versions: Update the rust toolchain to 1.66.0 - kata-ctl: skip test if access GitHub.com fail - agent: unset `CC` for cross-build - runtime-rs: enable hugepage - runtime-rs: Clean up mount points shared to guest - kata-ctl: fix checkcpu bug in non-x86 arches `d144ded12` release: Adapt kata-deploy for 3.1.0-rc0 `8e3863cec` kata-deploy: Install protobuf-compiler explicitly in shim-v2 Dockerfile `c45391991` runtime: tracing: Fix missing ctx return `4139d68d5` runtime-rs: Include target install in conditional branch `ca02c9f51` runtime: add reconnect timeout for vhost user block `2f5bc0f40` kata-ctl: Expand unit tests for CPU check `67b8f0773` SEV: Update ReducedPhysBits `bdf20b5d2` rootfs: support EROFS filesystem `fff0e50a7` versions: Update runc version `ed02c8a05` docs: add guide for building rootfs with EROFS `01765e173` runtime: support cgroup v2 metrics marshal guest metrics `49326fe4e` fix(clippy): fix hypervisor clippy checks `94b1d9814` cargo: Update Cargo.lock files `f1855594a` make: Get rid of verbose output while creating tar `c3836010a` make: clean up obsolete targets `ac64b021a` clh: Enforce API timeout only for vm.boot request `56071c6e7` virtiofsd: change cache mod to const `5d37d31ac` cgroups: upgrade cgroupfs to 0.3.1 `ab59a65c9` runtime-rs: neglect a certain error when delete cgroup `390916b33` runtime: remove not used shim configurations `9794c52c6` improvement: Fix naming conventions for span name and log 
subsystem `f49b89b63` CI: Set docker version to v20.10 in ubuntu:20.04 for s390x\|ppc64le `3c24e2340` README: Update Readme under packaging/kernel `d73f3a8a2` github-action: Add step to verify kernel config version id updated `59f104c02` runtime: skip unit test that fail regularly on aarch64 `b7dd97cac` kata-ctl: fix permission deny issue in test_add_remove `57c5e5629` Dragonball: add cpu resize ability `3c48f2202` runtime: Improve documentation of appendFDs `856ab6687` virtiofsd: fix the build on ppc64le `f83115a83` docs: Fix missing critical steps in how-to-hotplug-memory-arm64.md `e071d9251` Typo: change tabs in comment to spaces `56f0a27fe` kernel: Add console kernel config for s390 `334c4b8bd` runtime: Drop QEMU log file support `3a63e3c1f` cni: Update cni plugins version to 1.2.0 `510798155` dragonball: Improve test cases `dc90c6e30` dragonball: add more unit test for vm `c07135535` runtime-rs: Improve s390x error message `4e2db96ef` runtime-rs: Don't try to build on Power `8e8c720d5` kata-deploy-push: Ensure we build Dragonball specific kernel `1e531b44d` runtime:fix stat uds path `9092c23a2` runtime: Add hmp for qemu `b7f4e96ff` kata-deploy-test: Ensure we build dragonball specific kernel `063dec37c` release: Add the dragonball-experimental kernel `0b3c91d2a` kata-deploy: Add kernel-dragonball-experimental target `00dcd900f` docs: Add documentation for building agent with seccomp support. 
`2b779cba0` docs: Update url link in QAT documentation `39fe4a4b6` runtime: Collect QEMU's stderr `a5319c6be` runtime: Start QEMU undaemonized `bf4e3a618` runtime: Launch QEMU with cmd.Start() `8a1723a5c` runtime: Pre-establish the QMP connection `8a4f08cb0` govmm: Optionally pass QMP listener to QEMU `219bb8e7d` govmm: Optionally start QMP with a pre-configured connection `a85d0e465` versions: update cni plugins version `676d02850` versions: Bump QEMU to v7.2.0 `861c38b6a` versions: Upgrade to Cloud Hypervisor v29.0 `ba87e0afe` runtime: Use consts in `kata-runtime check` `9f490d16f` upcall: add document for upcall `596037e20` versions: Update conmon version `095e8fdef` runk: Use the original Kill command instead of the customed it. `0f9e23a3d` runk: Upgrade liboci-cli to v0.0.4 `69fc8de71` runtime:all APIs are hang in the service.mu `8d4c2cf1b` kata-ctl: Allow certain constants to go unused `64c11a66f` kata-ctl: Have function to get cpu details to run on specific arch `923cd3fda` virtcontainers: split out Linux parts from mount `cf1bae352` runtime: paas enablevhostuserstore annotation to hypervisor config `1592a385e` dependency: update cgroups-rs `60ff230d8` virtcontainers: Split the factory package into Linux and Darwin bits `76437a972` runtime: Use git rev-parse for the kata-monitor tag `a9626682a` virtcontainers: resourcecontrol: Add skeleton for Darwin `ea06fe3af` virtcontainers: Add a Network API skeleton for Darwin `6ee550e9a` runtime: vCPUs pinning is sandbox specific, not hypervisor `6199b6917` runtime-rs: change cache mode `a33a22ccd` runtime-rs: add missing config section for share-fs `e3d3b72fa` virtcontainers: use resource control for setting CPU affinity `f137048be` resource-control: add helper function for setting CPU affinity `73216a810` vendor: revendor netlink to get latest `fc17d7cc4` virtcontainers: Fix misspelling in error message `12fd6ffc1` runtime: fix up disable_netns handling `64c9114a3` tools: add --locked option for cargo install 
`7eb43cec1` runtime: add test generated file to .gitignore `8551853cf` runtime: use system pagesize for hugepage test `86a82cace` runtime: change cache mode from none to never `82c59efd6` runtime-rs: change cache mode from none to never `7b309b578` kata-types: change cache mode from none to never `fee4e7c7c` docs: change cache mode from none to never `594b57d08` utils: Add utility functions to get cpu and distro details. `d33e34361` check: Move PROC_CPUINFO from architecture specific files `f8a93a1de` tools: Fix indentation for setup aks script `03de5f41b` kata-ctl: remove get_kata_version_by_url function `464d4c94d` runtime-rs: process single_container `5f9c892e4` kata-types: add single_container support `fa9ae9362` virtcontainers: Add a Virtualization.framework skeleton `d48b22bb1` virtcontainers: fs_share: add Darwin skeleton `fafc7a8b1` virtcontainers: tests: Ensure Linux specific tests are just run on Linux `efa4fc0b2` clh: Add hotplug support for network devices `1074d2c1d` clh: Make vmAddNetPutRequest capable of doing hotplugs `9ec8a1398` virtcontainers: introduce hypervisor_darwin `8bb68a9f2` vc/network: skip existing endpoints when scanning for new ones `c21a8d5ff` kata-ctl: fix build error on s390x `3b4420eb8` runtime: Define Darwin handled signals list `24b05a99b` schedcore: Make buildable on !linux `3886aad19` nydus: net-ns handling needs to be only executed on Linux hosts `e256903af` runtime-rs: cleanup the run dir of hypervisor when shut down `937a41346` kata-ctl: add unit tests for volume ops `8451db7c0` kata-ctl: direct-volume: add Add and Remove handlers `2d4b2cf72` runtime-rs: add POST method to shim-client `cae78a685` kata-ctl: add constants for direct-volume commands `652021ad9` versions: Upgrade to Cloud Hypervisor v28.1 `d08538912` vc: fix up UT for CreateSandbox API change `578a9c25f` vc: rescan network endpoints after running prestart hooks `cb84b0fb0` katautils: run prestart hooks after starting VM `079462d2e` runk: Fix needless_borrow 
warning `2c24fcf34` runtime-rs: Fix clippy::bool-to-int-with-if warnings `025e78341` runtime-rs: Fix needless_borrow warnings `4fb163d57` runtime-rs: Allow clippy:box_default warnings `20121fcda` runtime-rs: Fix unnecessary_cast warnings `b95364a14` dragonball: Allow question_mark warning in allocate_device_resources() `0b2f060bf` dragonball: Fix unnecessary_cast warnings `a545a6593` agent: Allow clippy::question_mark warning in Namespace{} `9ced34dd2` agent: Fix explicit_auto_deref warnings `f77220490` agent: Fix needless_borrow warnings `7bcdc9049` rustjail: Fix unnecessary_cast warnings `41d7dbaae` rustjail: Fix needless_borrow warnings `2a73e057d` kata-types: Fix unnecessary_cast warnings `cf9ef1833` kata-types: Fix needless_borrow warnings `126187e81` safe-path: Fix needless_borrow warnings `bb78d35db` kata-sys-util: Fix "match-like-matches-macro" warning `668e65240` kata-sys-util: Fix unnecessary_cast warnings `c1a8d89a7` kata-sys-util: Fix needless_borrow warnings `c9c38e6d0` logging: Allow clippy::type-complexity warning `ffd6fbb6b` logging: Fix needless_borrow warnings `60df30015` protocols: Fix unnecessary_cast warnings `56e7b5d0f` runtime/Makefile: Get some bits happy on darwin `0bbeb34b4` protocols: Fix needless_borrow warnings `dfea6c7d2` versions: Update the rust toolchain to 1.66.0 `86ee24b33` Runtime: Clarify mutability of global var `dae667062` kata-runtime: add rust runtime path for kata-runtime exec `a2e3715e0` upcall: remove upcall client when stopping vm `31591d791` dragonball: fix unit test failure case about Kvm. 
`2b02e0a9b` dragonball: add more unit test for vcpu manager `85f9094f1` agent: refactor guest hooks `360506225` runtime-rs: add dbs-upcall feature `03a0c9d78` kata-ctl: skip test if access GitHub.com fail `1dcbda3f0` kata-ctl: update Cargo.lock `b4b5d8150` docs: remove old and misleading instructions for minikube `0fe24e08b` packaging: fix indents in build-kernel.sh `3480780bd` kata-ctl: add check framework support for non-x86 `1bd533f10` kata-ctl: let check framework arch-agnostic `fd77eebd4` runtime-rs: fix the issues mentioned in the code review `0e6920790` runtime-rs: Clean up mount points shared to guest `ecb28e2b1` kernel: adding kmod to do docker env `087515a46` agent: unset `CC` for cross-build `bf8848f92` agent: Eliminate unnecessary metrics `f8a48ab41` docs: add hint of probing loop module `afaf17f42` runtime-rs: enable container hugepage `fc4a67eec` runtime-rs: enable vm hugepage Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-02-14 15:47:44 -08:00
Archana Shinde	d144ded12c	release: Adapt kata-deploy for 3.1.0-rc0 kata-deploy files must be adapted to a new release. The cases where it happens are when the release goes from -> to: * main -> stable: * kata-deploy-stable / kata-cleanup-stable: are removed * stable -> stable: * kata-deploy / kata-cleanup: bump the release to the new one. There are no changes when doing an alpha release, as the files on the "main" branch always point to the "latest" and "stable" tags. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-02-14 15:47:44 -08:00
Fabiano Fidêncio	0d2a7f8324	Merge pull request #6273 from BbolroC/fix-protobuf-s390x-ppc64le kata-deploy: Install protobuf-compiler explicitly in shim-v2 Dockerfile	2023-02-14 22:25:20 +01:00
James O. D. Hunt	bbc733d6c8	docs: runtime-rs: Add CH status details Add a few details about the current state of the Cloud Hypervisor (CH) runtime-rs external hypervisor implementation with pointers to the appropriate issues. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-02-14 15:38:46 +00:00
James O. D. Hunt	37b594c0d2	runtime-rs: Add basic CH implementation Add a basic runtime-rs `Hypervisor` trait implementation for Cloud Hypervisor (CH). > Notes: > > - This only supports a default Kata configuration for CH currently. > > - Since this feature is still under development, `cargo` features have > been added to enable the feature optionally. The default is to not enable > currently since the code is not ready for general use. > > To enable the feature for testing and development, enable the > `cloud-hypervisor` feature in the `virt_container` crate and enable the > `cloud-hypervisor` feature for its `hypervisor` dependency. Fixes: #5242. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-02-14 15:38:39 +00:00
James O. D. Hunt	5f6d747e6d	Merge pull request #6272 from cmaf/tracing-clh-returnctx-startVM runtime: tracing: Fix missing ctx return	2023-02-14 08:17:45 +00:00
Bin Liu	e812c5ce66	Merge pull request #6076 from zhaojizhuang/reconnect runtime: add reconnect timeout for vhost user block	2023-02-14 10:39:20 +08:00
Archana Shinde	7b4e5751ca	Merge pull request #5007 from larrydewey/update-rpb-main SEV: Update ReducedPhysBits	2023-02-13 14:56:38 -08:00
Hyounggyu Choi	87d197ef20	Merge pull request #6143 from fidencio/topic/only-build-runtime-rs-for-x86_64-and-arm shim-v2/build.sh: Only build runtime-rs for the supported arches	2023-02-13 23:43:10 +01:00
Hyounggyu Choi	8e3863cecb	kata-deploy: Install protobuf-compiler explicitly in shim-v2 Dockerfile This is to install a missing binary protoc in shim-v2 Dockerfile. Fixes: #6244 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com> (cherry picked from commit `10603e3def`)	2023-02-13 22:29:19 +01:00
Chelsea Mafrica	c453919911	runtime: tracing: Fix missing ctx return Normally we return the context when creating a trace span so that the ordering of spans w.r.t. calls is maintained in tracing output. Add missing context for StartVM() for Cloud Hypervisor. Fixes #6271 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-02-13 12:37:52 -08:00
Chelsea Mafrica	036d3a4088	Merge pull request #5920 from cmaf/kata-ctl-check-cpu-unit-tests-1 kata-ctl: Expand unit tests for CPU check	2023-02-13 12:21:58 -08:00
Hyounggyu Choi	4139d68d51	runtime-rs: Include target install in conditional branch A Makefile target `install` should be included in the conditional branch as default and test. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-02-13 21:13:32 +01:00
James O. D. Hunt	545151829d	kata-types: Add Cloud Hypervisor (CH) definitions Implement `ConfigPlugin` trait for Cloud Hypervisor (CH). Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-02-13 10:25:29 +00:00
zhaojizhuang	ca02c9f512	runtime: add reconnect timeout for vhost user block Fixes: #6075 Signed-off-by: zhaojizhuang <571130360@qq.com>	2023-02-13 14:33:46 +08:00
Zhongtao Hu	2dd2421ad0	runtime-rs: cleanup kata host share path cleanup the /run/kata-containers/shared/sandboxes/pid path Fixes:#5975 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-02-13 13:07:07 +08:00
Bin Liu	95602c8c08	Merge pull request #5999 from yaoyinnan/5998/feat/cgroup-metrics runtime: support cgroup v2 metrics marshal guest metrics	2023-02-11 19:26:24 +08:00
Bin Liu	8a9392fd9d	Merge pull request #6188 from yahaa/Typo-fix Typo: change tabs in comment to spaces	2023-02-11 11:19:11 +08:00
Bin Liu	ecbd94d80c	Merge pull request #6064 from yaoyinnan/6063/feat/rootfs-erofs rootfs: support EROFS filesystem	2023-02-11 11:10:23 +08:00
Chelsea Mafrica	2f5bc0f408	kata-ctl: Expand unit tests for CPU check Change unit tests for CPU check to table-driven tests and expand test cases including temp files for cpuinfo. Fixes #5919 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-02-10 14:18:44 -08:00
Larry Dewey	67b8f0773f	SEV: Update ReducedPhysBits Updating this field, as `cpuid` provides host level data, which is not what a guest would expect for Reduced Physical Bits. In almost all cases, we should be using `1` for the value here. Amend: Adding unit test change. Fixes: #5006 Signed-off-by: Larry Dewey <larry.dewey@amd.com>	2023-02-10 13:19:33 -06:00
yaoyinnan	bdf20b5d26	rootfs: support EROFS filesystem For kata containers, rootfs is used in the read-only way. EROFS can noticeably decrease metadata overhead. On the basis of supporting the EROFS file system, it supports using the config parameter to switch the file system used by rootfs. Fixes: #6063 Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>	2023-02-11 00:44:13 +08:00
GabyCT	bd1e8a2a24	Merge pull request #6252 from GabyCT/topic/upruncversion versions: Update runc version	2023-02-10 08:46:26 -06:00
GabyCT	86501d5f6f	Merge pull request #6200 from gkurz/improve-appendFDs-doc runtime: Improve documentation of appendFDs	2023-02-09 15:50:37 -06:00
Gabriela Cervantes	fff0e50a73	versions: Update runc version This PR updates the runc version. This new version include changes in: - Fix mounting via wrong proc fd. When the user and mount namespaces are used, and the bind mount is followed by the cgroup mount in the spec, the cgroup was mounted using the bind mount's mount fd. - Switch kill() in libcontainer/nsenter to sane_kill(). - Fix "permission denied" error from runc run on noexec fs. - Fix failed exec after systemctl daemon-reload. Due to a regression in v1.1.3, the DeviceAllow=char-pts rwm rule was no longer added and was causing an error open /dev/pts/0: operation not permitted: unknown when systemd was reloaded. Fixes #6251 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-02-09 21:16:41 +00:00
Archana Shinde	b67a1da187	Merge pull request #6166 from amshinde/make-cleanup Minor cleanups in make file	2023-02-09 11:24:48 -08:00
yaoyinnan	ed02c8a051	docs: add guide for building rootfs with EROFS Add guide for building rootfs with EROFS. Fixes: #6063 Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>	2023-02-09 20:07:51 +08:00
yaoyinnan	01765e1734	runtime: support cgroup v2 metrics marshal guest metrics Support to use cgroup v2 metrics marshal guest metrics. Fixes: #5998 Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>	2023-02-09 19:14:09 +08:00
yaoyinnan	49326fe4e1	fix(clippy): fix hypervisor clippy checks Fix hypervisor clippy checks. Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>	2023-02-09 14:32:27 +08:00
Jianyong Wu	6f86fb8e27	Merge pull request #6183 from singhwang/main main \| docs: Fix missing critical steps in how-to-hotplug-memory-arm64.md	2023-02-09 09:26:11 +08:00
Archana Shinde	94b1d9814c	cargo: Update Cargo.lock files The cargo.locks file under src/libs and agent-ctl seem to be outdated. Updating these. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-02-08 13:50:54 -08:00
Archana Shinde	f1855594a2	make: Get rid of verbose output while creating tar We already have verbose output while merging the builds from various build targets. Getting rid of verbose output to speed up. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-02-08 13:41:41 -08:00
Archana Shinde	c3836010a8	make: clean up obsolete targets Cleanup targets that have been removed in the past when the makefile for kata-deploy was included. Instead, add targets from the makefile under local-build kata-deploy. Fixes: #6165 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-02-08 13:41:40 -08:00
Archana Shinde	a482b0d410	Merge pull request #6209 from amshinde/action-check-kernel-config-version Action check kernel config version	2023-02-08 10:34:54 -08:00
Bin Liu	407d3146e6	Merge pull request #6234 from UiPath/fix-clh-timeout clh: Enforce API timeout only for vm.boot request	2023-02-08 21:33:56 +08:00
Tim Zhang	d4f8f3a779	Merge pull request #6152 from liubin/fix/6151-refactor-cache-mod-const virtiofsd: change cache mod to const	2023-02-08 17:53:57 +08:00
Alexandru Matei	ac64b021a6	clh: Enforce API timeout only for vm.boot request launchClh already has a timeout of 10 seconds for launching clh. If launchClh or setupVirtiofsDaemon takes a few seconds, the context's deadline will already have expired by the time it reaches bootVM. Fixes #6240 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2023-02-08 11:14:51 +02:00
Bin Liu	56071c6e7b	virtiofsd: change cache mod to const Change cache mod from literal to const and place them in one place. Also set default cache mode from `none` to `never` in `pkg/katautils/config-settings.go.in`. Fixes: #6151 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-02-08 15:06:52 +08:00
Zhongtao Hu	2752225360	Merge pull request #6193 from jongwu/cgroup_del_err runtime-rs: ignor "no such process" error when delete cgroup for a thread to let it go	2023-02-08 10:30:12 +08:00
Bin Liu	93b3d0a28e	Merge pull request #6163 from BbolroC/kernel-config-s390 kernel: Add console kernel config for s390	2023-02-08 10:02:38 +08:00
Bin Liu	71a3b73cb0	Merge pull request #6223 from d3c3mber/rm-unused-shim-config runtime: remove not used shim configurations	2023-02-08 10:00:52 +08:00
Jeremi Piotrowski	0a21ad78b1	osbuilder: fix default build target in makefile The .dracut_rootfs.done file is accidentally being picked up as the default target, regardless of BUILD_METHOD. Move the 'all' target definition up, so that it's the default (=first) target in the makefile. Additionally make the .dracut_rootfs.done target conditional on the right BUILD_METHOD being selected, as building it doesn't make sense with BUILD_METHOD=distro. Fixes: #6235 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-02-07 18:36:03 +01:00
Jianyong Wu	5d37d31ac7	cgroups: upgrade cgroupfs to 0.3.1 Trait method cause for std::error::Error is deprecated, thus we need to replace it with the source method for cgroups-fs::error::ErrorKind. Fixes: #6192 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-02-07 18:09:31 +08:00
Jianyong Wu	ab59a65c92	runtime-rs: neglect a certain error when delete cgroup Deleting the cgroup of a thread that may already have exited can lead to a panic. Neglecting that error is harmless and avoids this failure. Fixes: #6192 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-02-07 18:09:31 +08:00
wllenyj	9a01d4e446	dragonball: add more unit test for virtio-blk device. Added more unit tests for virtio-blk device. Fixes: #4899 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2023-02-07 17:16:11 +08:00
d3c3mber	390916b33c	runtime: remove not used shim configurations ShimPath and ShimDebug are not needed anymore. Fixes: #6147 Signed-off-by: d3c3mber <tangbo_gl_2022@163.com>	2023-02-07 14:06:12 +08:00
Bin Liu	8ae14f6a55	Merge pull request #6208 from joannejchen/fix-naming-conventions improvement: Fix naming conventions for span name and log subsystem	2023-02-07 13:43:37 +08:00
joannejchen	9794c52c65	improvement: Fix naming conventions for span name and log subsystem Normally, the span name should be the same as the function name, and the log subsystem should not contain spaces. Fixes #6153 Signed-off-by: joannejchen <chenjjoanne@gmail.com>	2023-02-06 08:25:49 -06:00
Bin Liu	df93439c3b	Merge pull request #6009 from openanolis/dragonball/add_cpu_resize Dragonball: add cpu resize ability	2023-02-05 19:54:08 +08:00
Archana Shinde	d3bb254188	utils: Add function to check vhost-vsock Add function to check if the host-system has the vhost-vsock kernel module. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-02-03 15:41:59 -08:00
GabyCT	7fc35f19eb	Merge pull request #6056 from jongwu/perm_deny arm64/CI: fix unit test failure on arm64	2023-02-03 10:53:38 -06:00
Greg Kurz	1660d5651f	Merge pull request #6212 from BbolroC/fix-docker-buildx-s390x CI: Make docker version stick to v20.10 in ubuntu:20.04 for s390x\|ppc64le	2023-02-03 17:05:55 +01:00
Hyounggyu Choi	f49b89b632	CI: Set docker version to v20.10 in ubuntu:20.04 for s390x\|ppc64le This is to make a docker version to v20.10 in docker upstream image ubuntu:20.04 for s390x and ppc64le. Fixes: #6211 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-02-03 14:21:23 +01:00
Archana Shinde	3c24e23409	README: Update Readme under packaging/kernel Update Readme to instruct users to increment the kata config version for any changes made to configs or patches under packaging/kernel. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-02-02 22:43:24 -08:00
Archana Shinde	d73f3a8a26	github-action: Add step to verify kernel config version id updated The version mentioned in the `kata_config_version` needs to be updated for any kernel config change or change to the patches applied. Without this, CI would not test with the latest kernel changes. We used to enforce this earlier as part of CI when `packaging` was a standalone repo. Add back this check as part of a github action so that the check is performed early on instead of a CI job. Fixes: #6210 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-02-02 22:42:54 -08:00
Jianyong Wu	59f104c022	runtime: skip unit test that fail regularly on aarch64 Many unit test cases fail regularly on aarch64, including TestIOCopy and create_tmpfs. Temporarily skip them for now and re-enable them once they are fixed. Fixes: #6194 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-02-03 11:34:39 +08:00
Jianyong Wu	b7dd97cac6	kata-ctl: fix permission deny issue in test_add_remove test_add_remove and test_get_sandbox_id_for_volume need the root user, but test_drop_privs can temporarily change the user to "nobody", which can make these tests fail. Serialising these three tests fixes it. Fixes: #6055 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-02-03 11:34:39 +08:00
GabyCT	968f5b4031	Merge pull request #6140 from Amulyam24/rust-vitiofsd virtiofsd: fix the build on ppc64le	2023-02-02 14:30:26 -06:00
Chao Wu	57c5e5629b	Dragonball: add cpu resize ability Add cpu resize ability upon upcall communication channel. Runtime could use ResizeVcpu VmmAction and pass the desired vCPU number to the Dragonball hypervisor. Dragonball will trigger the device manager service in guest kernel's upcall server to do cpu resize. Fixes: #6008 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-02-03 00:26:33 +08:00
Greg Kurz	3c48f2202c	runtime: Improve documentation of appendFDs The cmd.ExtraFiles feature that is used to implement appendFDs takes an array of arbitrary file descriptors and internally renumbers them to be consecutive starting from 3, using dup2(). This isn't especially obvious: document it for the sake of clarity. Fixes #6199 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-02-02 12:52:10 +01:00
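The renumbering described in the commit above can be observed with a small Go program. This is a minimal sketch, not code from the repository: `readViaFD3` is a hypothetical helper, and it assumes a POSIX `sh` is available. Whatever descriptor number the pipe has in the parent, entry `i` of `cmd.ExtraFiles` shows up in the child as fd `3+i`.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"os/exec"
)

// readViaFD3 (hypothetical helper) runs a shell script with the write end
// of a pipe as the child's only extra file. The child sees that file as
// fd 3, regardless of its descriptor number in the parent.
func readViaFD3(script string) (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()

	cmd := exec.Command("sh", "-c", script)
	cmd.ExtraFiles = []*os.File{w} // ExtraFiles[0] becomes fd 3 in the child
	if err := cmd.Start(); err != nil {
		w.Close()
		return "", err
	}
	// Close the parent's copy so that reading ends when the child exits.
	w.Close()

	out, readErr := io.ReadAll(r)
	waitErr := cmd.Wait() // always reap the child
	if readErr != nil {
		return "", readErr
	}
	return string(out), waitErr
}

func main() {
	out, err := readViaFD3("echo hello >&3")
	fmt.Printf("%q %v\n", out, err)
}
```

Because the numbering depends only on the position in the slice, a caller that needs to tell the child which descriptor to use can compute it as `3 + index` before starting the process.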
Amulyam24	856ab66871	virtiofsd: fix the build on ppc64le link-self-contained is not supported on ppc64le rust target. Hence, do not pass it while building virtiofsd. Fixes: #6195 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-02-02 13:59:12 +05:30
SinghWang	f83115a838	docs: Fix missing critical steps in how-to-hotplug-memory-arm64.md The key steps in how-to-hotplug-memory-arm64.md are missing, resulting in the kata qemu pod not being created successfully. Fixes: #6105 Signed-off-by: SinghWang <wangxin_0611@126.com>	2023-02-02 12:12:39 +08:00
yahaa	e071d9251f	Typo: change tabs in comment to spaces Fixes: #6150 Signed-off-by: yahaa <1477765176@qq.com>	2023-02-02 12:08:33 +08:00
Peng Tao	a34f36f8f4	Merge pull request #6149 from openanolis/fix_kata_runtime runtime:fix stat uds path	2023-02-02 11:00:07 +08:00
GabyCT	d6945200cc	Merge pull request #6170 from amshinde/update-cni-version cni: Update cni plugins version to 1.2.0	2023-02-01 09:18:14 -06:00
Chao Wu	c282a1c709	Merge pull request #5616 from wllenyj/dragonball-ut-5 Built-in Sandbox: add more unit tests for dragonball. Part 5	2023-01-31 21:12:05 +08:00
Peng Tao	09d416fe43	Merge pull request #6174 from gkurz/remove-qemu-log-file runtime: Drop QEMU log file support	2023-01-31 17:56:04 +08:00
Hyounggyu Choi	56f0a27fef	kernel: Add console kernel config for s390 This config is to update console kernel config for s390. Fixes: #6162 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-01-31 10:44:07 +01:00
Greg Kurz	334c4b8bdc	runtime: Drop QEMU log file support The QEMU log file is essentially about fine grain tracing of QEMU internals and mostly useful for developers, not production. Notably, the log file isn't limited in size, nor rotated in any way. It means that a container running in the VM could possibly flood the log file with a guest triggerable trace. For example, on openshift, the log file is supposed to reside on a per-VM 14 GiB tmpfs mount. This means that each pod running with the kata runtime could potentially consume this amount of host RAM which is not acceptable. Error messages are best collected from QEMU's stderr as kata is doing now since PR #5736 was merged. Drop support for the QEMU log file because it doesn't bring any value but can certainly do harm. Fixes #6173 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-01-31 09:20:29 +01:00
Archana Shinde	3a63e3c1f7	cni: Update cni plugins version to 1.2.0 A new release was made for the cni plugins. Use the new version for the CI. Fixes: #6169 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-01-30 22:33:34 -08:00
Chelsea Mafrica	1648b85e2d	Merge pull request #6137 from amshinde/agent-seccomp-doc docs: Add documentation for building agent with seccomp support.	2023-01-30 19:08:15 -08:00
wllenyj	510798155d	dragonball: Improve test cases The same EpollManager should be used instead of creating two. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2023-01-31 10:51:51 +08:00
wllenyj	dc90c6e30b	dragonball: add more unit test for vm Added more unit tests for vm module. Fixes: #4899 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2023-01-31 10:51:51 +08:00
Fabiano Fidêncio	c071355359	runtime-rs: Improve s390x error message Nothing much to add, let's just make the message more clear. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-30 20:32:07 +01:00
Fabiano Fidêncio	4e2db96ef7	runtime-rs: Don't try to build on Power As done for s390x, let's just skip the runtime-rs build for Power. Fixes: #6142 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-30 20:32:07 +01:00
Bin Liu	b29cbbfd2c	Merge pull request #6141 from fidencio/topic/upcall-follow-up Add kernel-dragonball-experimental to kata-deploy, kata-deploy-test, and the release	2023-01-30 19:48:18 +08:00
Fabiano Fidêncio	8e8c720d51	kata-deploy-push: Ensure we build Dragonball specific kernel As the dragonball specific kernel is now part of the release, let's make sure we build it as part of the kata-deploy-push action. Fixes: #5859 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-30 09:40:34 +01:00
Zhongtao Hu	c1dd9b9777	Merge pull request #6023 from openanolis/missing_config runtime-rs: add missing config section for share-fs	2023-01-30 15:45:22 +08:00
Bin Liu	653e00dff8	Merge pull request #6146 from zhaojizhuang/add-hmp runtime: Add hmp for qemu	2023-01-30 15:43:53 +08:00
Peng Tao	de45f62096	Merge pull request #6081 from openanolis/chao/update_upcall_doc upcall: add document for upcall	2023-01-30 12:03:11 +08:00
Zhongtao Hu	1e531b44dc	runtime:fix stat uds path os.Stat("unix:///run/vc/sbs/sid/shim-monitor.sock") will fail, should be os.Stat("/run/vc/sbs/sid/shim-monitor.sock") Fixes:#6148 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-01-29 15:08:13 +08:00
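The fix described in the commit above comes down to removing the URL scheme before handing the address to a filesystem call. A minimal sketch follows; `socketPath` is a hypothetical helper name, not the function used in the runtime.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// socketPath (hypothetical helper) strips an optional "unix://" scheme so
// that the result can be passed to filesystem calls such as os.Stat.
func socketPath(addr string) string {
	return strings.TrimPrefix(addr, "unix://")
}

func main() {
	addr := "unix:///run/vc/sbs/sid/shim-monitor.sock"

	// Stat on the raw address looks for a file literally named "unix:///run/...".
	_, rawErr := os.Stat(addr)
	// Stat on the stripped path checks the actual socket location.
	_, pathErr := os.Stat(socketPath(addr))

	fmt.Println("raw address:", rawErr)
	fmt.Println("stripped path:", socketPath(addr), pathErr)
}
```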
zhaojizhuang	9092c23a2e	runtime: Add hmp for qemu Fixes: #6092 Signed-off-by: zhaojizhuang <571130360@qq.com>	2023-01-29 14:22:04 +08:00
Fabiano Fidêncio	b7f4e96ff3	kata-deploy-test: Ensure we build dragonball specific kernel As the dragonball specific kernel is now part of the release, let's make sure we build it as part of the kata-deploy-test action. Fixes: #5859 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-28 10:55:39 +01:00
Fabiano Fidêncio	063dec37c2	release: Add the dragonball-experimental kernel Let's add the dragonball specific kernel, which takes advantage of upcall, as part of the release tarball, so it can be used from the release tarball / kata-deploy. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-28 10:55:39 +01:00
Fabiano Fidêncio	0b3c91d2a2	kata-deploy: Add kernel-dragonball-experimental target As Chao Wu added the support for building the dragonball kernel as a new experimental kernel, let's make sure we reflect that as part of the kata-deploy build scripts. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-28 10:55:39 +01:00
Greg Kurz	af125b1498	Merge pull request #5736 from gkurz/no-qemu-daemonize runtime: Start QEMU undaemonized and get logs	2023-01-27 16:33:48 +01:00
Archana Shinde	00dcd900f9	docs: Add documentation for building agent with seccomp support. The default for the agent today is building with seccomp support. However, additional steps need to be taken for building against musl such as installing the static seccomp library for musl. Add documentation to explain this. Fixes #6136 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-01-26 10:58:38 -08:00
Archana Shinde	461b32491f	Merge pull request #6131 from GabyCT/topic/updateqatdoc docs: Update url link in QAT documentation	2023-01-25 17:07:54 -08:00
Gabriela Cervantes	2b779cba00	docs: Update url link in QAT documentation This PR updates the url link in QAT documentation. Fixes #6130 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-01-25 15:27:29 +00:00
Fabiano Fidêncio	392c87550f	Merge pull request #6111 from littlejawa/bump_cni_plugins_to_120 versions: update cni plugins version	2023-01-25 12:40:55 +01:00
Greg Kurz	39fe4a4b6f	runtime: Collect QEMU's stderr LaunchQemu now connects a pipe to QEMU's stderr and makes it usable by callers through a Go io.ReadCloser object. As explained in [0], all messages should be read from the pipe before calling cmd.Wait : introduce a LogAndWait helper to handle that. Fixes #5780 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-01-24 23:09:17 +01:00
Greg Kurz	a5319c6be6	runtime: Start QEMU undaemonized QEMU has always been started daemonized since the beginning. I could not find any justification for that though, but it certainly introduces a problem : QEMU stops logging errors when started this way, which isn't acceptable from a support standpoint. The QEMU community discourages the use of -daemonize ; mostly because libvirt, QEMU's primary consumer, doesn't use this option and prefers getting errors from QEMU's stderr through a pipe in order to enforce rollover. Now that virtcontainers knows how to start QEMU with a pre-established QMP connection, let's start QEMU without -daemonize. This requires to handle the reaping of QEMU when it terminates. Since cmd.Wait() is blocking, call it from a goroutine. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-01-24 23:09:11 +01:00
Greg Kurz	bf4e3a618f	runtime: Launch QEMU with cmd.Start() LaunchCustomQemu() currently starts QEMU with cmd.Run() which is supposed to block until the child process terminates. This assumes that QEMU daemonizes itself, otherwise LaunchCustomQemu() would block forever. The virtcontainers package indeed enables the Daemonize knob in the configuration but having such an implicit dependency on a supposedly configurable setting is ugly and fragile. cmd.Run() is : func (c *Cmd) Run() error { if err := c.Start(); err != nil { return err } return c.Wait() } Let's open-code this : govmm calls cmd.Start() and returns the cmd to virtcontainers which calls cmd.Wait(). If QEMU doesn't start, e.g. missing binary, there won't be any errors to collect from QEMU output. Just drop these lines in govmm. Similarly there won't be any log file to read from in virtcontainers. Drop that as well. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-01-24 23:09:11 +01:00
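The split between `cmd.Start()` in the launcher and `cmd.Wait()` in the caller, with the wait done from a goroutine, can be sketched like this. It is a minimal illustration under stated assumptions: `launch` is a hypothetical function, not the govmm API, and `sleep` stands in for a long-running child such as a hypervisor.

```go
package main

import (
	"fmt"
	"os/exec"
)

// launch (hypothetical) starts the command without waiting for it and hands
// the *exec.Cmd back, so the caller decides when and where to reap the child.
func launch(name string, args ...string) (*exec.Cmd, error) {
	cmd := exec.Command(name, args...)
	if err := cmd.Start(); err != nil {
		// The binary could not be started: there is no child output to collect.
		return nil, err
	}
	return cmd, nil
}

func main() {
	cmd, err := launch("sleep", "1")
	if err != nil {
		fmt.Println("start failed:", err)
		return
	}

	// cmd.Wait() blocks until the child exits, so call it from a goroutine.
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()

	fmt.Println("child running, pid", cmd.Process.Pid)
	fmt.Println("child exited:", <-done)
}
```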
Greg Kurz	8a1723a5cb	runtime: Pre-establish the QMP connection Running QEMU daemonized ensures that the QMP socket is ready to accept connections when LaunchQemu() returns. In order to be able to run QEMU undaemonized, let's handle that part upfront. Create a listener socket and connect to it. Pass the listener to QEMU and pass the connected socket to QMP : this ensures that we cannot fail to establish QMP connection and that we can detect if QEMU exits before accepting the connection. This is basically what libvirt does. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-01-24 23:09:11 +01:00
Greg Kurz	8a4f08cb0f	govmm: Optionally pass QMP listener to QEMU QEMU's -qmp option can be passed the file descriptor of a socket that is already in listening mode. This is done by passing `fd=XXX` to `-qmp` instead of a path. Note that these two options are mutually exclusive : QEMU errors out if both are passed, so we check that as well in the validation function. While here add the `path=` stanza in the path based case for clarity. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-01-24 23:08:48 +01:00
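The mutual-exclusion check mentioned in the commit above could look roughly like the sketch below. The `qmpEndpoint` type and its `stanza` method are hypothetical names invented for this illustration and are not the govmm data structures. Only the `fd=` and `path=` fragments are taken from the commit message; the rest of the `-qmp` argument is deliberately left out.

```go
package main

import (
	"errors"
	"fmt"
)

// qmpEndpoint is a hypothetical stand-in for a QMP socket description:
// either a filesystem path or the number of an already-listening descriptor.
type qmpEndpoint struct {
	Path string // socket path, empty if unused
	FD   int    // inherited descriptor number, negative if unused
}

// stanza validates the endpoint and returns the "path=" or "fd=" fragment.
func (e qmpEndpoint) stanza() (string, error) {
	hasPath, hasFD := e.Path != "", e.FD >= 0
	switch {
	case hasPath && hasFD:
		return "", errors.New("path and fd are mutually exclusive")
	case hasFD:
		return fmt.Sprintf("fd=%d", e.FD), nil
	case hasPath:
		return "path=" + e.Path, nil
	default:
		return "", errors.New("neither path nor fd is set")
	}
}

func main() {
	for _, e := range []qmpEndpoint{
		{Path: "/run/qmp.sock", FD: -1},
		{FD: 3},
		{Path: "/run/qmp.sock", FD: 3},
	} {
		s, err := e.stanza()
		fmt.Printf("%q %v\n", s, err)
	}
}
```

Rejecting the ambiguous case before the process is started gives a clearer error than letting the hypervisor fail at startup.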
Greg Kurz	219bb8e7d0	govmm: Optionally start QMP with a pre-configured connection When QEMU is launched daemonized, we have the guarantee that the QMP socket is available. In order to launch a non-daemonized QEMU, the QMP connection should be created before QEMU is started in order to avoid a race. Introduce a variant of QMPStart() that can use such an existing connection. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-01-24 19:16:47 +01:00
Julien Ropé	a85d0e465c	versions: update cni plugins version Use cni plugins v1.2.0 to get latest fixes. Fixes: #6110 Signed-off-by: Julien Ropé <jrope@redhat.com>	2023-01-23 14:24:29 +01:00
Bo Chen	40c6904324	Merge pull request #6098 from likebreath/0117/clh_v29.0 versions: Upgrade to Cloud Hypervisor v29.0	2023-01-18 10:59:40 -08:00
GabyCT	421a33f846	Merge pull request #6096 from dcantah/kataruntime-use_hyp_consts runtime: Use consts in `kata-runtime check`	2023-01-18 10:54:42 -06:00
Fabiano Fidêncio	980a2c7794	Merge pull request #6103 from fidencio/topic/bump-qemu-to-7.2.0 versions: Bump QEMU to v7.2.0	2023-01-18 17:38:47 +01:00
Fabiano Fidêncio	676d028504	versions: Bump QEMU to v7.2.0 As QEMU released its v7.2.0 version in December last year, let's do the bump on our side. A few configuration options have been removed between the v6.2.0 (the version we currently use) and v7.2.0, so those have also been dropped from our configure-hypervisor.sh script (for this specific version). Also, we're explicitly setting --disable-virtiofsd for the platforms that we're testing using the rust version. See: `a8d6abe129/docs/about/deprecated.rst (virtiofsd)` Fixes: #6102 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-18 13:11:12 +01:00
Bin Liu	083facd5ae	Merge pull request #5256 from Yuan-Zhuo/fix-agent-metrics agent: Eliminate unnecessary metrics	2023-01-18 11:43:37 +08:00
Peng Tao	7d1a604bad	Merge pull request #6060 from ls-ggg/6055/service.mu-deadlock runtime:all APIs are hang in the service.mu	2023-01-18 10:50:00 +08:00
Chelsea Mafrica	fa1f08f5da	Merge pull request #5812 from amshinde/kata-ctl-env-util Utility functions for kata-env	2023-01-17 18:45:54 -08:00
Bo Chen	861c38b6aa	versions: Upgrade to Cloud Hypervisor v29.0 Details of this release can be found in our new roadmap project as iteration v29.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #6097 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-01-17 15:45:23 -08:00
David Esparza	c8596a4065	Merge pull request #6085 from GabyCT/topic/uconmonversion versions: Update conmon version	2023-01-17 11:33:02 -06:00
Danny Canter	ba87e0afea	runtime: Use consts in `kata-runtime check` Fixes: #6095 We're already importing the virtcontainers package so might as well use the constants for the hypervisor types we're checking against instead of typing the names out in the switch cases. Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-17 06:55:36 -08:00
Chao Wu	9f490d16fe	upcall: add document for upcall In order for users to better understand the upcall features, we add this document to illustrate what upcall is and how to enable it. fixes: #6054 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-01-17 14:53:47 +08:00
Bin Liu	790f45190b	Merge pull request #6074 from zhaojizhuang/enablevhostuserstore runtime: paas enablevhostuserstore annotation to hypervisor config	2023-01-17 11:43:43 +08:00
Bin Liu	42efe013c1	Merge pull request #6078 from utam0k/libcli-0.4.0 runk: Upgrade liboci-cli to v0.0.4	2023-01-17 09:48:09 +08:00
Gabriela Cervantes	596037e20c	versions: Update conmon version This PR updates the conmon version that we are using in our versions.yaml Fixes #6084 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-01-16 22:20:53 +00:00
utam0k	095e8fdef4	runk: Use the original Kill command instead of the custom one. We can remove the custom kill command. Fixes: #6083 Signed-off-by: utam0k <k0ma@utam0k.jp>	2023-01-16 21:35:47 +09:00
utam0k	0f9e23a3d9	runk: Upgrade liboci-cli to v0.0.4 https://github.com/containers/youki/releases/tag/v0.0.4 Fixes: #6083 Signed-off-by: utam0k <k0ma@utam0k.jp>	2023-01-16 21:35:09 +09:00
Tim Zhang	20196048bf	Merge pull request #6030 from liubin/fix/6029-use-system-hugepagesize runtime: use system pagesize for hugepage test	2023-01-16 16:57:55 +08:00
Fupan Li	a1a7ed98df	Merge pull request #6040 from liubin/fix/6039-update-cgroup-rs dependency: update cgroups-rs	2023-01-16 16:51:41 +08:00
ls	69fc8de712	runtime:all APIs are hang in the service.mu When the vmm process exits abnormally, a goroutine sets s.monitor to null in the 'watchSandbox' function without acquiring service.mu. This causes another goroutine to block when sending a message to s.monitor while it holds service.mu, which leads to a deadlock. For example, the wait function in the file .../pkg/containerd-shim-v2/wait.go sends a message to s.monitor after obtaining service.mu, but s.monitor may be null at that time. Fixes: #6059 Signed-off-by: ls <335814617@qq.com>	2023-01-16 14:45:37 +08:00
Archana Shinde	8d4c2cf1b9	kata-ctl: Allow certain constants to go unused The generic constants for cpu vendor and model may be superseded by architecture specific constants. Allow these to be marked as dead code to ignore warnings on architectures where they are overridden. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-01-15 18:07:35 -08:00
Archana Shinde	64c11a66fd	kata-ctl: Have function to get cpu details to run on specific arch This function relies on the get_single_cpu function, which is configured to compile on amd64 and s390x. Make the function get_generic_cpu_details compile on these architectures until we resolve the compilation for functions defined in check.rs. This is a temporary solution until we clean up check.rs to make it build on all architectures. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-01-15 18:07:35 -08:00
Eric Ernst	807eeaafd0	Merge pull request #6047 from egernst/build-kata-monitor-on-darwin runtime: Use git rev-parse for the kata-monitor tag	2023-01-13 15:29:00 -08:00
Eric Ernst	3d573ba579	Merge pull request #6050 from egernst/goos-the-vc virtcontainers: split out linux-specific bits for mount, factory	2023-01-13 15:28:42 -08:00
Eric Ernst	458fe865ea	Merge pull request #6052 from egernst/add-darwin-skeletons Add darwin skeletons	2023-01-13 13:14:16 -08:00
Eric Ernst	923cd3fda1	virtcontainers: split out Linux parts from mount Mount handling is often unique in Linux. Let's ensure that the common parts remain in mount.go, while Linux specific parts are within a linux file. Fixes: #6049 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-13 11:14:56 -08:00
Eric Ernst	54f2b296e3	Merge pull request #6048 from egernst/revendor-netlink vendor: revendor netlink to get latest	2023-01-13 11:08:47 -08:00
Eric Ernst	f82918f872	Merge pull request #6045 from egernst/fix-6044 Address issues with the initial vCPU pinning functionality	2023-01-13 11:06:42 -08:00
GabyCT	9c6e90fd55	Merge pull request #6043 from GabyCT/topic/fixerrormsg virtcontainers: Fix misspelling in error message	2023-01-13 09:16:34 -06:00
zhaojizhuang	cf1bae3521	runtime: pass enablevhostuserstore annotation to hypervisor config Fixes: #6073 Signed-off-by: zhaojizhuang <571130360@qq.com>	2023-01-13 17:07:38 +08:00
Bin Liu	1592a385eb	dependency: update cgroups-rs Update cgroups-rs. Fixes: #6039 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-01-13 14:00:51 +08:00
Eric Ernst	60ff230d80	virtcontainers: Split the factory package into Linux and Darwin bits - split template - split factory - add stubs for darwin Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-12 16:51:28 -08:00
Samuel Ortiz	76437a9721	runtime: Use git rev-parse for the kata-monitor tag The .git-commit can be a multiple line file, potentially confusing the Darwin linker for example. Fixes: #6046 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com> Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-12 16:01:58 -08:00
Samuel Ortiz	a9626682af	virtcontainers: resourcecontrol: Add skeleton for Darwin Cgroups do not exist on Darwin, so use an empty implementation for resourcecontrol for the time being. In the process, ensure that the utilized cgroup handling (ie, isSystemdCgroup) is kept in a general file, since we use this to help assess/constrain the container spec we pass to the guest. Fixes: #6051 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com> Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-12 15:53:28 -08:00
Samuel Ortiz	ea06fe3afc	virtcontainers: Add a Network API skeleton for Darwin Empty for now. Fixes: #6051 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com> Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-12 15:53:28 -08:00
Eric Ernst	6ee550e9a5	runtime: vCPUs pinning is sandbox specific, not hypervisor While at it, make sure we persist this and fix a misc typo. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-12 15:44:25 -08:00
Zhongtao Hu	6199b69178	runtime-rs: change cache mode use never as the cache mode if none is configured Fixes:#6020 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-01-12 18:13:50 +08:00
Zhongtao Hu	a33a22ccd1	runtime-rs: add missing config section for share-fs add missing config sections for share-fs Fixes:#6020 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-01-12 18:12:37 +08:00
Peng Tao	2b4b825228	Merge pull request #6032 from liubin/fix/6031-add-test-file-to-gitignore runtime: add test generated file to .gitignore	2023-01-12 15:38:46 +08:00
Peng Tao	4a4232b851	Merge pull request #6037 from bergwolf/github/no-netns runtime: fix up disable_netns handling	2023-01-12 09:58:24 +08:00
Eric Ernst	e3d3b72fa2	virtcontainers: use resource control for setting CPU affinity Let's abstract the CPU affinity, instead of calling linux only code from sandbox. Fixes: #6044 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-11 17:55:53 -08:00
Eric Ernst	f137048be3	resource-control: add helper function for setting CPU affinity Let's abstract the CPU affinity Fixes: #6044 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-11 17:55:53 -08:00
Eric Ernst	73216a8104	vendor: revendor netlink to get latest This'll address issue where netlink couldn't build on Darwin hosts. Fixes: #6026 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-11 17:23:15 -08:00
Gabriela Cervantes	fc17d7cc41	virtcontainers: Fix misspelling in error message This PR fixes a misspelling in the error message when it tries to run a system without Confidential computing support. Fixes #6042 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-01-11 21:58:07 +00:00
GabyCT	c6b7f69040	Merge pull request #5837 from deagon/doc-fix docs: add hint of probing loop module	2023-01-11 12:20:47 -06:00
Tim Zhang	c91b142587	Merge pull request #6035 from liubin/fix/5376-set-a-fixed-cgroups-version tools: add --locked option for cargo install	2023-01-11 20:44:23 +08:00
Peng Tao	12fd6ffc1f	runtime: fix up disable_netns handling With `disable_netns=true`, we should never scan the sandbox netns which is the host netns in such case. Fixes: #6021 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-01-11 12:25:24 +00:00
Bin Liu	64c9114a39	tools: add --locked option for cargo install There is a broken release of cgroups-rs, but cargo install does not use the version in Cargo.lock by default, so add the `--locked` option to use the version pinned in Cargo.lock. Fixes: #5376 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-01-11 19:34:46 +08:00
Bin Liu	7eb43cec15	runtime: add test generated file to .gitignore Add test generated file to .gitignore to avoid making the working directory dirty. Fixes: #6031 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-01-11 17:16:06 +08:00
Bin Liu	8551853cfe	runtime: use system pagesize for hugepage test TestHandleHugepages performs mount operations with different pagesizes, but some systems only support a 2M pagesize, so the test for a 1G pagesize will fail. This commit fixes that by only mounting the pagesizes listed under `/sys/kernel/mm/hugepages`, which are the ones the OS supports. Fixes: #6029 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-01-11 17:02:58 +08:00
Bin Liu	0ec4aa1a86	Merge pull request #6007 from jongwu/single_container runtime-rs: add Single Container support	2023-01-11 10:55:50 +08:00
Eric Ernst	07e77f5be7	Merge pull request #5994 from dcantah/virtcontainers_tests_darwin virtcontainers: tests: Ensure Linux specific tests are just run on Linux	2023-01-10 17:13:28 -08:00
Fabiano Fidêncio	147c56bb8d	Merge pull request #6019 from liubin/fix/6018-virtiofsd-cache-mod Change cache mode from none to never	2023-01-10 23:12:13 +01:00
Bin Liu	709483425f	Merge pull request #6014 from GabyCT/topic/fixinidentationaks tools: Fix indentation for setup aks script	2023-01-10 17:49:27 +08:00
Bin Liu	8225d8044e	Merge pull request #6003 from dcantah/fs-skeleton virtcontainers: fs_share: Add Darwin skeleton	2023-01-10 17:48:45 +08:00
Bin Liu	86a82cace9	runtime: change cache mode from none to never New Rust virtiofsd's `cache` mode doesn't support `none` mode, we should use `never` to replace it. Fixes: #6018 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-01-10 17:29:48 +08:00
Bin Liu	82c59efd65	runtime-rs: change cache mode from none to never New Rust virtiofsd's `cache` mode doesn't support `none` mode, we should use `never` to replace it. Fixes: #6018 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-01-10 16:14:59 +08:00
Bin Liu	7b309b578d	kata-types: change cache mode from none to never New Rust virtiofsd's `cache` mode doesn't support `none` mode, we should use `never` to replace it. Fixes: #6018 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-01-10 14:21:30 +08:00
Bin Liu	fee4e7c7c4	docs: change cache mode from none to never New Rust virtiofsd's `cache` mode doesn't support `none` mode, we should use `never` to replace it. Fixes: #6018 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-01-10 14:19:25 +08:00
Eric Ernst	4d53303a7d	Merge pull request #6005 from dcantah/vfw-skeleton virtcontainers: Add a Virtualization.framework skeleton	2023-01-09 15:50:04 -08:00
Archana Shinde	594b57d082	utils: Add utility functions to get cpu and distro details. These functions are meant to be used for the kata-env command. Fixes: #5688 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-01-09 14:36:36 -08:00
Archana Shinde	d33e343613	check: Move PROC_CPUINFO from architecture specific files Move PROC_CPUINFO into check.rs. This file is used across architectures and does not need to be in arch-specific files. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-01-09 14:31:33 -08:00
Gabriela Cervantes	f8a93a1ded	tools: Fix indentation for setup aks script This PR fixes the indentation for setup aks script being used in tools. Fixes #6013 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-01-09 15:27:50 +00:00
Tim Zhang	6628891666	Merge pull request #5982 from liubin/fix/5981-remove-tests-func kata-ctl: remove get_kata_version_by_url function	2023-01-09 18:18:21 +08:00
Bin Liu	03de5f41b2	kata-ctl: remove get_kata_version_by_url function In `src/tools/kata-ctl/src/check.rs`, there is a function `get_kata_version_by_url` in the tests mod, indeed we can use the `get_kata_all_releases_by_url` in the main mod to replace it. Fixes: #5981 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-01-09 15:32:16 +08:00
Fupan Li	2b34f0a54f	Merge pull request #5992 from liubin/fix/5987-kata-ctl-s390x-build-error kata-ctl: fix build error on s390x	2023-01-09 15:28:37 +08:00
Bin Liu	1bae41a4d4	Merge pull request #5996 from dcantah/vfw-initial virtcontainers: Introduce hypervisor_darwin	2023-01-09 11:37:02 +08:00
Jianyong Wu	464d4c94de	runtime-rs: process single_container Process single_container like pod_sandbox when creating a container, but like pod_container when getting the memory/cpu size info from oci/spec. Fixes: #6006 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-01-09 10:29:01 +08:00
Jianyong Wu	5f9c892e48	kata-types: add single_container support For now, only pod_sandbox and pod_container are supported. This doesn't cover the case of a container started by ctr, which is a single_container as defined in kata 2.0. Port the single_container kata type from kata 2.0 to kata 3.0. Fixes: #6006 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-01-09 10:29:01 +08:00
Samuel Ortiz	fa9ae9362c	virtcontainers: Add a Virtualization.framework skeleton Fixes: #6004 A Virtualization.framework based Hypervisor implementation. This is just stubs for now to eventually get this building. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com> Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-08 07:40:21 -08:00
Eric Ernst	d48b22bb13	virtcontainers: fs_share: add Darwin skeleton Fixes: #6002 As a first pass for testing, let's add a skeleton for filesystem sharing support on Darwin. Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-07 19:56:47 -08:00
Bin Liu	2c10b37172	Merge pull request #5991 from dcantah/darwin-sigs runtime: Define Darwin handled signals list	2023-01-07 11:19:48 +08:00
Bin Liu	bc8a6423e0	Merge pull request #5986 from dcantah/nydus-nonetns nydus: net-ns handling needs to be only executed on Linux hosts	2023-01-07 11:19:07 +08:00
Bo Chen	8265aad380	Merge pull request #6001 from fidencio/topic/add-network-hotplug-support-for-clh clh: Ensure it works with Docker / Moby	2023-01-06 13:06:57 -08:00
Eric Ernst	fafc7a8b1a	virtcontainers: tests: Ensure Linux specific tests are just run on Linux Fixes: #5993 Several tests utilize linux'isms like Mounts, bindmounts, vsock etc. Let's ensure that these are still tested on Linux, but that we also skip these tests when on other operating systems (Darwin). This commit just moves tests; there shouldn't be any functional test changes. While the tests still won't be runnable on Darwin/other hosts yet, this is a necessary step forward. Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-06 11:09:11 -08:00
Fabiano Fidêncio	efa4fc0b25	clh: Add hotplug support for network devices This is needed in order to have Moby / Docker working properly with Cloud Hypervisor, as Moby / Docker relies on hotplugging a network device to the VM as a preStartHook. Fixes: #5997 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-06 18:59:47 +01:00
Fabiano Fidêncio	1074d2c1d3	clh: Make vmAddNetPutRequest capable of doing hotplugs The only bit needed for having the vmAddNetPutRequest() capable of dealing with hotplugs, instead of only coldplugs, is making sure it doesn't error out in case a `200` response is returned. The 200 response means: """ The new device was successfully added to the VM instance. """ Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-06 18:55:55 +01:00
Zhongtao Hu	ec18368aba	Merge pull request #5858 from openanolis/refactor-guest-hook agent: refactor guest hooks	2023-01-06 22:28:09 +08:00
Fabiano Fidêncio	175794458f	Merge pull request #5972 from bergwolf/github/hook fix moby prestart hook handling	2023-01-06 14:54:39 +01:00
Eric Ernst	9ec8a13985	virtcontainers: introduce hypervisor_darwin Fixes: #5995 Placeholder skeleton at this point - implementation will be added after basic build refactoring lands. Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-06 02:03:34 -08:00
Peng Tao	8bb68a9f28	vc/network: skip existing endpoints when scanning for new ones So that addAllEndpoints() becomes re-entrant and we can use it to scan netns changes. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-01-06 10:01:19 +00:00
Bin Liu	c21a8d5ff8	kata-ctl: fix build error on s390x Some type is not imported in s390x's mod file. Fixes: #5987 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-01-06 13:27:28 +08:00
Bin Liu	31abe170fc	Merge pull request #5984 from dcantah/schedcore-nonlinux schedcore: Make buildable on !linux	2023-01-06 10:38:39 +08:00
Samuel Ortiz	3b4420eb8e	runtime: Define Darwin handled signals list Fixes: #5990 Some signals may not be defined on non Linux host OSes, like SIGSTKFLT for example. It's also not defined on certain architectures, but irrelevant for this. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com> Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-05 17:50:47 -08:00
Danny Canter	24b05a99b6	schedcore: Make buildable on !linux Fixes: #5983 sched-core only makes sense on Linux hosts. Let's add stub/error for other platforms. Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-05 11:51:04 -08:00
Danny Canter	3886aad199	nydus: net-ns handling needs to be only executed on Linux hosts Fixes: #5985 With nydus not being its own pkg, it is challenging to implement cleanly in a virtcontainers package that isn't necessarily Linux-only. The existing code utilizes network namespace code in order to ensure nydus is launched in the host netns. This is very Linux specific - so let's make sure we only carry this out in a linux specific file. In the Darwin case, to allow for compilation at least, let's add a stub for doNetNS. Ideally the nydus and vc code can be refactored / decoupled. Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-05 11:48:43 -08:00
Bin Liu	1b46d4fb50	Merge pull request #5611 from wllenyj/dragonball-ut-4 Built-in Sandbox: add more unit tests for dragonball. Part 4	2023-01-05 15:21:36 +08:00
Bin Liu	a40fca1f57	Merge pull request #5976 from yaoyinnan/5825/fix/cleanup-hypervisor runtime-rs: cleanup the run dir of hypervisor when shut down	2023-01-05 15:14:21 +08:00
Zhongtao Hu	8c4c0d2715	Merge pull request #5467 from tzY15368/feat-katactl-direct-vol Feat: implementation of kata-ctl direct-volume operations	2023-01-05 14:06:18 +08:00
Bin Liu	4ab9364aa6	Merge pull request #5946 from dcantah/clarify-var Runtime: Clarify mutability of global var	2023-01-05 13:08:45 +08:00
Bin Liu	649d2d4b8d	Merge pull request #5964 from openanolis/kata-runtime kata-runtime: add rust runtime path for kata-runtime exec	2023-01-05 09:35:21 +08:00
Fabiano Fidêncio	db372d8897	Merge pull request #5974 from likebreath/0103/clh_v28.1 versions: Upgrade to Cloud Hypervisor v28.1	2023-01-04 19:02:35 +01:00
yaoyinnan	e256903af2	runtime-rs: cleanup the run dir of hypervisor when shut down Cleanup the run dir of hypervisor when shut down. Fixes: #5825 Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>	2023-01-04 22:36:39 +08:00
Bin Liu	e2c7e5f172	Merge pull request #5950 from openanolis/upcall_fea runtime-rs: add dbs-upcall feature	2023-01-04 16:20:40 +08:00
Tingzhou Yuan	937a41346e	kata-ctl: add unit tests for volume ops Added table driven unit tests and functionality test for functions in volume_ops. `join_path` relies on safe_path::scoped_join to validate the unsafe part of the input. Testcase also takes into account the possibility of specially constructed string that would get b64-encoded into path-like string. Fixes #5341 Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>	2023-01-04 01:34:40 -05:00
Tingzhou Yuan	8451db7c0c	kata-ctl: direct-volume: add Add and Remove handlers This commit adds direct-volume command handlers for kata-ctl, including add, remove, stats and resize. Stats and resize make HTTP over UDS calls to runtime-rs while add and remove run locally on the host. Fixes #5341 Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>	2023-01-04 01:34:38 -05:00
Tingzhou Yuan	2d4b2cf72c	runtime-rs: add POST method to shim-client partly refactored shim-client to reuse code, added POST method support, and made path string constants public for client imports. Fixes #5341 Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>	2023-01-04 01:33:53 -05:00
Tingzhou Yuan	cae78a6851	kata-ctl: add constants for direct-volume commands added direct-volume mountinfo struct and constant path strings to kata-types Fixes #5341 Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>	2023-01-04 01:33:51 -05:00
Bin Liu	38a6bc570d	Merge pull request #5947 from dcantah/yq-darwin runtime/Makefile: Get some bits happy on darwin	2023-01-04 14:24:43 +08:00
Bin Liu	3bda4a8194	Merge pull request #5943 from liubin/fix/5942-remove-old-description docs: remove old and misleading instructions for minikube	2023-01-04 12:02:53 +08:00
Bin Liu	5b11201848	Merge pull request #5945 from liubin/fix/5944-indents packaging: fix indents in build-kernel.sh	2023-01-04 11:00:49 +08:00
Bo Chen	652021ad95	versions: Upgrade to Cloud Hypervisor v28.1 This patch upgrades Cloud Hypervisor to its latest bug-fix release v28.1: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v28.1 Fixes: #5973 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-01-03 14:09:44 -08:00
Fabiano Fidêncio	156e4e673b	Merge pull request #5908 from Alex-Carter01/kmod_warning kernel: adding kmod to do docker env	2023-01-03 20:35:22 +01:00
Fabiano Fidêncio	67f0fd505d	Merge pull request #5967 from fidencio/topic/bump-rust-toolchain-to-1.66.0 versions: Update the rust toolchain to 1.66.0	2023-01-03 18:50:16 +01:00
Fabiano Fidêncio	5f5f6ce7a7	Merge pull request #5951 from liubin/fix/5948-check_latest_version kata-ctl: skip test if access GitHub.com fail	2023-01-03 18:49:57 +01:00
Peng Tao	d085389127	vc: fix up UT for CreateSandbox API change Need to adapt the UT as well. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-01-03 22:30:42 +08:00
Peng Tao	578a9c25f0	vc: rescan network endpoints after running prestart hooks Moby relies on the prestart hooks to configure network endpoints. We should rescan the netns after running them so that the newly added endpoints can be found and plugged to the guest. Fixes: #5941 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-01-03 22:30:41 +08:00
Fabiano Fidêncio	a3e1257708	Merge pull request #5891 from jtumber-ibm/foreign-cc agent: unset `CC` for cross-build	2023-01-03 14:38:24 +01:00
Peng Tao	cb84b0fb02	katautils: run prestart hooks after starting VM So that we can pass the hypervisor pid to the hook instead of the runtime process's. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-01-03 10:52:32 +00:00
Fabiano Fidêncio	079462d2eb	runk: Fix needless_borrow warning As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to needless_borrow. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 17:14:13 +01:00
Fabiano Fidêncio	2c24fcf34c	runtime-rs: Fix clippy::bool-to-int-with-if warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to boolean to int conversion using if. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#bool_to_int_with_if Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 17:14:13 +01:00
Fabiano Fidêncio	025e78341e	runtime-rs: Fix needless_borrow warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to needless_borrow. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 17:14:13 +01:00
Fabiano Fidêncio	4fb163d570	runtime-rs: Allow clippy::box_default warnings The rust toolchain version bump to its 1.66.0 release raised a warning about using Box::default() instead of specifying a type. For now that's something we don't need to change, so let's ignore such warning in this very specific case. See: https://rust-lang.github.io/rust-clippy/master/index.html#box_default Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 17:14:01 +01:00
Fabiano Fidêncio	20121fcda7	runtime-rs: Fix unnecessary_cast warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to unnecessary_cast. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 16:16:39 +01:00
Fabiano Fidêncio	b95364a140	dragonball: Allow question_mark warning in allocate_device_resources() The rust toolchain version bump to its 1.66.0 release raised a warning about the code being able to be refactored to use `?`. For now that's something we don't need to change, so let's ignore such warning in this very specific case. See: https://rust-lang.github.io/rust-clippy/master/index.html#question_mark Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 15:55:49 +01:00
Fabiano Fidêncio	0b2f060bf3	dragonball: Fix unnecessary_cast warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to unnecessary_cast. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 15:55:42 +01:00
Fabiano Fidêncio	a545a65934	agent: Allow clippy::question_mark warning in Namespace{} The rust toolchain version bump to its 1.66.0 release raised a warning about the code being able to be refactored to use `?`. For now that's something we don't need to change, so let's ignore such warning in this very specific case. See: https://rust-lang.github.io/rust-clippy/master/index.html#question_mark Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 15:22:20 +01:00
Fabiano Fidêncio	9ced34dd22	agent: Fix explicit_auto_deref warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to explicit_auto_deref. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#explicit_auto_deref Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 14:59:50 +01:00
Fabiano Fidêncio	f77220490e	agent: Fix needless_borrow warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to needless_borrow. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 14:58:13 +01:00
Fabiano Fidêncio	7bcdc9049a	rustjail: Fix unnecessary_cast warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to unnecessary_cast. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 14:42:58 +01:00
Fabiano Fidêncio	41d7dbaaea	rustjail: Fix needless_borrow warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to needless_borrow. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 14:42:25 +01:00
Fabiano Fidêncio	2a73e057db	kata-types: Fix unnecessary_cast warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to unnecessary_cast. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 14:28:07 +01:00
Fabiano Fidêncio	cf9ef1833c	kata-types: Fix needless_borrow warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to needless_borrow. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 14:28:07 +01:00
Fabiano Fidêncio	126187e814	safe-path: Fix needless_borrow warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to needless_borrow. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 14:28:07 +01:00
Fabiano Fidêncio	bb78d35db8	kata-sys-util: Fix "match-like-matches-macro" warning As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to "match-like-matches-macro". Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#match_like_matches_macro Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 14:28:07 +01:00
Fabiano Fidêncio	668e652401	kata-sys-util: Fix unnecessary_cast warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to unnecessary_cast. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 14:28:07 +01:00
Fabiano Fidêncio	c1a8d89a72	kata-sys-util: Fix needless_borrow warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to needless_borrow. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 14:28:07 +01:00
Fabiano Fidêncio	c9c38e6d01	logging: Allow clippy::type-complexity warning As the rust toolchain version bump to its 1.66.0 release raised a warning about the type complexity used for the closure, and that's something we don't want to change, let's ignore such warning in this very specific case. See: https://rust-lang.github.io/rust-clippy/master/index.html#type_complexity Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 14:28:07 +01:00
Fabiano Fidêncio	ffd6fbb6b6	logging: Fix needless_borrow warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to needless_borrow. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 14:18:14 +01:00
Fabiano Fidêncio	60df30015b	protocols: Fix unnecessary_cast warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to unnecessary_cast. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 14:18:14 +01:00
Danny Canter	56e7b5d0fd	runtime/Makefile: Get some bits happy on darwin Substitution in the yq install script doesn't like zsh, and additionally the version of yq we're using doesn't have a darwin/arm64 build so grab the amd64 version and let rosetta work its magic. Additionally swap to abspath from readlink -m for the printing of what binaries to install, as the -m flag doesn't exist on the BSD variant, and this should be the same behavior. Fixes: #5970 Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-02 04:19:58 -08:00
Fabiano Fidêncio	0bbeb34b4c	protocols: Fix needless_borrow warnings As we bumped the rust toolchain to 1.66.0, some new warnings have been raised due to needless_borrow. Let's fix them all here. For more info about the warnings, please, take a look at: https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 12:41:29 +01:00
Fabiano Fidêncio	dfea6c7d21	versions: Update the rust toolchain to 1.66.0 We're doing the bump on main, as we'll need this as part of the CCv0 branch due to the dependencies we have there. Link to the 1.66.0 release: https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-1660-2022-12-15 Fixes: #5966 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-02 11:34:00 +01:00
Danny Canter	86ee24b33c	Runtime: Clarify mutability of global var Was about to change `urandomdev` to a constant when I realized it's intentionally mutable so it can be mocked in tests. There's other comments to the same effect so clarify here as well. Fixes: #5965 Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-02 01:13:34 -08:00
Zhongtao Hu	dae6670628	kata-runtime: add rust runtime path for kata-runtime exec add rust runtime path for kata-runtime exec Fixes:#5963 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-12-30 13:34:34 +08:00
Chao Wu	a2e3715e01	upcall: remove upcall client when stopping vm In order to avoid resource leak, we need to remove upcall client in vm and vcpu manager when stopping vm. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-12-28 20:23:39 +08:00
wllenyj	31591d7915	dragonball: fix unit test failure case about Kvm. Due to the wrong use of as_raw_fd, Kvm was dropped twice. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-12-26 11:32:31 +08:00
wllenyj	2b02e0a9bf	dragonball: add more unit test for vcpu manager Added more unit tests for Vcpu Manager. Fixes: #4899 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-12-26 11:31:42 +08:00
Yushuo	85f9094f17	agent: refactor guest hooks We have to execute some hooks both in host and guest. And in /libs/kata-sys-util/src/hooks.rs, the common operations are implemented. In this commit, we are going to refactor the code of guest hooks using code in /libs/kata-sys-util/src/hooks.rs. At the same time, we move function valid_env to kata-sys-util to make it usable by both agent and runtime. Fixes: #5857 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2022-12-26 10:15:19 +08:00
Chao Wu	1511587a9a	Merge pull request #5601 from openanolis/hugepage runtime-rs: enable hugepage	2022-12-25 22:35:06 +08:00
Zhongtao Hu	3605062258	runtime-rs: add dbs-upcall feature add dbs-upcall feature to dragonball Fixes:#5949 Depends-on: github.com/kata-containers/tests#5355 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-12-25 19:02:42 +08:00
Bin Liu	03a0c9d78e	kata-ctl: skip test if access GitHub.com fail This commit will call `error_for_status` after `send`, this call will generate errors if status code between 400-499 and 500-599. And sometime access github.com will fail, in this case we can skip the test to prevent the CI failing. Fixes: #5948 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-12-23 15:12:12 +08:00
Bin Liu	1dcbda3f0f	kata-ctl: update Cargo.lock kata-ctl depends on runtime-rs, and this commit: `fbf294da3f` added a new dependency named shim-interface, this Cargo.lock should be updated too. Signed-off-by: Bin Liu <bin@hyper.sh>	2022-12-23 15:06:50 +08:00
Bin Liu	b4b5d8150e	docs: remove old and misleading instructions for minikube Some instructions are old, delete them to prevent misleading. Fixes: #5942 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-12-23 12:02:46 +08:00
Bin Liu	0fe24e08bb	packaging: fix indents in build-kernel.sh In the function get_kernel, the indents are two tabs, which should be 1 tab. Fixes: #5944 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-12-22 14:56:06 +08:00
Fupan Li	dc9c8d3357	Merge pull request #5901 from justxuewei/fix/mpleak runtime-rs: Clean up mount points shared to guest	2022-12-21 09:59:25 +08:00
Bin Liu	92b843ac5a	Merge pull request #5924 from jongwu/kata-ctl-checkcpu kata-ctl: fix checkcpu bug in non-x86 arches	2022-12-21 09:16:53 +08:00
Jianyong Wu	3480780bd8	kata-ctl: add check framework support for non-x86 x86 changes the check framework. Enable them for non-x86 accordingly. Fixes: #5923 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-12-20 11:41:00 +08:00
Jianyong Wu	1bd533f10b	kata-ctl: let check framework arch-agnostic The current check framework is specific for x86. Refactor the code to let it arch-agnostic. Fixes: #5923 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-12-20 11:41:00 +08:00
Fabiano Fidêncio	2e54c8e887	Merge pull request #5921 from fidencio/3.1.0-alpha1-branch-bump # Kata Containers 3.1.0-alpha1	2022-12-19 15:45:53 +01:00
Bin Liu	6039516802	Merge pull request #5925 from xinydev/fix-docs docs: Remove duplicate sentences	2022-12-19 17:12:15 +08:00
Peng Tao	473f5ff7da	Merge pull request #5861 from mflagey/Docs_Change_build_virtiofsd_in_developer_guide_#5860 docs: Update virtiofsd build script in the developer guide	2022-12-19 17:02:35 +08:00
Bin Liu	0cf443a612	Merge pull request #5915 from openanolis/legacy_device dragonball: refactor legacy device initialization	2022-12-19 13:31:45 +08:00
Xuewei Niu	fd77eebd4d	runtime-rs: fix the issues mentioned in the code review In order to avoid cloning, changed the signature of `ShareFsMount::share_rootfs`, `ShareFsMount::share_volume`, and `ShareFsMount::umount_rootfs` to receive a reference to a config. Fixes: #5898 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2022-12-19 11:46:50 +08:00
Xuewei Niu	0e69207909	runtime-rs: Clean up mount points shared to guest Fixed issues where shared volumes couldn't umount correctly. The rootfs of each container is cleaned up after the container is killed, except for `NydusRootfs`. `ShareFsRootfs::cleanup()` calls `VirtiofsShareMount::umount_rootfs()` to umount mount points shared to the guest, and umounts the bundle rootfs. Fixes: #5898 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2022-12-19 11:46:14 +08:00
Xin Yang	74fa10a235	docs: remove duplicate sentences remove duplicate sentences in spdk docs Fixes: #5926 Signed-off-by: Xin Yang <xinydev@gmail.com>	2022-12-17 11:26:36 +00:00
Bin Liu	e4645642d0	Merge pull request #5877 from openanolis/fix_start_bundle runtime-rs: enable start container from bundle	2022-12-17 08:10:08 +08:00
Wainer Moschetta	339ef99669	Merge pull request #5867 from Alex-Carter01/sev_module_unload kernel building: Add module unload to SEV kernel config	2022-12-16 17:17:53 -03:00
Alex Carter	ecb28e2b13	kernel: adding kmod to do docker env adding kmod to kernel building docker env to remove warning Fixes: #5866 Signed-off-by: Alex Carter <Alex.Carter@ibm.com>	2022-12-16 17:02:47 +00:00
Alex Carter	9f465a58af	kernel: Add "unload" module to SEV config Fixes: #5866 Signed-off-by: Alex Carter <Alex.Carter@ibm.com>	2022-12-16 16:56:56 +00:00
Fabiano Fidêncio	b0896126cf	release: Kata Containers 3.1.0-alpha1 - tools: Add some new gitignore items - shim: return hypervisor's pid not shim's pid - Dragonball: introduce upcall - refactor(shim-mgmt): move client side to libs - kata-ctl: Add --list option - kata-ctl: check: only-list-releases and include-all-releases options - basic framework for QEMU support in runtime-rs - tools: Fix indentation on build kernel script - runtime-rs: fix standalone share fs - runtime-rs: fix sandbox_pidns calculation and oci spec amending - runtime,agent: Add SELinux support for containers inside the guest - kata-sys-util: fix issues where umount2 couldn't get the correct path - agent: Drop the Option for LinuxContainer.cgroup_manager - dragonball: enable kata3.0/dragonball CI on Arm - fix kata deploy error after node reboot. - tools: Fix indentation for ovmf script - runtime: prevent waiting 50 ms minimum for a process exit - runtime-rs: fix high cpu - agent: remove `sysinfo` dependency - runtime-rs: bind mount volumes in sandbox level - docs: Update the rust version in the installation documentation - runtime-rs: fix some variable names and typos - kata-ctl: add host check for aarch64 - kata-ctl: fix dependency version conflict - workflow: fix cargo-deny-runner.yaml syntax error - runtime: Add identification in version for runtime-rs - workflow: call cargo in user's $PATH - runtime-rs: remove the version number from the commit display message - runk: Re-implement start operation using the agent codes - build: update golang version to 1.19.3 - snap: Fix snapcraft setup (unbreak snap releases) - fix(agent): fix iptables binary path in guest - runtime-rs: moving only vCPU threads into sandbox controller - tools: Remove extra tab spaces from kata deploy binaries script - ci: let static checks don't depend on build - actions: use matrix to refactor static checks - agent: support systemd cgroup for kata agent. 
- actions: skip some jobs using "paths-ignore" filter - runtime: go fix code for 1.19 - doc: update runtime-rs "Build and Install" - runtime: don't fail mkdir if the folder is already created by another process - kernel: add CONFIG_X86_SGX into whitelist - runtime-rs: block on the current thread when setup the network to avoid be take over by other task - Refactor(runtime-rs): add conditional compile for virt-sandbox persist - runtime: add log record to the qemu config method `appendDevices` for… - runtime: Use containerd v1.6.8 - tools: Fix indentation of build static firecracker script - package: add nydus to release artifacts - agent: check if command exist before do ip_tables test - runtime: Support virtiofs queue size for qemu and make it configurable - docs: change mount-info.json to mountInfo.json - docs: update doc "NVIDIA GPU passthrough" - runtime-rs: support vhost-vsock - utils: Add utility function to fetch the kernel version. - versions: update nydusd version - runtime-rs: support nydus v5 and v6 rootfs - Upgrade to Cloud Hypervisor v28.0 - docs: update doc "Setup swap device in guest kernel" - Rust fixes + Golang bump - clh: avoid race condition when stopping clh - tools: Fix indentation of build static virtiofsd script - docs: Fix configuration path - runtime-rs : fix the shim source in the documentation test is ambiguous - versions: update vmm-sys-util and related crates to v0.11.0 - runtime-rs: delete all cargo patches - feat(shim-mgmt): iptables handler - tools: Remove empty spaces from build kernel script - Built-in Sandbox: add more unit tests for dragonball. 
Part 3 - Dragonball: enable mem_file_path config into hugetlbfs process - runtime-rs:add hypervisor interface capabilities - cloud-hypervisor: Fix GetThreadIDs function - github: Parallelise static checks - runtime-rs: blanks filled & fixes made to virtiofsd launch - vCPUs pinning support for Kata Containers - runtime-rs: fix shared volume permission issue - runk: Ignore an error when calling kill cmd with --all option - runk: Upgrade libseccomp crate to v0.3.0 in Cargo.lock - snap: Unbreak docker install - add EnterNetNS in virtcontainers - tools: Fix indentation of build static clh script - virtiofsd: Not use "link-self-contained=yes" on s390x - Kata ctl drop privs - versions: bump golangci-lint version - runtime-rs: generate config files with the default target - docs: Fix volumeMounts in SGX usage example - versions: Update Cloud Hypervisor to b4e39427080 - docs: update rust runtime installation guide - rustjail: Upgrade libseccomp crate to v0.3.0 - makefile: remove sudo when create symbolic link - agent: remove redundant checks - shim: Ensure pagesize is set when reporting hugetlb stats - kata-ctl: Re-enable network tests on s390x (fixes 5438) - agent: use NLM_F_REPLACE replace NLM_F_EXCL in rtnetlink - fix readme content error at doc directory - agent: validate hugepage size is supported - Makefile: fix an typo in runtime-rs makefile - qemu: Re-work static-build Dockerfile - Modify agent-url return value in runtime-rs - runtime-rs: regulate the comment in runtime-rs makefile - doc: Update how-to-run-kata-containers-with-SNP-VMs.md - kata-ctl: Disable network check on s390x - virtiofsd: Build inside a container - Dragonball: remove redundant comments in event manager - versions: Update TDX QEMU - runtime-rs: fix typo get_contaier_type to get_container_type - kata-ctl: improve command descriptions for consistency - runtime-rs: force shutdown shim process in it can't exit - versions: Update TDX kernel - ci: skip s390x for dragonball. 
- Dragonball: delete redundant comments in blk_dev_mgr - kata-ctl: Move development to main branch - runtime-rs: support ephemeral storage for emptydir - docs: fix a typo in rust-runtime-installation-guide - Built-in Sandbox: add more unit tests for dragonball - readme: remove libraries mentioning `b5cfd0958` kata-ctl: Fixed format for check release options `fbf294da3` refactor(shim-mgmt): move client side to libs `ae0dcacd4` tools: Add some new gitignore items `99485d871` shim: return hypervisor's pid not shim's pid `1f28ff683` runtime-rs: add binary to exercise shim proper w/o containerd dependencies `eb8c9d38f` runtime-rs: add launch of a simple qemu process to start_vm() `2f6d0d408` runtime-rs: support qemu in VirtContainer `1413dfe91` runtime-rs: add basic empty boilerplate for qemu driver `a81ced0e3` upcall: add upcall into kernel build script `f5c34ed08` Dragonball: introduce upcall `8dbfc3dc8` kata-ctl: Fixed format for check release options `f3091a9da` kata-ctl: Add kata-ctl check release options `a577df8b7` tools: Fix indentation on build kernel script `b087667ac` kata-deploy: Fix the pod of kata deploy starts to occur an error `79cf38e6e` runtime-rs: clear OCI spec namespace path `62f4603e8` runtime-rs: reset rdma cgroup `5b6596f54` runtime-rs: CreateContainerRequest has Default `e9e82ce28` runtime-rs: fix is_pid_namespace_enabled check `8079a9732` kata-sys-util: fix issues where umount2 couldn't get the correct path `4661ea8d3` runtime-rs: fix standalone share fs `c5abc5ed4` config: speed up rng init when kernel boot for arm64 `3e6114b2e` tools: Fix indentation for ovmf script `7fdbbcda8` agent: Drop the Option for LinuxContainer.cgroup_manager `d04d45ea0` runtime: use pidfd to wait for processes on Linux `e9ba0c11d` runtime: use exponential backoff for process wait `748f22e7d` agent: remove sysinfo dependency `0019d653d` runtime-rs: fix high cpu `46b38458a` docs: Update the rust version in the installation documentation `71491a69c` runtime: move 
process wait logic to another function `92ebe61fe` runtime: reap force killed processes `fdf0a7bb1` runtime-rs: fix the issues mentioned in the code review `1d823c4f6` runtime-rs: umount and permission controls in sandbox level `527b87141` runtime-rs: bind mount volumes in sandbox level `9ccf2ebe8` agent: add signal value to log `fb2c142f1` runtime-rs: fix some variable names and typos `737420469` kata-ctl: fix dependency version conflict `89574f03f` workflow: call cargo in user's $PATH `d4321ab48` runtime: Add identification in version for runtime-rs `f7fc436be` workflow: fix cargo-deny-runner.yaml syntax error `78532154d` docs: Add description for guest SELinux support `c617bbe70` runtime: Pass SELinux policy for containers to the agent `935476928` agent: Add SELinux support for containers `a75f99d20` osbuilder: Create guest image for SELinux `a9c746f28` kernel: Add kernel configs for SELinux `86cb05883` snap: Fix snapcraft setup (unbreak snap releases) `f443b7853` build: update golang version to 1.19.3 `e12db92e4` runk: Re-implement start operation using the agent codes `e723bad0a` ci: let static checks don't depend on build `69aae0227` actions: use matrix to refactor static checks `a5e4cad4b` kata-ctl: add host check for aarch64 `2edbe389d` runtime-rs: moving only vCPU threads into sandbox controller `340e24f17` actions: skip some job using "paths-ignore" filter `2426ea9bd` doc: update runtime-rs "Build and Install" `67fe703ff` runtime-rs: remove the version number from the commit display message `1d93a9346` fix(agent): fix iptables binary path in guest `1dfd845f5` runtime: go fix code for 1.19 `cd85a44a0` tools: Remove extra tab spaces from kata deploy binaries script `cb199e0ec` kernel: add CONFIG_X86_SGX into whitelist `4b45e1386` runtime: don't fail mkdir if the folder is already created `b987bbc57` runtime-rs: block on the current thread when setup the network `abb9ebeec` package: add nydus to release artifacts `30a7ebf43` runtime: Log invalid devices in 
QEMU config `2539f3186` runtime: Use containerd v1.6.8 `993d05a42` docs: change mount-info.json to mountInfo.json `d808adef9` runtime-rs: support vhost-vsock `6b2ef66f0` runtime-rs: add conditional compile for virt-sandbox persist `6c1e153a6` docs: update doc "NVIDIA GPU passthrough" `b53171b60` agent: check command before do test_ip_tables `a636d426d` versions: update nydusd version `3bb145c63` runtime: Support virtiofs queue size for qemu and make it configurable `e80a9f09f` utils: Add utility function to fetch the kernel version. `36545aa81` runtime: clh: Re-generate the client code `f4b02c224` versions: Upgrade to Cloud Hypervisor v28.0 `e4a6fbadf` docs: update doc "Setup swap device in guest kernel" `2f5f575a4` log-parser: Simplify check `d94718fb3` runtime: Fix gofmt issues `16b837509` golang: Stop using io/ioutils `66aa330d0` versions: Update golangci-lint `b3a4a1629` versions: bump containerd version `eab8d6be1` build: update golang version to 1.19.2 `e80dbc15d` runtime-rs: workaround Dragonball compilation problem `c3f1922df` fix(fmt): fix cargo fmt to pass static check `a4099dab8` tools: Fix indentation of build static firecracker script `c46814b26` runtime-rs:support nydus v5 and v6 `a04afab74` qemu: early exit from Check if the process was stopped `7e481f217` qemu: set stopped only if StopVM is successful `0e3ac66e7` clh: return faster with dead clh process from isClhRunning `9ef68e0c7` clh: fast exit from isClhRunning if the process was stopped `2631b08ff` clh: don't try to stop clh multiple times `f45fe4f90` versions: update vmm-sys-util and related crates to v0.11.0 `8be081730` tools: Fix indentation of build static virtiofsd script `f8f97c1e2` feat(shim-mgmt): iptables handler `29c75cf12` runtime-rs: delete all cargo patches `9f70a6949` tools: Remove empty spaces from build kernel script `57336835d` dragonball: add more unit test for device manager `233370023` dragonball: add test utils. 
`3e9c3f12c` docs: Fix configuration path `2adb1c182` Dragonball: enable mem_file_path config into hugetlbfs process `daeee26a1` cloud-hypervisor: Fix GetThreadIDs function `40d514aa2` github: Parallelise static checks `2508d39b7` runtime: added vcpus pinning logics Core VCPU threads pinning logics for issue 4476. Also provided docs. `fef8e92af` runtime-rs:add hypervisor interface capabilities `27b191358` runtime-rs: blanks filled & fixes made to virtiofsd launch `990e6359b` snap: Unbreak docker install `ca69a9ad6` snap: Use metadata for dependencies `df092185e` runk: Upgrade libseccomp crate to v0.3.0 in Cargo.lock `16dca4ecd` runk: Ignore an error when calling kill cmd with --all option `b74c18024` runtime-rs: fix shared volume permission issue `936fe35ac` runtime-rs : fix shim source is ambiguous `0ed7da30d` tools: Fix indentation of build static clh script `43fcb8fd0` virtiofsd: Not use "link-self-contained=yes" on s390x The compile option link-self-contained=yes asks rustc to use C library startup object files that come with the compiler, which are not available on the target s390x-unknown-linux-gnu. A build does not contain any startup files leading to a broken executable entry point (causing segmentation fault). 
`219919e9f` docs: Fix volumeMounts in SGX usage example `c0f5bc81b` cargo: Add Cargo.lock to version control `474927ec9` gitignore: Add gitignore file `699f821e1` utils: Add function to drop priveleges `a6fb4e2a6` versions: bump golangci-lint version `b015f34af` runtime-rs: generate config files with the default target `d7bb4b551` agent: support systemd cgroup for kata agent `144efd1a7` docs: update rust runtime installation guide `abf4f9b29` docs: kata 3.0 Architecture fix readme content error `44d8de892` agent: remove redundant checks `9d286af7b` versions: Update Cloud Hypervisor to b4e39427080 `081ee4871` agent: use NLM_F_REPLACE replace NLM_F_EXCL in rtnetlink `e95089b71` kata-ctl: add basic cpu check for s390x `871d2cf2c` kata-ctl: Limit running tests to x86 and use native-tls on s390x `cbd84c3f5` rustjail: Upgrade libseccomp crate to v0.3.0 `748be0fe3` makefile: remove sudo when create symbolic link `227e717d2` qemu: Re-work static-build Dockerfile `72738dc11` agent: validate hugepage size is supported `f74e328ff` Makefile: fix an typo in runtime-rs makefile `f205472b0` Makefile: regulate the comment style for the runtime-rs comments `9f2c7e47c` Revert "kata-ctl: Disable network check on s390x" `ac403cfa5` doc: Update how-to-run-kata-containers-with-SNP-VMs.md `00981b3c0` kata-ctl: Disable network check on s390x `39363ffbf` runtime: remove same function `c322d1d12` kata-ctl: arch: Improve check call `0bc5baafb` snap: Build virtiofsd using the kata-deploy scripts `cb4ef4734` snap: Create a task for installing docker `7e5941c57` virtiofsd: Build inside a container `35d52d30f` versions: Update TDX QEMU `4d9dd8790` runtime-rs: fix typo get_contaier_type to get_container_type `70676d4a9` kata-ctl: improve command descriptions for consistency `9eb73d543` versions: Update TDX kernel `00a42f69c` kata-ctl: cargo: 2021 -> 2018 `fb6327474` kata-ctl: rustfmt + clippy fixes `1f1901e05` dragonball: fix clippy warning for aarch64 `a343c570e` dragonball: enhance dragonball 
ci `6a64fb0eb` ci: skip s390x for dragonball. `a743e37da` Dragonball: delete redundant comments in blk_dev_mgr `2b345ba29` build: Add kata-ctl to tools list `f7010b806` kata-ctl: docs: Write basic documentation `862eaef86` docs: fix a typo in rust-runtime-installation-guide `26c043dee` ci: Add dragonball test `781e604c3` docs: Reference kata-ctl README `15c343cbf` kata-ctl: Don't rely on system ssl libs `c23584994` kata-ctl: clippy: Resolve warnings and reformat `133690434` kata-ctl: implement CLI argument --check-version-only `eb5423cb7` kata-ctl: switch to use clap derive for CLI handling `018aa899c` kata-ctl: Add cpu check `7c9f9a5a1` kata-ctl: Make arch test run at compile time `b63ba66dc` kata-ctl: Formatting tweaks `cca7e32b5` kata-ctl: Lint fixes to allow the branch to be built `8e7bb8521` kata-ctl: add code for framework for arch `303fc8b11` kata-ctl: Add unit tests cases `d0b33e9a3` versions: Add kata-ctl version entry `002b18054` kata-ctl: Add initial rust code for kata-ctl `b62b18bf1` dragonball: fix clippy warning `2ddc948d3` Makefile: add dragonball components. `3fe81fe4a` dragonball-ut: use skip_if_not_root to skip root case `72259f101` dragonball: add more unit test for vmm actions `9717dc3f7` Dragonball: remove redundant comments in event manager `9c1ac3d45` runtime-rs: return port on agent-url req `89e62d4ed` shim: Ensure pagesize is set when reporting hugetbl stats `8d4ced3c8` runtime-rs: support ephemeral storage for emptydir `046ddc646` readme: remove libraries mentioning `86ad832e3` runtime-rs: force shutdown shim process in it can't exit Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-12-16 09:12:07 +01:00
Zhongtao Hu	21ec766d29	docs: add documents for using bundle to start container add document for using bundle to start container Fixes:#5872 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-12-16 11:13:25 +08:00
Yushuo	d14c3af35c	dragonball: refactor legacy device initialization If the serial path is given, legacy_manager should create socket console based on that path. Or the console should be created based on stdio. Fixes: #5914 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2022-12-15 20:55:01 +08:00
Fabiano Fidêncio	1d266352ea	Merge pull request #5902 from Bevisy/fix-too-many-git-file tools: Add some new gitignore items	2022-12-15 11:29:32 +01:00
Zhongtao Hu	ca39a07a14	runtime-rs: enable start container from bundle enable start container from bundle in this way $ ls ./bundle config.json rootfs $ sudo ctr run -d --runtime io.containerd.kata.v2 --config bundle/config.json test_kata Fixes:#5872 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-12-15 17:28:13 +08:00
Peng Tao	ebb73df6bc	Merge pull request #5899 from Bevisy/fix-outdated-comments shim: return hypervisor's pid not shim's pid	2022-12-15 14:55:54 +08:00
Peng Tao	7210905deb	Merge pull request #5712 from openanolis/chao/upcall Dragonball: introduce upcall	2022-12-15 14:44:56 +08:00
Chao Wu	fad229b853	Merge pull request #5875 from Ji-Xinyou/xyji/refactor-shim-mgmt refactor(shim-mgmt): move client side to libs	2022-12-15 10:59:45 +08:00
David Esparza	1dbd6c8057	Merge pull request #5735 from dborquez/kata-ctl-cli-list kata-ctl: Add --list option	2022-12-14 15:03:21 -06:00
Alex	b5cfd09583	kata-ctl: Fixed format for check release options Fixed formatting for check release options Fixes: #5345 Signed-off-by: Alex <alee23@bu.edu> Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2022-12-14 09:42:57 -06:00
James O. D. Hunt	2e15af777c	Merge pull request #5786 from alexlee-23/main kata-ctl: check: only-list-releases and include-all-releases options	2022-12-14 11:25:36 +00:00
Ji-Xinyou	fbf294da3f	refactor(shim-mgmt): move client side to libs The client side is moved to libs. This is to solve the problem that including clients will bring about messy dependencies. Fixes: #5874 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-12-14 17:42:25 +08:00
Peng Tao	856d4b7361	Merge pull request #5798 from pmores/qemu-support basic framework for QEMU support in runtime-rs	2022-12-14 15:05:33 +08:00
Binbin Zhang	ae0dcacd4a	tools: Add some new gitignore items Add some new ignore items to avoid local builds that cause git to track a lot of files Fixes: #5900 Signed-off-by: Binbin Zhang <binbin36520@gmail.com>	2022-12-14 11:38:23 +08:00
Binbin Zhang	99485d871c	shim: return hypervisor's pid not shim's pid update outdated code comments Fixes: #3234 Signed-off-by: Binbin Zhang <binbin36520@gmail.com>	2022-12-14 11:16:11 +08:00
GabyCT	b637d12d19	Merge pull request #5884 from GabyCT/topic/fixbuildscript tools: Fix indentation on build kernel script	2022-12-13 15:28:24 -06:00
Chao Wu	bb4be2a666	Merge pull request #5690 from yipengyin/fix-virtiofsd runtime-rs: fix standalone share fs	2022-12-14 00:16:10 +08:00
James Tumber	087515a46e	agent: unset `CC` for cross-build When `HOST_ARCH` != `ARCH` unset `CC` Specifying a foreign CC is incompatible with building libgit2. Thus after the RUSTFLAGS linker has been set we can safely unset CC to avoid passing this value through the build. Fixes: #5890 Signed-off-by: James Tumber <james.tumber@ibm.com>	2022-12-13 15:30:06 +00:00
Pavel Mores	1f28ff6838	runtime-rs: add binary to exercise shim proper w/o containerd dependencies After building the binary as usual with `cargo build` run it as follows. It needs a configuration.toml in which only qemu keys `path`, `kernel` and `initrd` will initially need to be set. Point them to respective files e.g. from a kata distribution tarball. It also needs to be launched from an exported container bundle directory. One can be created by running mkdir rootfs podman export $(podman create busybox) | tar -C ./rootfs -xvf - runc spec -b . in a suitable directory. Then launch the program like this: KATA_CONF_FILE=/path/to/configuration-qemu.toml /path/to/shim-ctl Fixes: #5817 Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-12-13 14:55:21 +01:00
Pavel Mores	eb8c9d38ff	runtime-rs: add launch of a simple qemu process to start_vm() The point here is just to get a simplest Kata VM running. Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-12-13 14:54:26 +01:00
Pavel Mores	2f6d0d408b	runtime-rs: support qemu in VirtContainer Added registration of qemu config plugin and support for creating Qemu Hypervisor instance. Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-12-13 14:54:26 +01:00
Pavel Mores	1413dfe91c	runtime-rs: add basic empty boilerplate for qemu driver This does almost literally nothing so far apart from getting and setting HypervisorConfig. It's mostly copied from/inspired by dragonball. Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-12-13 14:53:45 +01:00
Bin Liu	3952fedcd0	Merge pull request #5882 from bergwolf/github/oci-namespaces runtime-rs: fix sandbox_pidns calculation and oci spec amending	2022-12-13 18:32:02 +08:00
Fabiano Fidêncio	f1381eb361	Merge pull request #4813 from ManaSugi/fix/add-selinux-agent runtime,agent: Add SELinux support for containers inside the guest	2022-12-13 11:24:53 +01:00
Yuan-Zhuo	bf8848f926	agent: Eliminate unnecessary metrics DEFAULT_REGISTRY pre-registers many metrics that we don't need or have duplicated. This PR uses a custom register for metrics without interference and ensures that the registration process is executed only once when the program is running. Fixes: #5255 Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>	2022-12-13 16:18:33 +08:00
Fupan Li	015674df16	Merge pull request #5873 from justxuewei/fix/umount2 kata-sys-util: fix issues where umount2 couldn't get the correct path	2022-12-13 15:52:32 +08:00
Chao Wu	a81ced0e3f	upcall: add upcall into kernel build script In order to let upcall being used by Kata Container, we need to add those patches into kernel build script. Currently, only when experimental (-e) and hypervisor type dragonball (-t dragonball) are both enabled, that the upcall patches will be applied to build a 5.10 guest kernel. example commands: sh ./build-kernel.sh -e -t dragonball -d setup fixes: #5642 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-12-13 15:44:55 +08:00
Chao Wu	f5c34ed088	Dragonball: introduce upcall Upcall is a direct communication tool between VMM and guest developed upon vsock. The server side of the upcall is a driver in guest kernel (kernel patches are needed for this feature) and it'll start to serve the requests after the kernel starts. And the client side is in Dragonball VMM , it'll be a thread that communicates with vsock through uds. We want to keep the lightweight of the VM through the implementation of the upcall, through which we could achieve vCPU hotplug, virtio-mmio hotplug without implementing complex and heavy virtualization features such as ACPI virtualization. fixes: #5642 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-12-13 15:44:47 +08:00
Bin Liu	03b6124fc6	Merge pull request #5848 from Yuan-Zhuo/drop-cgmr-option agent: Drop the Option for LinuxContainer.cgroup_manager	2022-12-13 12:09:39 +08:00
Guoqiang Ding	f8a48ab41d	docs: add hint of probing loop module If `loop` module is not probed, it causes error like "losetup: cannot find an unused loop device". Fixes: #5887 Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>	2022-12-13 11:33:42 +08:00
Alex	8dbfc3dc82	kata-ctl: Fixed format for check release options Fixed formatting for check release options Fixes: #5345 Signed-off-by: Alex <alee23@bu.edu>	2022-12-13 03:10:19 +00:00
Bin Liu	add2486259	Merge pull request #5853 from jongwu/test_kata3.0_arm dragonball: enable kata3.0/dragonball CI on Arm	2022-12-13 11:05:17 +08:00
Alex	f3091a9da4	kata-ctl: Add kata-ctl check release options This pull request adds kata-ctl check only-list-releases and include-all-releases Fixes: #5345 Signed-off-by: Alex <alee23@bu.edu>	2022-12-13 03:04:30 +00:00
Gabriela Cervantes	a577df8b71	tools: Fix indentation on build kernel script This PR fixes the indentation on the build kernel script. Fixes #5883 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-12-12 16:37:47 +00:00
Fabiano Fidêncio	740387b569	Merge pull request #5829 from singhwang/main fix kata deploy error after node reboot.	2022-12-12 14:20:14 +01:00
singhwang	b087667ac5	kata-deploy: Fix the pod of kata deploy starts to occur an error If a pod of kata is deployed on a machine, after the machine restarts, the pod status of kata-deploy will be CrashLoopBackOff. Fixes: #5868 Signed-off-by: SinghWang <wangxin_0611@126.com>	2022-12-12 19:11:38 +08:00
Peng Tao	79cf38e6ea	runtime-rs: clear OCI spec namespace path None of the host namespace paths make sense in the guest. Let's clear them all before sending the spec to the agent. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-12-12 11:07:14 +00:00
Peng Tao	62f4603e81	runtime-rs: reset rdma cgroup We don't support rdma cgroups yet. Let's make sure it is reset to empty. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-12-12 09:57:24 +00:00
Peng Tao	5b6596f54e	runtime-rs: CreateContainerRequest has Default We can just use it to initialize the default fields. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-12-12 09:57:24 +00:00
Peng Tao	e9e82ce28b	runtime-rs: fix is_pid_namespace_enabled check We should test is_pid_namespace_enabled before amending the container spec, since amending clears the pid namespace path, which results in sandbox_pidns always being false. Fixes: #5881 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-12-12 09:54:48 +00:00
Zhongtao Hu	afaf17f423	runtime-rs: enable container hugepage enable the functionality of using hugepages in container Fixes: #5560 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-12-12 17:49:31 +08:00
Xuewei Niu	8079a9732d	kata-sys-util: fix issues where umount2 couldn't get the correct path Strings in Rust don't have \0 at the end, but C strings do, so `umount2` in libc can't get the correct path. Besides, calling `nix::mount::umount2` avoids an unsafe block and is a more robust solution. Fixes: #5871 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2022-12-12 11:50:32 +08:00
Yipeng Yin	4661ea8d3b	runtime-rs: fix standalone share fs Standalone share fs should add virtiofs device in setup_device_before_start_vm and return the storages to mount the directory in guest. And it uses hypervisor's jailer root directly instead of jail config. Besides, we tweaked the parameter, so it adapts to rust version virtiofsd now. And its cache policy which forbids caching is "never" now, instead of "none". Hence, we change the default cache mode. Fixes: #5655 Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>	2022-12-12 10:58:09 +08:00
GabyCT	67e82804c5	Merge pull request #5865 from GabyCT/topic/fixspacesovmfscript tools: Fix indentation for ovmf script	2022-12-09 15:33:49 -06:00
Jianyong Wu	c5abc5ed4d	config: speed up rng init when kernel boot for arm64 For now, rng init is too slow for kata3.0/dragonball. Enabling random_trust_cpu can speed up rng init at kernel boot. Fixes: #5870 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-12-09 14:20:18 +08:00
Gabriela Cervantes	3e6114b2ef	tools: Fix indentation for ovmf script This PR fixes the indentation for the ovmf script for packaging. Fixes #5864 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-12-08 16:12:20 +00:00
Zhongtao Hu	fc4a67eec3	runtime-rs: enable vm hugepage support vm hugepage, set the hugetlbfs mount point as vm memory path Fixes: #5560 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-12-09 00:01:16 +08:00
Greg Kurz	5ef7ed72ae	Merge pull request #5610 from UiPath/fix-process-wait runtime: prevent waiting 50 ms minimum for a process exit	2022-12-08 11:02:39 +01:00
Mathias Flagey	ebe5c5adf9	docs: Update virtiofsd build script in the developer guide The script used to build virtiofsd was changed in #5426 but not in the doc. This commit updates the developer guide. Fixes: #5860 Signed-off-by: Mathias Flagey <mathiasflagey1201@gmail.com>	2022-12-08 09:29:10 +01:00
Peng Tao	0a1d1ec2fa	Merge pull request #5830 from openanolis/fix-high-cpu runtime-rs: fix high cpu	2022-12-08 12:16:06 +08:00
Steve Horsman	39394fa2a8	Merge pull request #5844 from jtumber-ibm/patch-1 agent: remove `sysinfo` dependency	2022-12-07 16:35:05 +00:00
Fupan Li	cce316b5e9	Merge pull request #5607 from justxuewei/feat/sandbox-level-volume runtime-rs: bind mount volumes in sandbox level	2022-12-07 19:23:38 +08:00
Chelsea Mafrica	1ff4185111	Merge pull request #5842 from cyyzero/update_install_guide docs: Update the rust version in the installation documentation	2022-12-06 23:40:35 -08:00
Yuan-Zhuo	7fdbbcda82	agent: Drop the Option for LinuxContainer.cgroup_manager Cgroup manager for a container will always be created. Thus, dropping the option for LinuxContainer.cgroup_manager is feasible and could simplify the code. Fixes: #5778 Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>	2022-12-07 13:40:38 +08:00
Alexandru Matei	d04d45ea05	runtime: use pidfd to wait for processes on Linux Use pidfd_open and poll on newer versions of Linux to wait for the process to exit. For older versions use existing wait logic Fixes: #5617 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-12-06 16:31:05 +02:00
Alexandru Matei	e9ba0c11d0	runtime: use exponential backoff for process wait Initial wait period between checks is 1ms, and the next ones are min(wait_period*5, 50ms) Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-12-06 16:30:58 +02:00
James Tumber	748f22e7d0	agent: remove sysinfo dependency Removes the redundant dependency `sysinfo`. Fixes: #5843 Signed-off-by: James Tumber <james.tumber@ibm.com>	2022-12-06 10:18:53 +00:00
Quanwei Zhou	0019d653d6	runtime-rs: fix high cpu Fixed the issue where, when using nonblocking mode, `tokio::io::copy()` needs to handle EAGAIN, resulting in high CPU usage. Fixes: #5740 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-12-06 14:25:33 +08:00
Chao Wu	326d589ff5	Merge pull request #5822 from liubin/fix/5820-var-name-and-typo runtime-rs: fix some variable names and typos	2022-12-06 14:24:11 +08:00
Zhongtao Hu	c12bb5008d	Merge pull request #5769 from jongwu/check_host_arm kata-ctl: add host check for aarch64	2022-12-06 14:05:52 +08:00
Chen Yiyang	46b38458af	docs: Update the rust version in the installation documentation Rust version in the installation documentation does not match the requirements. Just fix it. Fixes: #5841 Signed-off-by: Chen Yiyang <cyyzero@qq.com>	2022-12-06 12:50:32 +08:00
Chao Wu	538bddf4ee	Merge pull request #5811 from tzY15368/fix-katactl-conflict-dependency kata-ctl: fix dependency version conflict	2022-12-06 10:44:48 +08:00
Alexandru Matei	71491a69c3	runtime: move process wait logic to another function extract process wait logic to another function Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-12-05 13:32:04 +02:00
Alexandru Matei	92ebe61fea	runtime: reap force killed processes reap child processes after sending SIGKILL Fixes #5739 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-12-05 13:31:58 +02:00
Xuewei Niu	fdf0a7bb14	runtime-rs: fix the issues mentioned in the code review Removed the `Debug` trait for the `ShareFs` and etc. Renamed `ShareFsMount::upgrade()` and `ShareFsMount::downgrade()` to `upgrade_to_rw()` and `downgrade_to_ro()`. Protected `mounted_info_set` with a mutex to avoid race conditions. Fixes: #5588 Signed-off-by: Xuewei Niu <justxuewei@apache.org>	2022-12-05 11:18:26 +08:00
Xuewei Niu	1d823c4f65	runtime-rs: umount and permission controls in sandbox level This commit implemented umount controls and permission controls. When a volume is no longer referenced, it will be umounted immediately. When a volume is mounted with readonly permission and a newly arriving container needs readwrite permission, the volume should be upgraded to readwrite permission. Conversely, if a volume has readwrite permission and no container needs readwrite, then the volume should be downgraded. Fixes: #5588 Signed-off-by: Xuewei Niu <justxuewei@apache.org>	2022-12-05 10:58:13 +08:00
Xuewei Niu	527b871414	runtime-rs: bind mount volumes in sandbox level Implemented bind mount related management on the sandbox side, involving bind mounting a volume if it's not mounted before, and upgrading its permission to readwrite if a new container needs it. Fixes: #5588 Signed-off-by: Xuewei Niu <justxuewei@apache.org>	2022-12-05 10:58:13 +08:00
Bin Liu	9ccf2ebe8a	agent: add signal value to log For signal_process call, log the signal value in logs. Signed-off-by: Bin Liu <bin@hyper.sh>	2022-12-02 14:53:58 +08:00
Bin Liu	fb2c142f18	runtime-rs: fix some variable names and typos Fix some imperfect variable names, and some typos in logs. Fixes: #5820 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-12-02 14:52:34 +08:00
Bin Liu	8246de821f	Merge pull request #5809 from liubin/fix/cargo-deny-workflow-error workflow: fix cargo-deny-runner.yaml syntax error	2022-12-02 12:19:44 +08:00
Bin Liu	514b7778a2	Merge pull request #5807 from liubin/fix/5806-add-shim-lanuage runtime: Add identification in version for runtime-rs	2022-12-02 11:36:55 +08:00
Bin Liu	c1f5a93b66	Merge pull request #5814 from liubin/fix/5813-test-dragonball-error workflow: call cargo in user's $PATH	2022-12-02 11:36:19 +08:00
Tingzhou Yuan	737420469a	kata-ctl: fix dependency version conflict Also added crate `runtime-rs/crates/runtimes` as dependency as it's immediately depended upon by the `direct-volume` feature, see issue 5341 and PR 5467. Fixes #5810 Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>	2022-12-01 17:53:21 +00:00
Bin Liu	89574f03f8	workflow: call cargo in user's $PATH Calling cargo in root's HOME may lead to a permission error; we should call the cargo installed in the user's HOME/PATH. Fixes: #5813 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-12-01 15:37:16 +08:00
Bin Liu	d4321ab489	runtime: Add identification in version for runtime-rs We now support two runtimes/shims, the Go version and the Rust version. For debugging purposes, we can add an identification in the version info to tell us which runtime/shim is used. Fixes: #5806 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-12-01 15:14:08 +08:00
Bin Liu	7fabfb2cf0	Merge pull request #5756 from chentt10/remove-version-number-from-commit-message runtime-rs: remove the version number from the commit display message	2022-12-01 13:11:47 +08:00
Bin Liu	f7fc436bed	workflow: fix cargo-deny-runner.yaml syntax error There is a syntax error in .github/workflows/cargo-deny-runner.yaml Fixes: #5808 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-12-01 12:32:00 +08:00
Fabiano Fidêncio	212325a9db	Merge pull request #5649 from ManaSugi/runk/refactor-start-using-agent-code runk: Re-implement start operation using the agent codes	2022-11-29 20:45:16 +01:00
Fabiano Fidêncio	ac1b2d2a18	Merge pull request #5774 from UiPath/fix-go-panic build: update golang version to 1.19.3	2022-11-29 13:17:53 +01:00
Fabiano Fidêncio	d8d9aae123	Merge pull request #5781 from jodh-intel/snap-fix-release snap: Fix snapcraft setup (unbreak snap releases)	2022-11-29 13:11:34 +01:00
Manabu Sugimoto	78532154d9	docs: Add description for guest SELinux support Add the description about how to enable SELinux for containers running inside the guest. Fixes: #4812 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-11-29 19:07:56 +09:00
Manabu Sugimoto	c617bbe70d	runtime: Pass SELinux policy for containers to the agent Pass SELinux policy for containers to the agent if `disable_guest_selinux` is set to `false` in the runtime configuration. The `container_t` type is applied to the container process inside the guest by default. Users can also set a custom SELinux policy to the container process using `guest_selinux_label` in the runtime configuration. This will be an alternative configuration of Kubernetes' security context for SELinux because users cannot specify the policy in Kata through Kubernetes's security context. To apply SELinux policy to the container, the guest rootfs must be CentOS that is created and built with `SELINUX=yes`. Fixes: #4812 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-11-29 19:07:56 +09:00
Manabu Sugimoto	9354769286	agent: Add SELinux support for containers The kata-agent supports SELinux for containers inside the guest to comply with the OCI runtime specification. Fixes: #4812 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-11-29 19:07:56 +09:00
Bin Liu	588f81a23c	Merge pull request #5612 from openanolis/fix-iptables fix(agent): fix iptables binary path in guest	2022-11-29 16:57:06 +08:00
Bin Liu	1da2d0603c	Merge pull request #5761 from gaohuatao-1/ght_overhead runtime-rs: moving only vCPU threads into sandbox controller	2022-11-29 13:53:01 +08:00
Manabu Sugimoto	a75f99d20d	osbuilder: Create guest image for SELinux Create a guest image to support SELinux for containers inside the guest if `SELINUX=yes` is specified. This works only if the guest rootfs is CentOS and the init service is systemd, not the agent init. To enable labeling the guest image on the host, selinuxfs must be mounted on the host. The kata-agent will be labeled as `container_runtime_exec_t` type. Fixes: #4812 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-11-29 13:32:26 +09:00
Manabu Sugimoto	a9c746f284	kernel: Add kernel configs for SELinux Add kernel configs related to SELinux in order to add the support for containers running inside the guest. Fixes: #4812 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-11-29 13:32:26 +09:00
GabyCT	681d946644	Merge pull request #5748 from GabyCT/topic/removeextratabspacesdocker tools: Remove extra tab spaces from kata deploy binaries script	2022-11-28 15:34:12 -06:00
James O. D. Hunt	86cb058833	snap: Fix snapcraft setup (unbreak snap releases) Setup the snapcraft environment manually as the action we had been using for this does not appear to be actively maintained currently. Related to this, switch to specifying the snapcraft store credentials using the `SNAPCRAFT_STORE_CREDENTIALS` secret. This unbreaks `snapcraft upload`, which Canonical appear to have broken by removing the previous facility. Fixes: #5772. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-11-28 15:51:47 +00:00
Alexandru Matei	f443b78537	build: update golang version to 1.19.3 This Go release fixes golang/go#56309 Fixes #5773 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-11-28 17:03:29 +02:00
GabyCT	013752667b	Merge pull request #5776 from liubin/tmp/debug-static-check ci: let static checks don't depend on build	2022-11-28 07:51:42 -06:00
Fabiano Fidêncio	527e6c99e9	Merge pull request #5766 from liubin/fix/5763-use-composite-action-refactor-static-checks actions: use matrix to refactor static checks	2022-11-28 14:12:27 +01:00
Bin Liu	6af037d379	Merge pull request #5154 from Yuan-Zhuo/main agent: support systemd cgroup for kata agent.	2022-11-28 18:40:10 +08:00
Manabu Sugimoto	e12db92e4d	runk: Re-implement start operation using the agent codes This commit re-implements the `start` operation by leveraging the agent code. Currently, `runk` has its own `start` mechanism even though the agent already has the feature to handle starting a container. This worsens maintainability, and `runk` cannot easily keep up with changes on the agent side. Hence, `runk` replaces its own implementation with the agent's. Fixes: #5648 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-11-28 19:11:21 +09:00
Fabiano Fidêncio	74531114c3	Merge pull request #5762 from liubin/fix/5759-skip-action-by-path actions: skip some jobs using "paths-ignore" filter	2022-11-28 11:04:34 +01:00
Bin Liu	e723bad0af	ci: let static checks don't depend on build Build is a time-consuming operation; skip the build to let CI run faster. Fixes: #5777 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-11-28 15:26:04 +08:00
Bin Liu	a55eb78c32	Merge pull request #5752 from liubin/fix/5750-go-fix-1.19 runtime: go fix code for 1.19	2022-11-26 02:09:02 +08:00
Bin Liu	57c80ad65c	Merge pull request #5758 from chentt10/update-runtime-rs-build-and-install doc: update runtime-rs "Build and Install"	2022-11-26 02:08:48 +08:00
Bin Liu	69aae02276	actions: use matrix to refactor static checks Use a matrix to reduce the duplication of similar code. Fixes: #5763 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-11-26 00:32:15 +08:00
Jianyong Wu	a5e4cad4b6	kata-ctl: add host check for aarch64 For now, we can check whether the host supports running kata by checking whether "/dev/kvm" exists on aarch64. Fixes: #5768 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-11-25 18:55:32 +08:00
gaohuatao	2edbe389d8	runtime-rs: moving only vCPU threads into sandbox controller When the overhead controller exists, just constrain vCPU threads in the sandbox controller Fixes: #5760 Signed-off-by: gaohuatao <gaohuatao@bytedance.com>	2022-11-25 17:53:21 +08:00
Peng Tao	e32c023d96	Merge pull request #5714 from UiPath/fix-mkdir runtime: don't fail mkdir if the folder is already created by another process	2022-11-25 17:52:56 +08:00
Bin Liu	ae1001a9d1	Merge pull request #5742 from openanolis/chao/SGX_whitelist kernel: add CONFIG_X86_SGX into whitelist	2022-11-25 17:36:26 +08:00
Bin Liu	340e24f175	actions: skip some job using "paths-ignore" filter If only docs/images are changed, some jobs should not run. Fixes: #5759 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-11-25 15:33:32 +08:00
Chen Taotao	2426ea9bdc	doc: update runtime-rs "Build and Install" When compiling runtime-rs from source, make the documentation describe the detailed environment setup and compilation methods to avoid errors caused by related dependency packages. Fixes: #5757 Signed-off-by: Chen Taotao <chentt10@chinatelecom.cn>	2022-11-25 13:13:00 +08:00
Chen Taotao	67fe703ff5	runtime-rs: remove the version number from the commit display message The displayed commit message and version message are partially duplicated. Remove the version number from the commit display message. Fixes:#5735 Signed-off-by: Chen Taotao <chentt10@chinatelecom.cn>	2022-11-25 13:00:01 +08:00
Ji-Xinyou	1d93a93468	fix(agent): fix iptables binary path in guest Some rootfs put iptables-save and iptables-restore under /usr/sbin instead of /sbin. This PR checks both and returns the one that exists. Fixes: #5608 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-11-25 11:57:34 +08:00
Bin Liu	1dfd845f51	runtime: go fix code for 1.19 We have started to use golang 1.19, and some features will not be supported later, so run `go fix` to fix them. Fixes: #5750 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-11-25 11:29:18 +08:00
Zhongtao Hu	f02bb1a9cb	Merge pull request #5729 from openanolis/netnsref runtime-rs: block on the current thread when setup the network to avoid be take over by other task	2022-11-25 08:09:10 +08:00
Gabriela Cervantes	cd85a44a04	tools: Remove extra tab spaces from kata deploy binaries script This PR removes extra tab spaces from the kata deploy binaries script. Fixes #5747 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-11-24 17:57:36 +00:00
Chao Wu	cb199e0ecf	kernel: add CONFIG_X86_SGX into whitelist CONFIG_X86_SGX is introduced after kernel 5.11, and that config is a default x86_64 config for the Kata build-kernel.sh script. But using -v to specify any kernel version below 5.11 causes an inevitable error, because CONFIG_X86_SGX is not supported in older kernels, which is a problem when a kernel version below 5.11 is needed. So I propose to put CONFIG_X86_SGX into whitelist.conf to avoid breaking guest kernel builds below 5.11. fixes: #5741 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-11-24 20:43:58 +08:00
Alexandru Matei	4b45e13869	runtime: don't fail mkdir if the folder is already created Use MkdirAll instead of Mkdir so it doesn't generate an error when the folder is created by another process Fixes #5713 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-11-24 11:20:56 +02:00
Chao Wu	9bde32daa1	Merge pull request #5707 from openanolis/ref Refactor(runtime-rs): add conditional compile for virt-sandbox persist	2022-11-24 15:24:06 +08:00
Zhongtao Hu	b987bbc576	runtime-rs: block on the current thread when setup the network As the number of I/O intensive tasks increases, two issues can arise: 1. When the future is blocked, the current thread (which is in the network namespace) might be taken over by other tasks. After the future is finished, the thread that takes over the current task might not be in the pod network namespace. 2. When network setup finishes, the current thread is set back to the host namespace. But the task that was taken over would still stay in the pod network namespace. To avoid that, we need to block the future on the current thread. Fixes: #5728 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-11-24 13:48:05 +08:00
Bin Liu	06a604b753	Merge pull request #5720 from YchauWang/wyc-docs-test-22 runtime: add log record to the qemu config method `appendDevices` for…	2022-11-24 13:15:06 +08:00
Peng Tao	b4d0a39f6d	Merge pull request #5723 from fidencio/topic/runtime-bump-containerd-to-v1.6.8 runtime: Use containerd v1.6.8	2022-11-24 11:28:58 +08:00
GabyCT	6d1b5d47fb	Merge pull request #5664 from GabyCT/topic/fixfirecrackerscript tools: Fix indentation of build static firecracker script	2022-11-23 15:00:07 -06:00
Fabiano Fidêncio	82aa876903	Merge pull request #5727 from liubin/feat/add-nydus-to-release package: add nydus to release artifacts	2022-11-23 14:39:26 +01:00
Bin Liu	abb9ebeece	package: add nydus to release artifacts Install nydus related binaries under /opt/kata/libexec/ Fixes: #5726 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-11-23 15:17:58 +08:00
Fabiano Fidêncio	5cbf879659	Merge pull request #5693 from jongwu/test_ip_table agent: check if command exist before do ip_tables test	2022-11-23 08:15:08 +01:00
wangyongchao.bj	30a7ebf430	runtime: Log invalid devices in QEMU config When the user tried to add new devices to the VM, there is no error info for the invalid device. This PR adds a log record to the `appendDevices` for the invalid device of the qemu config. Fixes: #5719 Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>	2022-11-23 09:09:45 +08:00
Fabiano Fidêncio	df3d9878d5	Merge pull request #5695 from darfux/virtiofs-queue-size runtime: Support virtiofs queue size for qemu and make it configurable	2022-11-22 20:04:30 +01:00
Archana Shinde	e7f8d21bb7	Merge pull request #5717 from Kvasscn/fix_direct_blk_mount_info docs: change mount-info.json to mountInfo.json	2022-11-22 10:19:02 -08:00
Fabiano Fidêncio	2539f31862	runtime: Use containerd v1.6.8 Let's follow the binary bump used in the CI and also bump the vendored version of containerd to v1.6.8. Fixes: #5722 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-11-22 18:28:30 +01:00
Fabiano Fidêncio	732123b9ab	Merge pull request #5709 from kinderyj/main docs: update doc "NVIDIA GPU passthrough"	2022-11-22 16:53:51 +01:00
Chao Wu	8b04ba95cb	Merge pull request #5691 from yipengyin/support-vhost-vsock runtime-rs: support vhost-vsock	2022-11-22 14:59:55 +08:00
Jason Zhang	993d05a42e	docs: change mount-info.json to mountInfo.json mount-info.json should be mountInfo.json according to the description in the doc. Fixes: #5716 Signed-off-by: Jason Zhang <zhanghj.lc@inspur.com>	2022-11-22 14:25:57 +08:00
Yipeng Yin	d808adef95	runtime-rs: support vhost-vsock Rename old VsockConfig to HybridVsockConfig. And add VsockConfig to support vhost-vsock. We follow kata's old way of trying a random vhost fd for 50 times to generate a unique fd. Fixes: #5654 Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>	2022-11-22 10:03:52 +08:00
Zhongtao Hu	6b2ef66f0f	runtime-rs: add conditional compile for virt-sandbox persist code refactoring, add conditional compile for virt-sandbox persist Fixes: #5706 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-11-21 19:51:43 +08:00
Matt Wang	6c1e153a6f	docs: update doc "NVIDIA GPU passthrough" We should make sure the hook shell `nvidia-container-toolkit.sh` is executable. Fixes: #5594 Signed-off-by: Matt Wang <kinder_yj@hotmail.com>	2022-11-21 17:31:20 +08:00
Jianyong Wu	b53171b605	agent: check command before do test_ip_tables The test_ip_tables test depends on iptables tools. But we can't ensure these tools exist. It's better to skip the test if there are no such tools. Fixes: #5697 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-11-21 14:56:51 +08:00
Bin Liu	7c8d474959	Merge pull request #5689 from kata-containers/kata-ctl-util utils: Add utility function to fetch the kernel version.	2022-11-21 14:44:05 +08:00
Peng Tao	be31a0fb41	Merge pull request #5638 from bergwolf/github/nydusd versions: update nydusd version	2022-11-21 09:53:11 +08:00
Peng Tao	a636d426d9	versions: update nydusd version To the latest stable v2.1.1. Depends-on: github.com/kata-containers/tests#5246 Fixes: #5635 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-11-19 16:33:29 +00:00
liyuxuan.darfux	3bb145c63a	runtime: Support virtiofs queue size for qemu and make it configurable The default vhost-user-fs queue-size of qemu is 128 now. Set it to 1024 by default which is same as clh. Also make this value configurable. Fixes: #5694 Signed-off-by: liyuxuan.darfux <liyuxuan.darfux@bytedance.com>	2022-11-19 15:38:11 +08:00
Archana Shinde	e80a9f09fa	utils: Add utility function to fetch the kernel version. Add functionality to get kernel version and related unit tests. This is intended to be used in the kata-env command going forward. Fixes: #5688 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-11-18 15:39:57 -08:00
Bin Liu	7506237420	Merge pull request #5144 from openanolis/nydus-dev runtime-rs: support nydus v5 and v6 rootfs	2022-11-18 14:05:04 +08:00
Bo Chen	65686dbbdc	Merge pull request #5684 from likebreath/1117/clh_v28.0 Upgrade to Cloud Hypervisor v28.0	2022-11-17 15:18:51 -08:00
Chelsea Mafrica	85f818743b	Merge pull request #5679 from liubin/fix/5678-update-swap-doc docs: update doc "Setup swap device in guest kernel"	2022-11-17 13:23:57 -08:00
Bo Chen	36545aa81a	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v28.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Fixes: #5683 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-11-17 09:45:27 -08:00
Bo Chen	f4b02c2244	versions: Upgrade to Cloud Hypervisor v28.0 Details of this release can be found in our new roadmap project as iteration v28.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #5683 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-11-17 09:44:49 -08:00
Fabiano Fidêncio	81c0945afa	Merge pull request #5669 from fidencio/topic/rust-fixes-plus-golang-bump Rust fixes + Golang bump	2022-11-17 16:02:17 +01:00
Bin Liu	e4a6fbadf8	docs: update doc "Setup swap device in guest kernel" `crictl runp` command needs `--runtime kata` option to start a Kata Containers pod. Fixes: #5678 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-11-17 22:57:22 +08:00
Fabiano Fidêncio	2f5f575a43	log-parser: Simplify check ``` 14:13:15 parse.go:306:5: S1009: should omit nil check; len() for github.com/kata-containers/kata-containers/src/tools/log-parser.kvPairs is defined as zero (gosimple) 14:13:15 if pairs == nil || len(pairs) == 0 { 14:13:15 ^ ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-11-17 14:17:29 +01:00
Fabiano Fidêncio	d94718fb30	runtime: Fix gofmt issues It seems that bumping the version of golang and golangci-lint new format changes are required. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-11-17 14:16:12 +01:00
Fabiano Fidêncio	16b8375095	golang: Stop using io/ioutils The package has been deprecated as part of 1.16 and the same functionality is now provided by either the io or the os package. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-11-17 13:43:25 +01:00
Fabiano Fidêncio	66aa330d0d	versions: Update golangci-lint Let's bump the golangci-lint in order to fix issues that popped up after updating Golang to its 1.19.2 version. Depends-on: github.com/kata-containers/tests#5257 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-11-16 19:03:02 +01:00
Peng Tao	b3a4a16294	versions: bump containerd version v1.5.2 cannot be built from source by newer golang. Let's bump containerd version to 1.6.8. The GO runtime dependency has been moved to v1.6.6 for some time already. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-11-16 19:02:41 +01:00
Peng Tao	eab8d6be13	build: update golang version to 1.19.2 So that we get the latest language fixes. There is little use in maintaining compiler backward compatibility. Let's just set the default golang version to the latest 1.19.2. Fixes: #5494 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-11-16 19:02:39 +01:00
Chao Wu	e80dbc15d8	runtime-rs: workaround Dragonball compilation problem The upstream rust-vmm is currently changing its dependency style towards caret requirements (more information: rust-vmm/vm-memory#199), which breaks Dragonball compilation frequently. rust-vmm is expected to finish the changes this week; in order not to break Kata CI due to Dragonball's compilation errors, we will add a Cargo.lock file into /src/dragonball first and remove it later when rust-vmm is stable. fixes: #5657 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-11-16 12:44:41 +01:00
Ji-Xinyou	c3f1922df6	fix(fmt): fix cargo fmt to pass static check Fix cargo fmt Fixes: #5639 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-11-16 12:44:38 +01:00
Greg Kurz	1bbcb413c9	Merge pull request #5597 from UiPath/fix-clh-wait clh: avoid race condition when stopping clh	2022-11-16 07:39:27 +01:00
Gabriela Cervantes	a4099dab8f	tools: Fix indentation of build static firecracker script This PR fixes the indentation of the build static firecracker script. Fixes #5663 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-11-15 16:01:36 +00:00
Bin Liu	b8dbb35bb7	Merge pull request #5631 from GabyCT/topic/fixvirtiofsdscript tools: Fix indentation of build static virtiofsd script	2022-11-11 14:31:26 +08:00
Bin Liu	dff78593c0	Merge pull request #5505 from Joffref/patch-1 docs: Fix configuration path	2022-11-11 14:26:40 +08:00
Zhongtao Hu	7d91150185	Merge pull request #5536 from chentt10/fix-name-shim-source-ambiguous runtime-rs : fix the shim source in the documentation test is ambiguous	2022-11-11 14:07:05 +08:00
Zhongtao Hu	c46814b26a	runtime-rs:support nydus v5 and v6 add nydus v5 and v6 support for container rootfs Fixes: #5142 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-11-11 10:15:35 +08:00
Alexandru Matei	a04afab74d	qemu: early exit from Check if the process was stopped Fixes: #5625 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-11-10 22:43:32 +02:00
Alexandru Matei	7e481f2179	qemu: set stopped only if StopVM is successful Fixes: #5624 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-11-10 22:43:32 +02:00
Alexandru Matei	0e3ac66e76	clh: return faster with dead clh process from isClhRunning Through proactively checking if Cloud Hypervisor process is dead, this patch provides a faster path for isClhRunning Fixes: #5623 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-11-10 22:43:32 +02:00
Alexandru Matei	9ef68e0c7a	clh: fast exit from isClhRunning if the process was stopped Use atomic operations instead of acquiring a mutex in isClhRunning. This stops isClhRunning from generating a deadlock by trying to reacquire an already-acquired lock when called via StopVM->terminate. Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-11-10 22:43:32 +02:00
Alexandru Matei	2631b08ff1	clh: don't try to stop clh multiple times Avoid executing StopVM concurrently when virtiofs dies as a result of clh being stopped in StopVM. Fixes: #5622 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-11-10 22:43:32 +02:00
James O. D. Hunt	56641bc230	Merge pull request #5637 from openanolis/chao/update_cargo_lock versions: update vmm-sys-util and related crates to v0.11.0	2022-11-10 13:49:24 +00:00
Chao Wu	f45fe4f90d	versions: update vmm-sys-util and related crates to v0.11.0 Since the upstream vmm-sys-util upgraded to 0.11.0, some crates automatically upgraded to v0.11.0 and some stayed at v0.10.0 (depending on how they write the version dependency in `Cargo.toml`), which causes a compile error in runtime-rs. In order to fix this problem, we need to upgrade all vmm-sys-util dependencies in runtime-rs to v0.11.0. fixes: #5636 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-11-10 19:13:23 +08:00
quanweiZhou	bbc93260c9	Merge pull request #5615 from openanolis/chao/delete_cargo_patch runtime-rs: delete all cargo patches	2022-11-10 10:18:19 +08:00
Gabriela Cervantes	8be0817305	tools: Fix indentation of build static virtiofsd script This PR removes single spaces and fixes the indentation of the script. Fixes #5630 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-11-09 17:09:13 +00:00
Zhongtao Hu	071ac4693a	Merge pull request #5613 from openanolis/iptables feat(shim-mgmt): iptables handler	2022-11-09 17:21:45 +08:00
Bin Liu	1d59137c6f	Merge pull request #5620 from GabyCT/topic/removeemptysspaces tools: Remove empty spaces from build kernel script	2022-11-09 17:02:29 +08:00
Ji-Xinyou	f8f97c1e22	feat(shim-mgmt): iptables handler Support the handlers in runtime, which are used by kata-ctl iptables series of commands in runtime. Fixes: #5370 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-11-09 10:39:50 +08:00
Chao Wu	29c75cf12b	runtime-rs: delete all cargo patches The cargo patch in the cargo.toml seems to make the whole runtime-rs build take longer and also makes it harder to build runtime-rs in an environment without network access. We should delete all patches from the cargo.toml file and publish all the crates that were once patched. fixes: #5614 #5527 #5526 #5449 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-11-09 10:02:58 +08:00
Gabriela Cervantes	9f70a6949b	tools: Remove empty spaces from build kernel script This PR removes some extra empty spaces at the build kernel script. Fixes #5619 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-11-08 17:49:57 +00:00
Chao Wu	f5f25d9379	Merge pull request #5431 from wllenyj/dragonball-ut-3 Built-in Sandbox: add more unit tests for dragonball. Part 3	2022-11-08 15:48:16 +08:00
Zhongtao Hu	351bdbfacd	Merge pull request #5567 from openanolis/chao/fix_mem_file_path_error Dragonball: enable mem_file_path config into hugetlbfs process	2022-11-08 09:00:13 +08:00
wllenyj	57336835da	dragonball: add more unit test for device manager Added more unit tests for device manager. Fixes: #4899 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-11-08 00:45:17 +08:00
wllenyj	2333700237	dragonball: add test utils. Added some tools for dragonball unit testing. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-11-08 00:45:17 +08:00
Bin Liu	bfe9157abc	Merge pull request #5570 from openanolis/capability runtime-rs:add hypervisor interface capabilities	2022-11-07 23:04:55 +08:00
Mathis Joffre	3e9c3f12ce	docs: Fix configuration path On install you generate a configuration-fc.toml file when building the kata-runtime and copy it to either /etc/kata-containers/configuration-fc.toml or /usr/share/defaults/kata-containers/configuration-fc.toml. To reflect that the path must be one of the above, we can fix the path in doc. Fixes: #5589 Signed-off-by: Mathis Joffre <mariusjoffre@gmail.com>	2022-11-07 10:19:47 +01:00
Chao Wu	2adb1c1823	Dragonball: enable mem_file_path config into hugetlbfs process In the current Dragonball code, mem_file_path config is not used when hugetlbfs is enabled. In this commit we add mem_file_path into hugetlbfs enable process. fixes: #5566 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-11-07 16:07:57 +08:00
Fabiano Fidêncio	7250be3601	Merge pull request #5584 from fengyehong/clh-thread cloud-hypervisor: Fix GetThreadIDs function	2022-11-07 08:22:40 +01:00
Fabiano Fidêncio	3b1750e8e8	Merge pull request #5586 from fidencio/topic/paralelise-static-checks github: Parallelise static checks	2022-11-07 07:54:48 +01:00
Bin Liu	824ea83c3c	Merge pull request #5573 from pmores/fill-in-virtiofsd-standalone-impl runtime-rs: blanks filled & fixes made to virtiofsd launch	2022-11-07 14:19:45 +08:00
Bin Liu	83d052f82b	Merge pull request #4476 from LitFlwr0/vcpu-pinning-frq vCPUs pinning support for Kata Containers	2022-11-07 10:37:22 +08:00
Guanglu Guo	daeee26a1e	cloud-hypervisor: Fix GetThreadIDs function Get vcpu thread-ids by reading cloud-hypervisor process tasks information. Fixes: #5568 Signed-off-by: Guanglu Guo <guoguanglu@qiyi.com>	2022-11-05 17:23:19 +08:00
Bin Liu	427b01e298	Merge pull request #5548 from justxuewei/fix/share-fs-permission runtime-rs: fix shared volume permission issue	2022-11-04 21:21:50 +08:00
Fabiano Fidêncio	40d514aa2c	github: Parallelise static checks Although introducing an awful amount of code duplication, let's parallelise the static checks in order to reduce their run time and the space used in the VMs running those. While I understand there may be ways to make the whole setup less repetitive and error prone, I'm taking the approach of: * Make it work * Make it right * Make it fast So, it's clear that I'm only attempting to make it work, and I'd appreciate community help in order to improve the situation here. But, for now, this is a stopgap solution. JFYI, the time needed to run the tests on the `main` branch went down from ~110 minutes to ~60 minutes. Plus, we're not running those on a single VM anymore, which decreases the chance to hit the space limit. Reference: https://github.com/kata-containers/kata-containers/actions/runs/3393468605/jobs/5640842041 Ideally, each one of the following tests should be also split into smaller tests, each test for one component, for instance. * static-checks * compiler-checks * unit-tests * unit-tests-as-root Fixes: #5585 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-11-04 13:41:16 +01:00
LitFlwr0	2508d39b7c	runtime: added vcpus pinning logics Core VCPU threads pinning logics for issue 4476. Also provided docs. Fixes:#4476 Signed-off-by: LitFlwr0 <861690705@qq.com>	2022-11-04 17:52:42 +08:00
Zhongtao Hu	fef8e92af1	runtime-rs:add hypervisor interface capabilities 1. be able to check whether the hypervisor supports block devices, block device hotplug, multi-queue, and file sharing 2. be able to set the hypervisor capabilities for block devices, block device hotplug, multi-queue, and file sharing Fixes: #5569 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-11-04 09:24:36 +08:00
Bin Liu	b0c7bcce7c	Merge pull request #5556 from ManaSugi/runk/fix-kill-behavior runk: Ignore an error when calling kill cmd with --all option	2022-11-04 08:42:27 +08:00
Bin Liu	02fa6b8dad	Merge pull request #5557 from ManaSugi/runk/update-cargolock-libseccomp runk: Upgrade libseccomp crate to v0.3.0 in Cargo.lock	2022-11-04 08:41:45 +08:00
Fabiano Fidêncio	bb38901550	Merge pull request #5571 from jodh-intel/snap-unbreak-docker snap: Unbreak docker install	2022-11-03 23:47:07 +01:00
Pavel Mores	27b1913584	runtime-rs: blanks filled & fixes made to virtiofsd launch The 'config' argument to ShareVirtioFsStandalone::new() is now actually used, taking care of an explicit TODO. If a shared path doesn't exist in ShareVirtioFsStandalone::virtiofsd_args() it is now created instead of returning an error, thus following ShareVirtioFsInline's suit. The '-o vhost_user_socket=...' command line argument doesn't seem to be supported by newer versions of virtiofsd so we replace it with '--socket-path' which should be functionally equivalent according to docs. Fixes #5572 Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-11-03 08:38:59 +01:00
James O. D. Hunt	990e6359b7	snap: Unbreak docker install It appears that _either_ the GitHub workflow runners have changed their environment, or the Ubuntu archive has changed package dependencies, resulting in the following error when building the snap: ``` Installing build dependencies: bc bison build-essential cpio curl docker.io ... : The following packages have unmet dependencies: docker.io : Depends: containerd (>= 1.2.6-0ubuntu1~) E: Unable to correct problems, you have held broken packages. ``` This PR uses the simplest solution: install the `containerd` and `runc` packages. However, we might want to investigate alternative solutions in the future given that the docker and containerd packages seem to have gone wild in the Ubuntu GitHub workflow runner environment. If you include the official docker repo (which the snap uses), a _subset_ of the related packages is now: - `containerd` - `containerd.io` - `docker-ce` - `docker.io` - `moby-containerd` - `moby-engine` - `moby-runc` - `runc` Fixes: #5545. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-11-02 10:09:03 +00:00
James O. D. Hunt	ca69a9ad6d	snap: Use metadata for dependencies Rather than hard-coding the package manager into the docker part, use the `build-packages` section to specify the part's package dependencies in a distro agnostic manner. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-11-02 09:50:29 +00:00
Manabu Sugimoto	df092185ee	runk: Upgrade libseccomp crate to v0.3.0 in Cargo.lock The libseccomp crate was upgraded to v0.3.0 by `4696ead`, but `Cargo.lock` of runk wasn't updated by mistake. So, this commit updates `Cargo.lock` of runk to the latest dependencies. Fixes: #5487 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-11-01 20:26:33 +09:00
Manabu Sugimoto	16dca4ecd4	runk: Ignore an error when calling kill cmd with --all option Ignore the error that is triggered when the kill command is called with the `--all` option on a stopped container. High-level container runtimes such as containerd call the kill command with the `--all` option in order to terminate all processes inside the container even if the container is already stopped. Hence, a low-level runtime should allow `kill --all` regardless of the container state, like runc. This commit reverts to the previous behavior. Fixes: #5555 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-11-01 20:24:29 +09:00
Xuewei Niu	b74c18024a	runtime-rs: fix shared volume permission issue Fix the issue where shared volumes always have read-write permission even if read-only permission is enough. Fixes: #5549 Signed-off-by: Xuewei Niu <justxuewei@apache.org>	2022-11-01 18:42:19 +08:00
Chen TaoTao	936fe35acb	runtime-rs : fix shim source is ambiguous In the documentation test, the name shim has multiple potential sources of import, now give it a clear source. Fixes: #5535 Signed-off-by: Chen TaoTao <chentt10@chinatelecom.cn>	2022-10-31 19:54:22 -07:00
snir911	288e337a6f	Merge pull request #5434 from Rouzip/remove-doNetNS add EnterNetNS in virtcontainers	2022-10-30 11:19:07 +02:00
GabyCT	e04ad49c1b	Merge pull request #5530 from GabyCT/topic/fixclhscript tools: Fix indentation of build static clh script	2022-10-28 11:52:56 -05:00
Gabriela Cervantes	0ed7da30d7	tools: Fix indentation of build static clh script This PR removes single spaces and fixes the indentation of the script. Fixes #5528 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-10-27 21:09:34 +00:00
Bin Liu	0bb005093e	Merge pull request #5523 from BbolroC/s390x-virtiofsd virtiofsd: Not use "link-self-contained=yes" on s390x	2022-10-27 20:42:57 +08:00
Hyounggyu Choi	43fcb8fd09	virtiofsd: Not use "link-self-contained=yes" on s390x The compile option link-self-contained=yes asks rustc to use C library startup object files that come with the compiler, which are not available on the target s390x-unknown-linux-gnu. A build does not contain any startup files leading to a broken executable entry point (causing segmentation fault). Fixes: #5522 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2022-10-26 23:43:22 +02:00
David Esparza	37f0cd1c8f	Merge pull request #5436 from amshinde/kata-ctl-drop-privs Kata ctl drop privs	2022-10-26 11:37:27 -05:00
David Esparza	8b0c830a23	Merge pull request #5513 from bergwolf/github/golang-ci-lint versions: bump golangci-lint version	2022-10-26 07:36:45 -05:00
Bin Liu	059b09b0a8	Merge pull request #5510 from bergwolf/github/runtime-rs-makefile runtime-rs: generate config files with the default target	2022-10-26 20:29:17 +08:00
David Esparza	4d6c3bd0fa	Merge pull request #5515 from cmaf/docs-fix-sgx-k8s-volumemount docs: Fix volumeMounts in SGX usage example	2022-10-26 07:24:31 -05:00
Chelsea Mafrica	219919e9f7	docs: Fix volumeMounts in SGX usage example The /dev/sgx is not mounted and the enclave is not available, causing the demo job to report an error in the logs. Add volumeMounts to container in order to have the device available in the container. Fixes: #5514 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2022-10-25 23:20:49 -07:00
Archana Shinde	c0f5bc81b7	cargo: Add Cargo.lock to version control Add Cargo.lock to capture state of build. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-10-25 20:34:40 -07:00
Archana Shinde	474927ec90	gitignore: Add gitignore file Ignore autogenerated version.rs Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-10-25 20:34:40 -07:00
Archana Shinde	699f821e12	utils: Add function to drop privileges This function is meant to be used before operations such as accessing network to make sure those operations are not performed as a privileged user. Fixes: #5331 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-10-25 20:34:40 -07:00
Peng Tao	a6fb4e2a68	versions: bump golangci-lint version There is little point to maintain backward compatibility for golangci-lint. Let's just use a unified version of it. Fixes: #5512 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-10-26 10:41:24 +08:00
Peng Tao	b015f34aff	runtime-rs: generate config files with the default target Right now it is not generated with a simple `make`. Fixes: #5509 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-10-26 10:25:29 +08:00
Yuan-Zhuo	d7bb4b5512	agent: support systemd cgroup for kata agent 1. Implemented a rust module for operating cgroups through systemd with the help of zbus (src/agent/rustjail/src/cgroups/systemd). 2. Add support for optional cgroup configuration through fs and systemd at agent (src/agent/rustjail/src/container.rs). 3. Described the usage and supported properties of the agent systemd cgroup (docs/design/agent-systemd-cgroup.md). Fixes: #4336 Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>	2022-10-25 13:57:09 +08:00
Bo Chen	a151d8ee50	Merge pull request #5493 from fidencio/topic/update-clh versions: Update Cloud Hypervisor to b4e39427080	2022-10-24 07:54:02 -07:00
Bin Liu	0f7088a4b1	Merge pull request #5501 from openanolis/update_install_guide docs: update rust runtime installation guide	2022-10-24 17:49:34 +08:00
Bin Liu	4696eadfeb	Merge pull request #5488 from ManaSugi/fix/update-libseccomp-crate rustjail: Upgrade libseccomp crate to v0.3.0	2022-10-24 17:03:30 +08:00
Bin Liu	badb2600b3	Merge pull request #5474 from openanolis/makefile makefile: remove sudo when create symbolic link	2022-10-24 17:03:20 +08:00
Bin Liu	ab5f97759d	Merge pull request #5497 from Rouzip/remove-redundant agent: remove redundant checks	2022-10-24 16:41:49 +08:00
Fabiano Fidêncio	190e623c40	Merge pull request #5317 from Champ-Goblem/fix-containerd-stats shim: Ensure pagesize is set when reporting hugetlb stats	2022-10-24 10:24:49 +02:00
Fabiano Fidêncio	7248cf51c5	Merge pull request #5447 from hbrueckner/fix-5438 kata-ctl: Re-enable network tests on s390x (fixes 5438)	2022-10-24 10:23:35 +02:00
Zhongtao Hu	144efd1a7a	docs: update rust runtime installation guide As kata-deploy supports the rust runtime, we need to update the installation docs Fixes:#5500 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-10-24 15:55:30 +08:00
James O. D. Hunt	65ef2a0a0b	Merge pull request #5089 from liubin/fix/4895-ignore-exit-error agent: use NLM_F_REPLACE replace NLM_F_EXCL in rtnetlink	2022-10-24 08:46:54 +01:00
Zhongtao Hu	164ecca3f0	Merge pull request #5499 from zhaoxuat/main fix readme content error at doc directory	2022-10-24 14:15:52 +08:00
zhaoxu	abf4f9b299	docs: kata 3.0 Architecture fix readme content error Fixes: #5498 Signed-off-by: zhaoxu <zhaoxu@megvii.com>	2022-10-24 11:07:34 +08:00
snir911	ee189d2ebe	Merge pull request #5455 from kata-containers/main-validate-hp-size agent: validate hugepage size is supported	2022-10-23 08:15:05 +03:00
Rouzip	44d8de8923	agent: remove redundant checks Remove redundant checks for executable files. Fixes: #3730 Signed-off-by: Rouzip <1226015390@qq.com>	2022-10-22 23:31:18 +08:00
Fabiano Fidêncio	9d286af7b4	versions: Update Cloud Hypervisor to b4e39427080 An API change, done a long time ago, has been exposed on Cloud Hypervisor and we should update it on the Kata Containers side to ensure it doesn't affect Cloud Hypervisor CI and because the change is needed for an upcoming work to get QAT working with Cloud Hypervisor. Fixes: #5492 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-10-21 20:52:54 +02:00
Bin Liu	081ee48713	agent: use NLM_F_REPLACE replace NLM_F_EXCL in rtnetlink Sometimes we will face an EEXIST error when adding an arp neighbour. Using NLM_F_REPLACE instead of NLM_F_EXCL avoids the failure if the entry already exists. See https://man7.org/linux/man-pages/man7/netlink.7.html Fixes: #4895 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-10-21 21:19:14 +08:00
Hendrik Brueckner	e95089b716	kata-ctl: add basic cpu check for s390x Add a basic s390x cpu check for the "sie" feature to be present. Also re-enable cpu check testing. Fixes: #5438 Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>	2022-10-21 12:04:28 +00:00
Hendrik Brueckner	871d2cf2c0	kata-ctl: Limit running tests to x86 and use native-tls on s390x For s390x, use native-tls for reqwest because the rustls-tls/ring dependency is not available for s390x. Also exclude s390x, powerpc64le, and aarch64 from running the cpu check due to the lack of the arch-specific implementation. In this case, rust complains about unused functions in src/check.rs (both normal and test context). Fixes: #5438 Co-authored-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>	2022-10-21 11:54:26 +00:00
Manabu Sugimoto	cbd84c3f5a	rustjail: Upgrade libseccomp crate to v0.3.0 The libseccomp crate v0.3.0 has been released, so use it in the agent. Fixes: #5487 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-10-21 15:40:05 +09:00
Bin Liu	1bf64c9a11	Merge pull request #5453 from openanolis/chao/fix_comment_typo Makefile: fix an typo in runtime-rs makefile	2022-10-21 14:36:39 +08:00
David Esparza	1c159d83ea	Merge pull request #5465 from fidencio/topic/re-work-QEMU-dockerfile qemu: Re-work static-build Dockerfile	2022-10-20 13:32:03 -05:00
Zhongtao Hu	748be0fe3d	makefile: remove sudo when create symbolic link when using mock to package rpm, we cannot have sudo permission Fixes: #5473 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-10-20 22:13:21 +08:00
Bin Liu	cd27ad144e	Merge pull request #5219 from openanolis/krt-modify Modify agent-url return value in runtime-rs	2022-10-20 11:17:29 +08:00
Fabiano Fidêncio	227e717d27	qemu: Re-work static-build Dockerfile Differently than every single other bit that's part of our repo, QEMU has been using a single Dockerfile that prepares an environment where the project can be built, but also building the project as part of that very same Dockerfile. This is a problem, for several different reasons, including: * It's very hard to have a reproducible build if you don't have an archived image of the builder * One cannot cache / upload the image of the builder, as that contains already a specific version of QEMU * Every single CI run we end up building the builder image, which includes building dependencies (such as liburing) Let's split the logic into a new build script, and pass the build script to be executed inside the builder image, which will be only responsible for providing an environment where QEMU can be built. Fixes: #5464 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-10-19 21:34:36 +02:00
Bin Liu	faf363db75	Merge pull request #5414 from openanolis/chao/regulate_runtime_rs_makefile_comments runtime-rs: regulate the comment in runtime-rs makefile	2022-10-19 15:36:00 +08:00
Snir Sheriber	72738dc11f	agent: validate hugepage size is supported before setting a limit, otherwise paths may not be found. guest supporting different hugepage size is more likely with peer-pods where podvm may use different flavor. Fixes: #5191 Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-10-19 09:55:33 +03:00
Chao Wu	f74e328fff	Makefile: fix an typo in runtime-rs makefile There is a typo in runtime-rs makefile. _dragonball should be _DB fixes: #5452 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-10-19 14:12:48 +08:00
Chao Wu	f205472b01	Makefile: regulate the comment style for the runtime-rs comments In runtime-rs makefile, we use ``` ``` to let make help print out help information for variables and targets, but later commits forgot this rule. So we need to follow the previous rule and change the current comments. fixes: #5413 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-10-19 12:12:50 +08:00
Fabiano Fidêncio	c97b7b18e7	Merge pull request #5416 from zvonkok/patch-1 doc: Update how-to-run-kata-containers-with-SNP-VMs.md	2022-10-18 22:45:05 +02:00
Hendrik Brueckner	9f2c7e47c9	Revert "kata-ctl: Disable network check on s390x" This reverts commit `00981b3c0a`. Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>	2022-10-18 11:12:18 +00:00
James O. D. Hunt	dd60a0298d	Merge pull request #5439 from jodh-intel/kata-ctl-s390x-disable-tls kata-ctl: Disable network check on s390x	2022-10-18 09:58:09 +01:00
Zvonko Kaiser	ac403cfa5a	doc: Update how-to-run-kata-containers-with-SNP-VMs.md If the needed libraries (for virtfs) are installed on the host, QEMU will pick it up and enable it. If not installed and you do not enable the flag, QEMU will just ignore it, and you end up without 9p support. Enabling it explicitly will fail if the needed libs are not installed so this way we can be sure that it gets built. Fixes: #5418 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2022-10-17 05:56:19 -07:00
James O. D. Hunt	00981b3c0a	kata-ctl: Disable network check on s390x s390x apparently does not support rust-tls, which is required by the network check (due to the `reqwest` crate dependency). Disable the network check on s390x until we can find a solution to the problem. > Note: > > This fix is assumed to be a temporary one until we find a solution. > Hence, I have not moved the network check code (which should be entirely > generic) into an architecture specific module. Fixes: #5435. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-17 10:24:06 +01:00
Rouzip	39363ffbfb	runtime: remove same function Add EnterNetNS in virtcontainers to remove same function. Fixes: #5394 Signed-off-by: Rouzip <1226015390@qq.com>	2022-10-17 10:59:13 +08:00
James O. D. Hunt	c322d1d12a	kata-ctl: arch: Improve check call Rework the architecture-specific `check()` call by moving all the conditional logic out of the function. Fixes: #5402. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-15 11:41:53 +01:00
Fabiano Fidêncio	ff8bfdfe3b	Merge pull request #5426 from fidencio/topic/build-virtiofsd-in-a-2nd-layer-container virtiofsd: Build inside a container	2022-10-15 00:26:56 +02:00
Fabiano Fidêncio	0bc5baafb9	snap: Build virtiofsd using the kata-deploy scripts Let's build virtiofsd using the kata-deploy build scripts, which simplifies and unifies the way we build our components. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-10-14 13:44:03 +02:00
Fabiano Fidêncio	cb4ef4734f	snap: Create a task for installing docker Let's have the docker installation / configuration as part of its own task, which can be set as a dependency of other tasks which may or may not depend on docker. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-10-14 12:41:21 +02:00
Fabiano Fidêncio	7e5941c578	virtiofsd: Build inside a container When moving to building the CI artefacts using the kata-deploy scripts, we've noticed that the build would fail on any machine where the tarball wasn't officially provided. This happens as rust is missing from the 1st layer container. However, it's a very common practice to leave the 1st layer container with the minimum possible dependencies and install whatever is needed for building a specific component in a 2nd layer container, which virtiofsd never had. In this commit we introduce the second layer containers (yes, containers), one for building virtiofsd using musl, and one for building virtiofsd using glibc. The reason for taking this approach was to actually simplify the scripts and avoid building the dependencies (libseccomp, libcap-ng) using musl libc. Fixes: #5425 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-10-14 12:41:21 +02:00
Zhongtao Hu	5d17cbeef7	Merge pull request #5383 from openanolis/chao/update_comments_in_event_manager Dragonball: remove redundant comments in event manager	2022-10-14 15:50:37 +08:00
Fabiano Fidêncio	c745d6648d	Merge pull request #5420 from fidencio/topic/update-tdx-qemu-repo versions: Update TDX QEMU	2022-10-13 20:57:37 +02:00
Bin Liu	b23a24ab2f	Merge pull request #5417 from liubin/fix/typo-get_contaier_type runtime-rs: fix typo get_contaier_type to get_container_type	2022-10-13 22:35:23 +08:00
Bin Liu	c7b38532f0	Merge pull request #5412 from tzY15368/improve-cmd-descriptions kata-ctl: improve command descriptions for consistency	2022-10-13 19:17:42 +08:00
Fabiano Fidêncio	35d52d30fd	versions: Update TDX QEMU The previously used repo will be removed by Intel, as done with the one used for TDX kernel. The TDX team has already worked on providing the patches that were hosted atop of the QEMU commit with the following hash 4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0 as a tarball in the https://github.com/intel/tdx-tools repo, see https://github.com/intel/tdx-tools/pull/162. On the Kata Containers side, in order to simplify the process and to avoid adding hundreds of patches to our repo, we've revived the https://github.com/kata-containers/qemu repo, and created a branch and a tag with those hundreds of patches atop of the QEMU commit hash 4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0. The branch is called 4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0-plus-TDX-v3.1 and the tag is called TDX-v3.1. Knowing the whole background, let's switch the repo we're getting the TDX QEMU from. Fixes: #5419 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-10-13 11:53:29 +02:00
Bin Liu	4d9dd8790d	runtime-rs: fix typo get_contaier_type to get_container_type Change get_contaier_type to get_container_type Fixes: #5415 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-10-13 17:12:43 +08:00
Bin Liu	2de29b6f69	Merge pull request #5088 from liubin/fix/5087-force-shutdown-shim runtime-rs: force shutdown shim process in it can't exit	2022-10-13 16:55:05 +08:00
Fabiano Fidêncio	d934d87482	Merge pull request #5404 from fidencio/topic/update-tdx-kernel-repo versions: Update TDX kernel	2022-10-13 09:14:44 +02:00
Tingzhou Yuan	70676d4a99	kata-ctl: improve command descriptions for consistency This change improves the command descriptions for kata-ctl and can avoid certain confusions in command functionality. Fixes #5411 Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>	2022-10-13 04:10:23 +00:00
Bin Liu	3b70c72436	Merge pull request #5395 from wllenyj/dragonball-s390 ci: skip s390x for dragonball.	2022-10-13 09:03:08 +08:00
Bin Liu	157d3cdcb1	Merge pull request #5397 from openanolis/chao/delete_redundant_dragonball_comment Dragonball: delete redundant comments in blk_dev_mgr	2022-10-13 09:01:59 +08:00
Fabiano Fidêncio	9eb73d543a	versions: Update TDX kernel The previously used repo has been removed by Intel. As this happened, the TDX team worked on providing the patches that were hosted atop of the v5.15 kernel as a tarball present in the https://github.com/intel/tdx-tools repo, see https://github.com/intel/tdx-tools/pull/161. On the Kata Containers side, in order to simplify the process and to avoid adding ~1400 kernel patches to our repo, we've revived the https://github.com/kata-containers/linux repo, and created a branch and a tag with those ~1400 patches atop of the v5.15. The branch is called v5.15-plus-TDX, and the tag is called 5.15-plus-TDX (in order to avoid having to change how the kernel builder script deals with versioning). Knowing the whole background, let's switch the repo we're getting the TDX kernel from. Fixes: #5326 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-10-12 16:54:43 +02:00
James O. D. Hunt	d3ee8d9f1b	Merge pull request #5388 from jodh-intel/kata-ctl kata-ctl: Move development to main branch	2022-10-12 14:29:35 +01:00
James O. D. Hunt	00a42f69c0	kata-ctl: cargo: 2021 -> 2018 Revert to the 2018 edition of rust for consistency with other rust components. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-12 11:46:51 +01:00
James O. D. Hunt	fb63274747	kata-ctl: rustfmt + clippy fixes Make this file conform to the standard rust layout conventions and simplify the code as recommended by `clippy`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-12 11:46:48 +01:00
wllenyj	1f1901e059	dragonball: fix clippy warning for aarch64 Added aarch64 check. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-10-12 18:29:00 +08:00
wllenyj	a343c570e4	dragonball: enhance dragonball ci Unified use of Makefile instead of calling `cargo test` directly. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-10-12 17:53:01 +08:00
wllenyj	6a64fb0eb3	ci: skip s390x for dragonball. Currently, Dragonball only supports x86_64 and aarch64 platforms. Fixes: #4381 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-10-12 15:27:45 +08:00
Bin Liu	7aacba0abc	Merge pull request #5282 from liubin/fix/4730-rs-emptydir runtime-rs: support ephemeral storage for emptydir	2022-10-12 09:53:59 +08:00
Chao Wu	a743e37daf	Dragonball: delete redundant comments in blk_dev_mgr delete redundant derive part for BlockDeviceMgr. fixes: #5396 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-10-11 19:41:47 +08:00
Chao Wu	d2bf2f5dd0	Merge pull request #5393 from LetFu/5392/fixInstallKata30RustRuntimeShimGuideTypo docs: fix a typo in rust-runtime-installation-guide	2022-10-11 19:27:31 +08:00
James O. D. Hunt	2b345ba29d	build: Add kata-ctl to tools list Update the top-level Makefile to build the `kata-ctl` tool by default. Fixes: #4499, #5334. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-11 10:05:16 +01:00
James O. D. Hunt	f7010b8061	kata-ctl: docs: Write basic documentation Provide a basic document explaining a little about the `kata-ctl` command. Fixes: #5351. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-11 10:04:48 +01:00
Bin Liu	ffdd7e1ad8	Merge pull request #4961 from wllenyj/dragonball-ut-2 Built-in Sandbox: add more unit tests for dragonball	2022-10-11 14:12:25 +08:00
Bin Liu	39702c19d5	Merge pull request #5276 from bergwolf/github/readme readme: remove libraries mentioning	2022-10-11 13:19:18 +08:00
chmod100	862eaef863	docs: fix a typo in rust-runtime-installation-guide Fixes: #5392 Signed-off-by: chmod100 <letfu@outlook.com>	2022-10-11 02:31:29 +00:00
wllenyj	26c043dee7	ci: Add dragonball test Enhanced Static-Check of CI to support nested virtualization. Fixes: #5378 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-10-11 00:36:20 +08:00
James O. D. Hunt	781e604c39	docs: Reference kata-ctl README Add a link to the `kata-ctl` tool's README. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-10 16:49:53 +01:00
James O. D. Hunt	15c343cbf2	kata-ctl: Don't rely on system ssl libs Build using the rust TLS implementation rather than the system ones. This resolves the `reqwest` crate build failure: it doesn't appear to build against the native libssl libraries due to Kata defaulting to using the musl libc. Fixes: #5387. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-10 13:42:51 +01:00
James O. D. Hunt	c23584994a	kata-ctl: clippy: Resolve warnings and reformat Resolved a couple of clippy warnings and applied standard `rustfmt`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-10 13:42:51 +01:00
David Esparza	133690434c	kata-ctl: implement CLI argument --check-version-only This kata-ctl argument returns the latest stable Kata release by hitting github.com. Adds check-version unit tests. Fixes: #11 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2022-10-10 13:42:51 +01:00
David Esparza	eb5423cb7f	kata-ctl: switch to use clap derive for CLI handling Switch from the functional version of `clap` to the declarative methodology. Signed-off-by: David Esparza <david.esparza.borquez@intel.com> Commit-edited-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-10 13:42:51 +01:00
Chelsea Mafrica	018aa899cb	kata-ctl: Add cpu check Add architecture-specific code for x86_64 and generic calls handling checks for CPU flags and attributes. Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2022-10-10 13:42:50 +01:00
James O. D. Hunt	7c9f9a5a1d	kata-ctl: Make arch test run at compile time Changed the `panic!()` call to a `compile_error!()` one to ensure it fires at compile time rather than runtime. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-10 13:42:50 +01:00
James O. D. Hunt	b63ba66dc3	kata-ctl: Formatting tweaks Automatic format updates. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-10 13:42:50 +01:00
James O. D. Hunt	cca7e32b54	kata-ctl: Lint fixes to allow the branch to be built Remove return value for branches that call `unimplemented!()`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-10 13:42:50 +01:00
Chelsea Mafrica	8e7bb8521c	kata-ctl: add code for framework for arch Add framework for different architectures for check. In the existing kata-runtime check, the network checks do not appear to be architecture-specific while the kernel module, cpu, and kvm checks do have separate implementations for different architectures. Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2022-10-10 13:42:50 +01:00
David Esparza	303fc8b118	kata-ctl: Add unit tests cases Add more unit tests cases to --version argument. Signed-off-by: David Esparza <david.esparza.borquez@intel.com> Commit-edited-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-10 13:42:43 +01:00
David Esparza	d0b33e9a32	versions: Add kata-ctl version entry As we're switching to the Rust version of kata-ctl, let's provide it with its own entry in the kata-ctl command line. Signed-off-by: David Esparza <david.esparza.borquez@intel.com> Commit-edited-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-10-10 13:42:35 +01:00
Chelsea Mafrica	002b18054d	kata-ctl: Add initial rust code for kata-ctl Use agent-ctl tool rust code as an example for a skeleton for the new kata-ctl tool. Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2022-10-10 10:10:37 +01:00
wllenyj	b62b18bf1c	dragonball: fix clippy warning Fixed: - unnecessary_lazy_evaluations - derive_partial_eq_without_eq - redundant_closure - single_match - question_mark - unused-must-use - redundant_clone - needless_return Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-10-10 16:41:40 +08:00
wllenyj	2ddc948d30	Makefile: add dragonball components. Enable ci to run dragonball unit tests. Fixes: #4899 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-10-10 16:41:40 +08:00
wllenyj	3fe81fe4ab	dragonball-ut: use skip_if_not_root to skip root case Use skip_if_not_root to skip when unit test requires privileges. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-10-10 16:41:40 +08:00
wllenyj	72259f101a	dragonball: add more unit test for vmm actions Added more unit tests for vmm actions. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-10-10 16:41:39 +08:00
Peng Tao	acd72c44d4	Merge pull request #5380 from bergwolf/3.1.0-alpha0-branch-bump # Kata Containers 3.1.0-alpha0	2022-10-09 16:16:36 +08:00
Chao Wu	9717dc3f75	Dragonball: remove redundant comments in event manager handle_events for EventManager doesn't take max_events as arguments, so we need to update the comments for it. p.s. max_events is defined when initializing the EventManager. fixes: #5382 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-10-09 14:38:12 +08:00
Peng Tao	ee74231b1c	release: Kata Containers 3.1.0-alpha0 - libs/kata-types: adjust default_vcpus correctly - runtime-rs: delete duplicated PASSTHROUGH_FS_DIR const - Enable ACRN hypervisor support for Kata 2.x release - agent: reduce reference count for failed mount - agent: don't exit early if signal fails due to ESRCH - kata-sys-util: delete duplicated get_bundle_path - packaging: Mount $HOME/.docker in the 1st layer container - Upgrade to Cloud Hypervisor v27.0 - microvm: Remove kernel_irqchip=on option - kata-sys-util: fix typo `unknow` - dragonball: update ut for kernel config - versions: Update gperf url to avoid libseccomp random failures - versions: Update oci version - dragonball: fix no "as_str" error on Arm - tools: release: fix bogus version check - runtime-rs: update Cargo.lock - refactor(runtime-rs): Use RwLock in runtime-agent - runtime-rs: fix shim close_io call to support kubectl cp - runtime-rs: add comments for runtime-rs shared directory - workflow: trigger test-kata-deploy with pull_request and fix workflow_dispatch - Dragonball: update linux_loader to 0.6.0 - modify virtio_net_dev_mgr.rs wrong code comments - docs: Update urls in runk documentation - runtime-rs: support watchable mount - runtime-rs: debug console support in runtime - kata-deploy: ship the rustified runtime binary - runtime-rs: define VFIO unbind path as a const - runtime-rs: set agent timeout to 0 for stream RPCs - Added SNP-Support for Kata-Containers - packaging: fix typo in configure-hypervisor.sh - runtime/runtime-rs: update dependency - release: Revert kata-deploy changes after 3.0.0-rc0 release - runtime-rs: add test for StaticResource - runtime-rs: remove hardcoded string - docs: add README for runtime-rs hypervisor crate - runtime-rs: use Path.is_file to check regular files - osbuilder: Export directory variables for libseccomp - runtime-rs: add unit tests for network resource - runtime-rs/resource: use macro to reduce duplicated code - runtime-rs: fix incorrect 
comments - kernel: Add crypto kernel config for s390 - Non-root hypervisor uid reuse bug - Build-in Sandbox: update dragonball-sandbox dependencies - docs: Update url in virtualization document - dragonball: Fix problem that stdio console cannot connect to stdout - runtime-rs: call TomlConfig's validate function after load - feat(Shimmgmt): Shim management server and client `53f209af4` libs/kata-types: adjust default_vcpus correctly `ef5a2dc3b` agent: don't exit early if signal fails due to ESRCH `435c8f181` acrn: Enable ACRN hypervisor support for Kata 2.x release `c31cf7269` agent: reduce reference count for failed mount `4da743f90` packaging: Mount $HOME/.docker in the 1st layer container `067e2b1e3` runtime: clh: Use the new API to boot with TDX firmware (td-shim) `5d63fcf34` runtime: clh: Re-generate the client code `fe6107042` versions: Upgrade to Cloud Hypervisor v27.0 `17de94e11` microvm: Remove kernel_irqchip=on option `3aeaa6459` runtime-rs: delete duplicated PASSTHROUGH_FS_DIR const `43ae97233` kata-sys-util: delete duplicated get_bundle_path `ac0483122` kata-sys-util: fix typo `unknow` `a24127659` versions: Update gperf url to avoid libseccomp random failures `a617a6348` versions: Update oci version `6d585d591` dragonball: fix no "as_str" error on Arm `421729f99` tools: release: fix bogus version check `457b0beaf` runtime-rs: update Cargo.lock `f89ada2de` dragonball: update ut for kernel config `0e899669e` runtime-rs: fix shim close_io call to support kubectl cp `96cf21fad` runtime-rs: add comments for runtime-rs shared directory `9bd941098` docs: Update urls in runk documentation `90ecc015e` Dragonball: update linux_loader to 0.6.0 `4a763925e` runtime-rs: support watchable mount `abc26b00b` dragonball: modify wrong code comments modify virtio_net_dev_mgr.rs wrong code comments `20bcaf0e3` runtime-rs: set agent timeout to 0 for stream RPCs `274de024c` docs: add README for runtime-rs hypervisor crate `a4a23457c` osbuilder: Export directory variables for 
libseccomp `d663f110d` kata-deploy: get the config path from cri options `c6b3dcb67` kata-deploy: support kata-deploy for runtime-rs `46965739a` runtime-rs: remove hardcoded string `a394761a5` kata-deploy: add installation for runtime-rs `50299a329` refactor(runtime-rs): Use RwLock in runtime agent `9628c7df0` runtime: update runc dependency `7fbc88387` runtime-rs: drop dependency on rustc-serialize `bf2be0cf7` release: Revert kata-deploy changes after 3.0.0-rc0 release `e23bfd615` runtime-rs: make function name more understandable `426a43678` runtime-rs: add unit test and eliminate raw string `87959cb72` runtime-rs: debug console support in runtime `d55cf9ab7` docs: Update url in virtualization document `0399da677` runtime-rs: update dependencies `f6f19917a` dragonball: update dragonball-sandbox dependencies `2caee1f38` runtime-rs: define VFIO unbind path as a const `3f65ff2d0` runtime-rs: fix incorrect comments `9670a3caa` runtime-rs: use Path.is_file to check regular files `d9e6eb11a` docs: Guide to use SNP-VMs with Kata-Containers `ded60173d` runtime: Enable choice between AMD SEV and SNP `22bda0838` runtime: Support for AMD SEV-SNP VMs `a2bbd2942` kernel: Introduce SNP kernel `0e69405e1` docs: Developer-Guide updated `105eda5b9` runtime: Initrd path option added to config `a8a8a28a3` runtime-rs/resource: use macro to reduce duplicated code `7622452f4` Dragonball: Fix the problem about stdio console `208233288` runtime-rs: add test for StaticResource `adb33a412` packaging: fix typo in configure-hypervisor.sh `f91431987` runtime: store the user name in hypervisor config `86a02c5f6` kernel: Add crypto kernel config for s390 `5cafe2177` runtime: make StopVM thread-safe `c3015927a` runtime: add more debug logs for non-root user operation `5add50aea` runtime-rs: timeout for shim management client `9f13496e1` runtime-rs: shim management client `aaf6d6908` runtime-rs: call TomlConfig's validate function after load `e891295e1` runtime-rs: shim management - agent-url 
`59aeb776b` runtime-rs: shim management `a828292b4` runtime-rs: add unit tests for network resource `7676cde0c` workflow: trigger test-kata-deploy with pull_request `f10827357` workflow: require PR num input on test-kata-deploy workflow_dispatch 428d6dc80 workflow: Revert "workflow: trigger test-kata-deploy with pull_request" Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-10-09 11:50:42 +08:00
Peng Tao	102a9dda71	workflow: Revert "workflow: trigger test-kata-deploy with pull_request" This reverts commit `7676cde0c5`. It turns out that when triggered from a PR, the docker login command fails with ``` Error: Cannot perform an interactive login from a non TTY device ``` Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-10-09 11:50:42 +08:00
Fupan Li	2c88e1cd80	Merge pull request #5302 from liubin/fix/5285-SetFsSharingSupport-comment runtime: fix incorrect comment for SetFsSharingSupport function	2022-10-09 09:40:31 +08:00
Bin Liu	b556c9b986	Merge pull request #5235 from YchauWang/wyc-qmp-log virtcontainers: add warn log record for qmp hotplug cpu error	2022-10-09 08:29:09 +08:00
Bin Liu	07201c7fe5	Merge pull request #5111 from liubin/fix/5110-adjust-default-vcpus libs/kata-types: adjust default_vcpus correctly	2022-10-08 20:29:53 +08:00
Bin Liu	53f209af44	libs/kata-types: adjust default_vcpus correctly With default_maxvcpus = 0 and default_vcpus = 1, default_vcpus is set to 0, which makes startup fail. default_maxvcpus is not adjusted correctly when it is set to 0, and default_vcpus ends up as 0. The correct behavior is to set default_maxvcpus to the maximum number of CPUs or MAX_DRAGONBALL_VCPUS, and to set default_vcpus to the desired value if that value is between 0 and default_maxvcpus. Fixes: #5110 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-10-08 16:52:05 +08:00
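The adjustment described in this commit can be sketched roughly as follows. This is a minimal Python model, not the kata-types code (which is Rust): the function name `adjust_vcpus`, the cap value 256, and the fallback choices for out-of-range values are assumptions made for illustration only.

```python
# Hypothetical stand-in for the Dragonball vCPU cap; the value is an assumption.
MAX_DRAGONBALL_VCPUS = 256


def adjust_vcpus(default_vcpus, default_maxvcpus, host_cpus,
                 cap=MAX_DRAGONBALL_VCPUS):
    """Return an adjusted (default_vcpus, default_maxvcpus) pair."""
    upper = min(host_cpus, cap)
    # A maximum of 0 means "not set": use the host CPU count or the cap.
    if default_maxvcpus <= 0 or default_maxvcpus > upper:
        default_maxvcpus = upper
    # Keep the requested value when it is within range. The fallbacks
    # below are assumptions; the commit message does not spell them out.
    if default_vcpus <= 0:
        default_vcpus = 1
    elif default_vcpus > default_maxvcpus:
        default_vcpus = default_maxvcpus
    return default_vcpus, default_maxvcpus
```

With the settings from the commit message (default_maxvcpus = 0, default_vcpus = 1) on an 8-CPU host, this sketch keeps default_vcpus at 1 instead of collapsing it to 0.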
Bin Liu	dd34540b8a	Merge pull request #5305 from liubin/fix/5301-delete-duplicated-PASSTHROUGH_FS_DIR runtime-rs: delete duplicated PASSTHROUGH_FS_DIR const	2022-10-08 16:39:03 +08:00
Ji-Xinyou	9c1ac3d457	runtime-rs: return port on agent-url req Add the server vport (1024) when requesting agent-url Fixes: #5213 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-10-08 16:14:21 +08:00
Fabiano Fidêncio	ce73bc6dac	Merge pull request #5015 from vijaydhanraj/enable_acrn_kata2.x Enable ACRN hypervisor support for Kata 2.x release	2022-10-08 09:27:59 +02:00
Bin Liu	4616363eec	Merge pull request #5365 from fengwang666/mount-bug-fix agent: reduce reference count for failed mount	2022-10-08 14:27:38 +08:00
Fupan Li	1b7272c7ca	Merge pull request #5367 from fengwang666/signal-bug-fix agent: don't exit early if signal fails due to ESRCH	2022-10-08 14:21:50 +08:00
Feng Wang	ef5a2dc3bf	agent: don't exit early if signal fails due to ESRCH ESRCH usually means the process has exited. In this case, the execution should continue to kill remaining container processes. Fixes: #5366 Signed-off-by: Feng Wang <feng.wang@databricks.com> [Fix up cargo updates] Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-10-08 12:15:12 +08:00
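The behavior this commit describes, continuing past ESRCH so the remaining processes still get signalled, can be illustrated with a short Python sketch. The real agent is written in Rust; the `signal_all` helper and its injectable `kill` parameter are hypothetical names used only for this example.

```python
import errno
import os
import signal


def signal_all(pids, sig, kill=os.kill):
    """Send sig to every pid, skipping processes that have already exited.

    Returns the list of pids that were already gone (ESRCH).
    """
    gone = []
    for pid in pids:
        try:
            kill(pid, sig)
        except OSError as err:
            if err.errno == errno.ESRCH:
                # The process no longer exists: record it and keep going
                # so the remaining processes are still signalled.
                gone.append(pid)
                continue
            raise  # any other error is still treated as fatal
    return gone
```

Calling `signal_all(pids, signal.SIGKILL)` would then only raise for errors other than "no such process".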
Bin Liu	5ace4e2354	Merge pull request #5304 from liubin/fix/5299-delete-duplicated-get_bundle_path kata-sys-util: delete duplicated get_bundle_path	2022-10-08 10:57:52 +08:00
Vijay Dhanraj	435c8f181a	acrn: Enable ACRN hypervisor support for Kata 2.x release Currently ACRN hypervisor support in Kata2.x releases is broken. This commit re-enables ACRN hypervisor support and also refactors the code so as to remove dependency on Sandbox. Fixes #3027 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>	2022-10-07 07:40:32 -07:00
Feng Wang	c31cf7269e	agent: reduce reference count for failed mount The kata agent adds a reference for each storage object before mounting it, and skips the mount if the storage object is already known. We need to remove the object reference if the mount fails. Fixes: #5364 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-10-06 21:37:59 -07:00
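The reference-counting rule in this commit can be modelled in a few lines of Python. This is only a sketch of the idea, not the agent's Rust implementation; `StorageTracker` and its `mount` method are hypothetical names.

```python
class StorageTracker:
    """Toy model of per-storage reference counting around a mount call."""

    def __init__(self):
        self.refs = {}

    def mount(self, path, do_mount):
        if path in self.refs:
            # Already known: take another reference and skip the mount.
            self.refs[path] += 1
            return
        # Take the reference before mounting.
        self.refs[path] = 1
        try:
            do_mount(path)
        except Exception:
            # The mount failed, so drop the reference again; otherwise a
            # later retry would see the path as known and skip the mount.
            del self.refs[path]
            raise
```

Without the cleanup in the `except` branch, a retry after a failed mount would be silently skipped, because the storage object would still look "known".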
Fabiano Fidêncio	ff62cedd26	Merge pull request #5323 from fidencio/topic/fix-kata-deploy-build-behind-proxy packaging: Mount $HOME/.docker in the 1st layer container	2022-10-05 21:18:29 +02:00
Fabiano Fidêncio	4da743f90b	packaging: Mount $HOME/.docker in the 1st layer container In order to ensure that the proxy configuration is passed to the 2nd layer container, let's ensure the $HOME/.docker/config.json file is exposed inside the 1st layer container. For some reason which I still don't fully understand exporting https_proxy / http_proxy / no_proxy was not enough to get those variables exported to the 2nd layer container. In this commit we're creating a "$HOME/.docker" directory, and removing it after the build, in case it doesn't exist yet. The reason we do this is to avoid docker not running in case "$HOME/.docker" doesn't exist. This was not tested with podman, but if there's an issue with podman, the issue was already there beforehand and should be treated as a different problem than the one addressed in this commit. Fixes: #5077 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-10-05 15:25:07 +02:00
Archana Shinde	6e2d39c588	Merge pull request #5311 from likebreath/0930/clh_v27.0 Upgrade to Cloud Hypervisor v27.0	2022-10-04 10:56:00 -07:00
Fabiano Fidêncio	d5572d5fd5	Merge pull request #5106 from norbjd/fix/microvm-machine-options microvm: Remove kernel_irqchip=on option	2022-10-04 12:19:37 +02:00
Champ-Goblem	89e62d4edf	shim: Ensure pagesize is set when reporting hugetbl stats The containerd stats method and metrics API are broken with Kata 2.5.x: the stats fail to load and the metrics API responds with status code 500. This seems to be down to the conversion of the stats reported by the agent RPC `StatsContainer`, where the field `Pagesize` is not filled in by the `setHugetlbStats` method. When stats for hugetlb tables of multiple sizes are reported, this causes containerd to register two metrics with the same label set, rather than each being partitioned by the `page` label. Fixes: #5316 Signed-off-by: Champ-Goblem <cameron@northflank.com>	2022-10-04 09:16:30 +01:00
Bo Chen	067e2b1e33	runtime: clh: Use the new API to boot with TDX firmware (td-shim) The new way to boot from TDX firmware (e.g. td-shim) is using the combination of '--platform tdx=on' with '--firmware tdshim'. Fixes: #5309 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-10-03 10:30:54 -07:00
Bo Chen	5d63fcf344	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v27.0. Note: The client code of cloud-hypervisor's (CLH) OpenAPI is automatically generated by openapi-generator [1-2]. [1] https://github.com/OpenAPITools/openapi-generator [2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md Fixes: #5309 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-10-03 10:30:42 -07:00
Bo Chen	fe61070426	versions: Upgrade to Cloud Hypervisor v27.0 This release has been tracked in our new [roadmap project](https://github.com/orgs/cloud-hypervisor/projects/6) as iteration v27.0. Community Engagement A new mailing list has been created to support broader community discussions. Please consider [subscribing](https://lists.cloudhypervisor.org/g/dev/); a regular meeting will be announced via this list shortly. Prebuilt Packages Prebuilt packages are now available. Please see this [document](https://github.com/cloud-hypervisor/obs-packaging/blob/main/README.md) on how to install. These packages also include packages for the different firmware options available. Network Device MTU Exposed to Guest The MTU for the TAP device associated with a virtio-net device is now exposed to the guest. If the user provides a MTU with --net mtu=.. then that MTU is applied to created TAP interfaces. This functionality is also exposed for vhost-user-net devices including those created with the reference backend. Boot Tracing Support for generating a trace report for the boot time has been added including a script for generating an SVG from that trace. Simplified Build Feature Flags The set of feature flags, e.g. for experimental features, has been simplified: * mshv and kvm features provide support for those specific hypervisors (with kvm enabled by default), * tdx provides support for Intel TDX; and although there is no MSHV support now it is now possible to compile with the mshv feature, * tracing adds support for boot tracing, * guest_debug now covers both support for gdbing a guest (formerly gdb feature) and dumping guest memory. The following feature flags were removed as the functionality was enabled by default: amx, fwdebug, cmos and common. Asynchronous Kernel Loading AArch64 has gained support for loading the guest kernel asynchronously like x86-64. 
GDB Support for AArch64 GDB stub support (accessed through --gdb under guest_debug feature) is now available on AArch64 as well as x86-64. Notable Bug Fixes * This version incorporates a version of virtio-queue that addresses an issue where a rogue guest can potentially DoS the VMM, * Improvements around PTY handling for virtio-console and serial devices, * Improved error handling in virtio devices. Deprecations Deprecated features will be removed in a subsequent release and users should plan to use alternatives. * Booting legacy firmware (compiled without a PVH header) has been deprecated. All the firmware options (Cloud Hypervisor OVMF and Rust Hypervisor Firmware) support booting with PVH so support for loading firmware in a legacy mode is no longer needed. This functionality will be removed in the next release. Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v27.0 Note: To have the new API of loading firmware for booting (e.g. boot from td-shim), a specific commit revision after the v27.0 release is used as the Cloud Hypervisor version from the 'versions.yaml'. Fixes: #5309 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-10-03 10:25:04 -07:00
Fabiano Fidêncio	0143036b84	Merge pull request #5303 from liubin/fix/5296-typo-unknow kata-sys-util: fix typo `unknow`	2022-10-03 15:29:45 +02:00
norbjd	17de94e118	microvm: Remove kernel_irqchip=on option The `kernel_irqchip` option doesn't seem to bring any benefits and, on the contrary, its usage causes issues when using the microvm machine type. With this in mind, let's remove it. Fixes: #1984, #4386 Signed-off-by: norbjd <norbjd@users.noreply.github.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-10-03 11:48:05 +02:00
Bin Liu	3aeaa6459d	runtime-rs: delete duplicated PASSTHROUGH_FS_DIR const The const PASSTHROUGH_FS_DIR is defined twice; delete one. Fixes: #5301 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-30 15:53:08 +08:00
Bin Liu	43ae972335	kata-sys-util: delete duplicated get_bundle_path get_bundle_path is already defined in spec.rs; delete it from fs.rs. Fixes: #5299 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-30 15:50:58 +08:00
Bin Liu	ac04831223	kata-sys-util: fix typo `unknow` Change `unknow` to `unknown`. Fixes: #5296 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-30 15:47:34 +08:00
Bin Liu	68e8a86aec	runtime: fix incorrect comment for SetFsSharingSupport function The comment for SetFsSharingSupport is not suitable, correct the function name. Fixes: #5285 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-30 15:44:44 +08:00
Bin Liu	805e80b2a2	Merge pull request #5278 from openanolis/chao/update_linux_loader_ut dragonball: update ut for kernel config	2022-09-30 11:12:29 +08:00
Bin Liu	357d323803	Merge pull request #5244 from GabyCT/topic/debugosbuilder versions: Update gperf url to avoid libseccomp random failures	2022-09-30 10:10:54 +08:00
Bin Liu	8d4ced3c86	runtime-rs: support ephemeral storage for emptydir Add support for ephemeral storage and k8s emptydir. Depends-on:github.com/kata-containers/tests#5161 Fixes: #4730 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-30 09:10:20 +08:00
David Esparza	9b033f174b	Merge pull request #5292 from GabyCT/topic/updateoci versions: Update oci version	2022-09-29 16:29:11 -05:00
Greg Kurz	7b4c3c0cab	Merge pull request #5288 from jongwu/fix_cmdline_arm dragonball: fix no "as_str" error on Arm	2022-09-29 18:59:00 +02:00
Gabriela Cervantes	a241276592	versions: Update gperf url to avoid libseccomp random failures This PR updates the gperf url to avoid random failures when installing libseccomp, as it seems that the mirror url produces random network failures in multiple CIs. Fixes #5294 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-09-29 16:52:46 +00:00
Gabriela Cervantes	a617a63481	versions: Update oci version This PR updates the oci version that we are using in kata containers. Fixes #5291 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-09-29 15:32:48 +00:00
Jianyong Wu	6d585d5919	dragonball: fix no "as_str" error on Arm The Cmdline struct was updated in the latest linux-loader lib and its as_str method was changed to as_cstring, so we need to fix the places where the old as_str method is still used. Fixes: #5287 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-09-29 21:06:31 +08:00
Bin Liu	68f6dbb202	Merge pull request #5284 from gkurz/fix-release-script tools: release: fix bogus version check	2022-09-29 20:46:11 +08:00
Greg Kurz	421729f991	tools: release: fix bogus version check Shell expands `*"rc"*` to the top-level `src` directory. This results in comparing a version with a directory name. This doesn't make sense and causes the script to choose the wrong branch of the `if`. The intent of the check is actually to detect `rc` in the version. Fixes: #5283 Signed-off-by: Greg Kurz <groug@kaod.org>	2022-09-29 11:31:43 +02:00
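The failure mode in this commit, a glob pattern containing `rc` being expanded against the working directory before the comparison runs, can be modelled in Python. This is a sketch under assumptions: the release script's exact test is not shown in the log, and `expand_glob`, `buggy_is_rc` and `fixed_is_rc` are hypothetical helpers that only imitate shell pathname expansion.

```python
import fnmatch


def expand_glob(pattern, entries):
    """Imitate shell pathname expansion against a directory listing.

    Returns the matching entries, or the pattern itself when nothing
    matches (the default behavior of a POSIX shell).
    """
    matches = sorted(e for e in entries if fnmatch.fnmatchcase(e, pattern))
    return matches if matches else [pattern]


def buggy_is_rc(version, cwd_entries):
    # The pattern is expanded first, so the version ends up being
    # compared with a directory name such as "src".
    expanded = expand_glob("*rc*", cwd_entries)
    return len(expanded) == 1 and version == expanded[0]


def fixed_is_rc(version):
    # The intended check: does the version string contain "rc"?
    return "rc" in version
```

In a checkout whose top level contains `src`, the buggy model never reports a release candidate, while the substring check does.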
Bin Liu	949ffcc457	Merge pull request #5281 from liubin/fix/5280-update-cargo-lock runtime-rs: update Cargo.lock	2022-09-29 17:16:21 +08:00
Bin Liu	1352e31180	Merge pull request #5200 from openanolis/agent_rwlock refactor(runtime-rs): Use RwLock in runtime-agent	2022-09-29 13:15:41 +08:00
Bin Liu	457b0beaf0	runtime-rs: update Cargo.lock src/dragonball/Cargo.toml was updated but the Cargo.lock was not committed to the repo. Fixes: #5280 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-29 13:15:01 +08:00
Bin Liu	abbdf89a06	Merge pull request #5271 from liubin/fix/4729-add-close-io-for-kubectl-cp runtime-rs: fix shim close_io call to support kubectl cp	2022-09-29 13:10:49 +08:00
Peng Tao	046ddc6463	readme: remove libraries mentioning There are two duplicated mentions of the rust libraries in README.md. Let's just remove them all as the section is intended to list out core Kata components rather than general libraries. Fixes: #5275 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-09-29 12:10:50 +08:00
Chao Wu	f89ada2de1	dragonball: update ut for kernel config Since linux loader is updated in the Dragonball and the api for Cmdline has been changed ( as_str() changed to as_cstring() ), we need to update unit test in Dragonball. fixes: #5277 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-09-29 11:35:45 +08:00
Bin Liu	0e899669ee	runtime-rs: fix shim close_io call to support kubectl cp Add close_io to shim and call agent's close_stdin in close_io. Depends-on:github.com/kata-containers/tests#5155 Fixes: #4729 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-29 09:35:17 +08:00
quanweiZhou	901893163f	Merge pull request #5198 from openanolis/share-fs-comment runtime-rs: add comments for runtime-rs shared directory	2022-09-29 09:12:01 +08:00
Greg Kurz	7294e2fa9e	Merge pull request #4387 from snir911/tmp-workflow-main workflow: trigger test-kata-deploy with pull_request and fix workflow_dispatch	2022-09-28 16:42:51 +02:00
Zhongtao Hu	96cf21fad0	runtime-rs: add comments for runtime-rs shared directory add comments for runtime-rs shared directory Fixes:#5197 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-09-28 15:46:34 +08:00
Zhongtao Hu	2f1a4b02ee	Merge pull request #5254 from openanolis/chao/update_linux_loader Dragonball: update linux_loader to 0.6.0	2022-09-28 15:04:09 +08:00
Bin Liu	0f6884b8c3	Merge pull request #5252 from zhaoxuat/main modify virtio_net_dev_mgr.rs wrong code comments	2022-09-28 11:34:20 +08:00
Bin Liu	d0be4a285e	Merge pull request #5260 from GabyCT/topic/fixrunkdoc docs: Update urls in runk documentation	2022-09-28 11:30:39 +08:00
Zhongtao Hu	ff053b0808	Merge pull request #5220 from liubin/fix/5184-rs-inotify runtime-rs: support watchable mount	2022-09-28 11:19:53 +08:00
Zhongtao Hu	319caa8e74	Merge pull request #5097 from openanolis/dbg-console runtime-rs: debug console support in runtime	2022-09-28 10:30:22 +08:00
Peng Tao	33b0720119	Merge pull request #5193 from openanolis/origin/kata-deploy kata-deploy: ship the rustified runtime binary	2022-09-28 10:19:16 +08:00
Gabriela Cervantes	9bd941098e	docs: Update urls in runk documentation This PR updates the urls that we have in the runk documentation. Fixes #5259 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-09-27 15:45:43 +00:00
Chao Wu	90ecc015e0	Dragonball: update linux_loader to 0.6.0 Since linux-loader 0.4.0 and 0.5.0 were yanked due to a null terminator bug, we need to update linux-loader to 0.6.0. The as_str() calls should also be changed. fixes: #5253 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-09-27 23:01:44 +08:00
Bin Liu	c64e56327f	Merge pull request #5190 from liubin/fix/5189-unbind-as-a-const runtime-rs: define VFIO unbind path as a const	2022-09-27 21:04:18 +08:00
Bin Liu	4a763925e5	runtime-rs: support watchable mount Use watchable mount to support inotify for virtio-fs. Fixes: #5184 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-27 19:08:25 +08:00
zhaoxu	abc26b00bb	dragonball: modify wrong code comments modify virtio_net_dev_mgr.rs wrong code comments Fixes: #5252 Signed-off-by: zhaoxu <zhaoxu@megvii.com>	2022-09-27 18:32:13 +08:00
Bin Liu	c95cf6dce7	Merge pull request #5250 from liubin/fix/5249-set-timeout-to-zero-for-stream-rpc runtime-rs: set agent timeout to 0 for stream RPCs	2022-09-27 17:39:35 +08:00
Peng Tao	8a2df6b31c	Merge pull request #4931 from jpecholt/snp-support Added SNP-Support for Kata-Containers	2022-09-27 14:17:54 +08:00
Bin Liu	41a3bd87a5	Merge pull request #5161 from liubin/fix/5160-typo-in-configure-hypervisor-sh packaging: fix typo in configure-hypervisor.sh	2022-09-27 13:03:39 +08:00
Bin Liu	20bcaf0e36	runtime-rs: set agent timeout to 0 for stream RPCs For stream RPCs: - write_stdin - read_stdout - read_stderr there should be no timeout (by setting it to 0). Fixes: #5249 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-27 11:47:37 +08:00
Bin Liu	407e46b1b7	Merge pull request #5218 from bergwolf/github/deps runtime/runtime-rs: update dependency	2022-09-27 11:02:46 +08:00
Bin Liu	414c6a1578	Merge pull request #5175 from bergwolf/revert-kata-deploy-changes-after-3.0.0-rc0-release release: Revert kata-deploy changes after 3.0.0-rc0 release	2022-09-27 11:02:24 +08:00
Bin Liu	a2f207b923	Merge pull request #5163 from liubin/fix/5162-add-test-for-StaticResource runtime-rs: add test for StaticResource	2022-09-26 17:44:20 +08:00
Zhongtao Hu	9d67f5a7e2	Merge pull request #5230 from openanolis/nohc runtime-rs: remove hardcoded string	2022-09-26 16:01:41 +08:00
quanweiZhou	ad87c7ac56	Merge pull request #5206 from openanolis/hypervisor/readme docs: add README for runtime-rs hypervisor crate	2022-09-26 16:01:12 +08:00
Bin Liu	5a98fb8d2b	Merge pull request #5186 from liubin/fix/5185 runtime-rs: use Path.is_file to check regular files	2022-09-26 12:33:47 +08:00
GabyCT	f7f05f238e	Merge pull request #5233 from GabyCT/topic/exportlibseccomp osbuilder: Export directory variables for libseccomp	2022-09-23 13:54:14 -05:00
Zhongtao Hu	4a36bb9e21	Merge pull request #4924 from openanolis/runtime-rs-netUT runtime-rs: add unit tests for network resource	2022-09-23 17:45:24 +08:00
Zhongtao Hu	274de024c5	docs: add README for runtime-rs hypervisor crate add README for runtime-rs hypervisor crate Fixes:#4634 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-09-23 15:20:02 +08:00
Chao Wu	9cf5de0b4e	Merge pull request #5171 from liubin/fix/5170-use-macro runtime-rs/resource: use macro to reduce duplicated code	2022-09-23 10:59:53 +08:00
wangyongchao.bj	04bbce8dc3	virtcontainers: add warn log record for qmp hotplug cpu error The error from the failed qmp command for CPU hotplug was hidden, which was not friendly for users tracing hotplug CPU errors. This PR improves the hotplug CPU error log by adding the real qemu command error to the `failed to hot add vCPUs` message. Through the error message, we can get the reason the qmp command for the hotplug CPU operation failed. Fixes: #5234 Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>	2022-09-23 08:22:30 +08:00
Gabriela Cervantes	a4a23457ca	osbuilder: Export directory variables for libseccomp To avoid the random failures when building the rootfs, which seem to happen because the values for the libseccomp and gperf directories are not found, this PR exports these variables. Fixes #5232 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-09-22 21:45:20 +00:00
Chelsea Mafrica	de869f2565	Merge pull request #5188 from liubin/fix/5187-incorrect-comments-in-kata-types-hypervisor runtime-rs: fix incorrect comments	2022-09-22 14:09:20 -07:00
Zhongtao Hu	d663f110d7	kata-deploy: get the config path from cri options get the config path for runtime-rs from cri options Fixes: #5000 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-09-22 17:39:25 +08:00
Zhongtao Hu	c6b3dcb67d	kata-deploy: support kata-deploy for runtime-rs support kata-deploy for runtime-rs Fixes:#5000 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-09-22 17:39:20 +08:00
Ji-Xinyou	46965739a4	runtime-rs: remove hardcoded string Use KATA_PATH instead of "run/kata" Fixes: #5229 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-09-22 16:06:51 +08:00
Zhongtao Hu	a394761a5c	kata-deploy: add installation for runtime-rs setup the compile environment and installation path for the Rust runtime Fixes:#5000 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-09-22 15:59:44 +08:00
Peng Tao	ce22a9f134	Merge pull request #5159 from BbolroC/s390-config kernel: Add crypto kernel config for s390	2022-09-22 15:36:24 +08:00
Peng Tao	a2c13bad45	Merge pull request #5156 from fengwang666/uid-reuse-bug Non-root hypervisor uid reuse bug	2022-09-22 15:35:39 +08:00
Peng Tao	af174c2b6d	Merge pull request #5195 from wllenyj/update-dbs Build-in Sandbox: update dragonball-sandbox dependencies	2022-09-22 15:07:11 +08:00
Ji-Xinyou	50299a3292	refactor(runtime-rs): Use RwLock in runtime agent Use RwLock for Agent in runtime, for better concurrency. Fixes: #5199 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-09-21 17:43:40 +08:00
Peng Tao	9628c7df0c	runtime: update runc dependency To bring fix to CVE-2022-29162. Fixes: #5217 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-09-21 17:21:37 +08:00
Peng Tao	7fbc883879	runtime-rs: drop dependency on rustc-serialize We are not using it and it hasn't got any updates for more than five years, leaving open CVEs unresolved. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-09-21 17:19:58 +08:00
Peng Tao	bf2be0cf7a	release: Revert kata-deploy changes after 3.0.0-rc0 release As 3.0.0-rc0 has been released, let's switch the kata-deploy / kata-cleanup tags back to "latest", and re-add the kata-deploy-stable and the kata-cleanup-stable files. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-09-21 15:19:38 +08:00
snir911	cb977c04bd	Merge pull request #5204 from GabyCT/topic/updatevirt docs: Update url in virtualization document	2022-09-21 10:05:13 +03:00
Ji-Xinyou	e23bfd615e	runtime-rs: make function name more understandable Change kparams to kernel_params for understandability. Fixes: #5068 Signed-Off-By: Ji-Xinyou <jerryji0414@outlook.com>	2022-09-21 11:48:11 +08:00
Ji-Xinyou	426a436780	runtime-rs: add unit test and eliminate raw string Add two unit tests for coverage and eliminate raw strings to constant. Fixes: #5068 Signed-Off-By: Ji-Xinyou <jerryji0414@outlook.com>	2022-09-21 11:47:07 +08:00
Ji-Xinyou	87959cb72d	runtime-rs: debug console support in runtime Read debug console configuration in kernel params. Fixes: #5068 Signed-Off-By: Ji-Xinyou <jerryji0414@outlook.com>	2022-09-21 11:46:55 +08:00
Bin Liu	a2e7434a0f	Merge pull request #5082 from QiliangFan/main dragonball: Fix problem that stdio console cannot connect to stdout	2022-09-21 11:12:19 +08:00
Gabriela Cervantes	d55cf9ab71	docs: Update url in virtualization document This PR updates the url for the cloud hypervisor in the virtualization document. Fixes #5203 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-09-20 16:52:24 +00:00
wllenyj	0399da677d	runtime-rs: update dependencies Updated Cargo.lock. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-09-20 15:00:14 +08:00
wllenyj	f6f19917a8	dragonball: update dragonball-sandbox dependencies Updated vmm-sys-util to 0.10.0 Updated virtio-queue to 0.4.0 Updated vm-memory to 0.9.0 Updated linux-loader to 0.5.0 Fixes: #5194 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-09-20 14:48:09 +08:00
Zhongtao Hu	e05e42fd3c	Merge pull request #5113 from liubin/fix/5112-call-TomlConfig-validate-func runtime-rs: call TomlConfig's validate function after load	2022-09-20 14:38:42 +08:00
Zhongtao Hu	fc65e96ad5	Merge pull request #5133 from openanolis/shimmgmt feat(Shimmgmt): Shim management server and client	2022-09-20 14:37:19 +08:00
Bin Liu	2caee1f38d	runtime-rs: define VFIO unbind path as a const In src/runtime-rs/crates/hypervisor/src/device/vfio.rs, the path of new_id is defined as a const, but unbind is used as a local variable, they should be unified to const. Fixes: #5189 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-19 16:08:35 +08:00
Bin Liu	3f65ff2d07	runtime-rs: fix incorrect comments Some comments for types are incorrect in file src/libs/kata-types/src/config/hypervisor/mod.rs Fixes: #5187 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-19 16:03:06 +08:00
Bin Liu	9670a3caac	runtime-rs: use Path.is_file to check regular files Use Path.is_file to replace using `stat` to check the file type. Fixes: #5185 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-19 15:57:07 +08:00
Joana Pecholt	d9e6eb11ae	docs: Guide to use SNP-VMs with Kata-Containers The guide describes how to set Kata-Containers up so that AMD SEV-SNP encrypted VMs are used when deploying confidential containers. Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>	2022-09-16 17:51:41 +02:00
Joana Pecholt	ded60173d4	runtime: Enable choice between AMD SEV and SNP This is based on a patch from @niteeshkd that adds a config parameter to choose between AMD SEV and SEV-SNP VMs as the confidential guest type in case both types are supported. SEV is the default. Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>	2022-09-16 17:51:41 +02:00
Joana Pecholt	22bda0838c	runtime: Support for AMD SEV-SNP VMs This commit adds AMD SEV-SNP as a confidential guest option to the runtime. Information on required components such as OVMF, QEMU and a kernel supporting SEV-SNP are defined in the versions file and corresponding configs are added. Note: The CPU model 'host' provided by the current SNP-QEMU does not support all SNP capabilities yet, which is why this option is changed to EPYC-v4. Note: The guest's physical address space reduction specified with ReducedPhysBits is 1. Details can be found in Section 15.34.6 here https://www.amd.com/system/files/TechDocs/24593.pdf Fixes #4437 Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>	2022-09-16 17:51:41 +02:00
Joana Pecholt	a2bbd29422	kernel: Introduce SNP kernel This introduces the SNP kernel as a confidential computing guest. Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>	2022-09-16 17:51:41 +02:00
Joana Pecholt	0e69405e16	docs: Developer-Guide updated Developer-Guide.md is updated to work using current golang versions. Related Readmes are also updated. Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>	2022-09-16 17:51:41 +02:00
Joana Pecholt	105eda5b9a	runtime: Initrd path option added to config Adds initrd configuration option to the configuration.toml that is generated for the setup using QEMU. Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>	2022-09-16 17:51:41 +02:00
Tim Zhang	32a9d6d66d	Merge pull request #5174 from bergwolf/3.0.0-rc0-branch-bump # Kata Containers 3.0.0-rc0	2022-09-16 16:59:55 +08:00
Peng Tao	583591099d	release: Kata Containers 3.0.0-rc0 - runtime-rs: delete some allow(dead_code) attributes - kata-types: don't check virtio_fs_daemon for inline-virtio-fs - kata-types: change return type of getting CPU period/quota function - runtime-rs: fix host device check pattern - runtime-rs: remove meaningless comment - runtime-rs: update rust runtime roadmap - runk: Enable seccomp support by default - config: add "inline-virtio-fs" as a "shared_fs" type - runtime-rs: add README.md - runk: Refactor container builder - kernel: fix kernel tarball name for SEV - libs/kata-types: replace tabs by spaces in comments - gperf: point URL to mirror site `be242a3c3` release: Adapt kata-deploy for 3.0.0-rc0 `156e1c324` runtime-rs: delete some allow(dead_code) attributes `62cf6e6fc` runtime-rs: remove meaningless comment `bcf6bf843` runk: Enable seccomp support by default `2b1d05857` runtime-rs: fix host device check pattern `85b49cee0` runtime-rs: add README.md `36d805fab` config: add "inline-virtio-fs" as a "shared_fs" type `b948a8ffe` kernel: fix kernel tarball name for SEV `50f912615` libs/kata-types: replace tabs by spaces in comments `96c8be715` libs/kata-types: change return type of getting CPU period/quota `fc9c6f87a` kata-types: don't check virtio_fs_daemon for inline-virtio-fs `968c2f6e8` runk: Refactor container builder `84268f871` runtime-rs: update rust runtime roadmap `566656b08` gperf: point URL to mirror site Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-09-16 03:53:44 +00:00
Peng Tao	be242a3c3c	release: Adapt kata-deploy for 3.0.0-rc0 kata-deploy files must be adapted to a new release. The cases where it happens are when the release goes from -> to: * main -> stable: * kata-deploy-stable / kata-cleanup-stable: are removed * stable -> stable: * kata-deploy / kata-cleanup: bump the release to the new one. There are no changes when doing an alpha release, as the files on the "main" branch always point to the "latest" and "stable" tags. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-09-16 03:53:43 +00:00
Bin Liu	a8a8a28a34	runtime-rs/resource: use macro to reduce duplicated code Some device types have the same definition, so they can be implemented by a macro to reduce code. This commit also deletes the `peer_name` field of the structs, which is never used. Fixes: #5170 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-15 15:45:26 +08:00
Bin Liu	be22e8408d	Merge pull request #5165 from liubin/fix/5164-remove-dead_code runtime-rs: delete some allow(dead_code) attributes	2022-09-15 09:32:10 +08:00
Bin Liu	156e1c3247	runtime-rs: delete some allow(dead_code) attributes Some #![allow(dead_code)]s and code are not needed indeed. Fixes: #5164 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-14 20:50:30 +08:00
qiliangfan	7622452f4b	Dragonball: Fix the problem about stdio console Let the stdout stream connect to the com1_device. Fixes: #5083 Signed-off-by: qiliangfan <fanqiliang@mail.nankai.edu.cn>	2022-09-14 15:53:57 +08:00
Bin Liu	208233288a	runtime-rs: add test for StaticResource Add test case for StaticResource, the old test is not covering the StaticResource struct. Fixes: #5162 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-14 11:45:07 +08:00
Bin Liu	adb33a4121	packaging: fix typo in configure-hypervisor.sh `powwer` is a typo of `power`, and many spaces should be replaced by tabs for indent. Fixes: #5160 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-14 11:38:01 +08:00
Feng Wang	f914319874	runtime: store the user name in hypervisor config The user name will be used to delete the user instead of relying on uid lookup because uid can be reused. Fixes: #5155 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-09-13 10:32:55 -07:00
Hyounggyu Choi	86a02c5f6a	kernel: Add crypto kernel config for s390 This config update supports new crypto algorithms for s390. Fixes: #5158 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2022-09-13 18:13:57 +02:00
Feng Wang	5cafe21770	runtime: make StopVM thread-safe StopVM can be invoked by multiple threads and needs to be thread-safe Fixes: #5155 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-09-12 21:56:15 -07:00
Feng Wang	c3015927a3	runtime: add more debug logs for non-root user operation Previously the logging was insufficient and made debugging difficult Fixes: #5155 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-09-12 21:38:57 -07:00
Bin Liu	a58feba9bb	Merge pull request #5105 from liubin/fix/5104-ignore-virtiofs-daemon-for-inline-mode kata-types: don't check virtio_fs_daemon for inline-virtio-fs	2022-09-13 10:33:56 +08:00
Bin Liu	42d4da9b6c	Merge pull request #5101 from liubin/fix/5100-cpu-period-quota-data-type kata-types: change return type of getting CPU period/quota function	2022-09-13 10:33:29 +08:00
Tim Zhang	8ec4edcf4f	Merge pull request #5146 from liubin/fix/5145-check-host-dev runtime-rs: fix host device check pattern	2022-09-13 10:33:05 +08:00
Tim Zhang	447521c6da	Merge pull request #5151 from liubin/fix/5150-remove-comment runtime-rs: remove meaningless comment	2022-09-13 10:32:53 +08:00
Bin Liu	2f830c09a3	Merge pull request #5073 from openanolis/update runtime-rs: update rust runtime roadmap	2022-09-13 10:32:25 +08:00
Bin Liu	62cf6e6fc3	runtime-rs: remove meaningless comment The comment for the `generate_mount_path` function is a copy-paste mistake and should be deleted. Fixes: #5150 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-09 16:07:35 +08:00
Bin Liu	55f4f3a95b	Merge pull request #4897 from ManaSugi/runk/enable-seccomp runk: Enable seccomp support by default	2022-09-09 14:11:35 +08:00
Manabu Sugimoto	bcf6bf843c	runk: Enable seccomp support by default Enable seccomp support in `runk` by default. Due to this, `runk` is built with `gnu libc` by default, because building `runk` statically linked with `libseccomp` and `musl` requires additional configuration. Also, general container runtimes are built with `gnu libc` as dynamically linked binaries by default. The user can disable seccomp with `make SECCOMP=no`. Fixes: #4896 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-09-09 10:55:16 +09:00
GabyCT	be462baa7e	Merge pull request #5103 from liubin/fix/5102-add-inline-virtiofs-config config: add "inline-virtio-fs" as a "shared_fs" type	2022-09-08 10:33:20 -05:00
GabyCT	bcbce8317d	Merge pull request #5061 from liubin/fix/5022-runtime-rs-readme runtime-rs: add README.md	2022-09-08 10:32:08 -05:00
bin liu	2b1d058572	runtime-rs: fix host device check pattern Host devices should start with `/dev/` but not `/dev`. Fixes: #5145 Signed-off-by: bin liu <liubin0329@gmail.com>	2022-09-08 22:44:46 +08:00
Bin Liu	85b49cee02	runtime-rs: add README.md Add README.md for runtime-rs. Fixes: #5022 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-08 16:03:45 +08:00
Bin Liu	7cfc357c6e	Merge pull request #5034 from ManaSugi/runk/refactor-container-builder runk: Refactor container builder	2022-09-08 11:30:07 +08:00
Ji-Xinyou	5add50aea2	runtime-rs: timeout for shim management client Let client side support timeout if the timeout value is set. If timeout not set, execute directly. Fixes: #5114 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-09-08 11:11:33 +08:00
Bin Liu	36d805fab9	config: add "inline-virtio-fs" as a "shared_fs" type "inline-virtio-fs" is newly supported by kata 3.0 as a "shared_fs" type, so it should be described in the configuration file. "inline-virtio-fs" is the same as "virtio-fs", but it runs in the same process as the shim and does not need an external virtiofsd process. Fixes: #5102 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-08 11:05:01 +08:00
Fabiano Fidêncio	5793685a4b	Merge pull request #5095 from ryansavino/sev-kernel-build-fix kernel: fix kernel tarball name for SEV	2022-09-07 17:50:17 +02:00
Bin Liu	5df6ff991d	Merge pull request #5116 from liubin/fix/5115-replace-tab-by-space libs/kata-types: replace tabs by spaces in comments	2022-09-07 15:53:34 +08:00
Ji-Xinyou	9f13496e13	runtime-rs: shim management client Add client side function(public), to establish http connections (PUT, POST, GET) to the long standing shim mgmt server. Fixes: #5114 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-09-07 15:39:14 +08:00
Fabiano Fidêncio	e94d38c97b	Merge pull request #5058 from ryansavino/gperf-url-fix gperf: point URL to mirror site	2022-09-07 09:25:13 +02:00
Bin Liu	aaf6d69089	runtime-rs: call TomlConfig's validate function after load Call TomlConfig's validate function after it is loaded and adjusted by annotations. Fixes: #5112 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-07 11:34:08 +08:00
Bin Liu	fe55f6afd7	Merge pull request #5124 from amshinde/revert-arp-neighbour-api Revert arp neighbour api	2022-09-07 11:14:53 +08:00
Ji-Xinyou	e891295e10	runtime-rs: shim management - agent-url Add agent-url to its handler. The general framework of registering URL handlers is done. Fixes: #5114 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-09-07 11:13:21 +08:00
Chelsea Mafrica	051dabb0fe	Merge pull request #5099 from liubin/fix/5098-add-default-config-for-runtime-rs runtime-rs: add default agent/runtime/hypervisor for configuration	2022-09-06 17:49:42 -07:00
Archana Shinde	d23779ec9b	Revert "agent: fix unittests for arp neighbors" This reverts commit `81fe51ab0b`.	2022-09-06 15:41:42 -07:00
Archana Shinde	d340564d61	Revert "agent: use rtnetlink's neighbours API to add neighbors" This reverts commit `845c1c03cf`. Fixes: #5126	2022-09-06 15:41:42 -07:00
Archana Shinde	188d37badc	kata-deploy: Add debug statement Adding this so that we can see the status of running pods in case of failure. Fixes: #5126 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-09-06 15:41:14 -07:00
Ryan Savino	b948a8ffe6	kernel: fix kernel tarball name for SEV 'linux-' prefix needed for tarball name in SEV case. Output to same file name. Fixes: #5094 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2022-09-06 11:04:29 -05:00
Bin Liu	50f9126153	libs/kata-types: replace tabs by spaces in comments Replace tabs by spaces in the comments of file libs/kata-types/src/annotations/mod.rs. Fixes: #5115 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-06 17:32:57 +08:00
Ji-Xinyou	59aeb776b0	runtime-rs: shim management Add shim management http server and boot it as a light-weight thread when the sandbox is created. Fixes: #5114 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-09-06 16:44:16 +08:00
Bin Liu	96c8be715b	libs/kata-types: change return type of getting CPU period/quota period should have a type of u64, and quota should be i64, the function of getting CPU period and quota from annotations should use the same data type as function return type. Fixes: #5100 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-06 11:35:52 +08:00
Bin Liu	fc9c6f87a3	kata-types: don't check virtio_fs_daemon for inline-virtio-fs If the shared_fs is set to "inline-virtio-fs", the "virtio_fs_daemon" should be ignored. Fixes: #5104 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-05 17:44:28 +08:00
James O. D. Hunt	662ce3d6f2	Merge pull request #5086 from Yuan-Zhuo/main docs: fix unix socket address in agent-ctl doc	2022-09-05 09:24:28 +01:00
Bin Liu	e879270a0c	runtime-rs: add default agent/runtime/hypervisor for configuration Kata 3.0 introduced 3 new configurations under the runtime section: name="virt_container" hypervisor_name="dragonball" agent_name="kata" Blank values will cause startup to fail. Adding default values makes it easier for users to migrate to kata 3.0. Fixes: #5098 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-05 15:55:28 +08:00
Bin Liu	e5437a7084	Merge pull request #5063 from liubin/fix/5062-split-amend-spec runtime-rs: split amend_spec function	2022-09-05 15:00:31 +08:00
Manabu Sugimoto	968c2f6e8e	runk: Refactor container builder Refactor the container builder code (`InitContainer` and `ActivatedContainer`) to make it easier to understand and to maintain. The details: 1. Separate the existing `builder.rs` into an `init_builder.rs` and `activated_builder.rs` to make them easy to read and maintain. 2. Move the `create_linux_container` function from `builder.rs` to `container.rs` because it is shared by both files. 3. Move some validation functions such as `validate_spec` from `builder.rs` to `utils.rs` because they will also be used by other components as utilities in the future. Fixes: #5033 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-09-05 14:36:30 +09:00
Bin Liu	ba013c5d0f	Merge pull request #4744 from openanolis/runtime-rs-static_resource_mgmt runtime-rs: support functionality of static resource management	2022-09-05 11:17:09 +08:00
Wainer Moschetta	e81a73b622	Merge pull request #4719 from bookinabox/cargo-deny github-actions: Add cargo-deny	2022-09-02 17:24:50 -03:00
Fabiano Fidêncio	1ccd883103	Merge pull request #5090 from fidencio/topic/keep-passing-build-suffix-to-qemu qemu: Keep passing BUILD_SUFFIX	2022-09-02 19:37:22 +02:00
Fabiano Fidêncio	373dac2dbb	qemu: Keep passing BUILD_SUFFIX In the commit `54d6d01754` we ended up removing the BUILD_SUFFIX argument passed to QEMU as it only seemed to be used to generate the HYPERVISOR_NAME and PKGVERSION, which were added as arguments to the dockerfile. However, it turns out BUILD_SUFFIX is used by the `qemu-build-post.sh` script, so it can rename the QEMU binary accordingly. Let's just bring it back. Fixes: #5078 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-09-02 15:47:48 +02:00
Fabiano Fidêncio	9cf4eaac13	Merge pull request #5079 from ryansavino/tdx-qemu-tarball-path-fix qemu: fix tdx qemu tarball directories	2022-09-02 14:04:50 +02:00
Bin Liu	86ad832e37	runtime-rs: force shutdown shim process if it can't exit In some cases the cleanup call from the shim to the service manager fails and the shim process keeps running, which leaks the process. This commit forces the shim process to shut down in case of any errors in the service crate. Fixes: #5087 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-02 19:43:50 +08:00
Yuan-Zhuo	5f4f5f2400	docs: fix unix socket address in agent-ctl doc Following the instructions in the guidance doc results in ECONNREFUSED, so the unix socket address must be kept consistent between the two commands. Fixes: #5085 Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>	2022-09-02 17:37:44 +08:00
Peng Tao	b5786361e9	Merge pull request #4862 from egernst/memory-hotplug-limitation Address Memory hotplug limitation	2022-09-02 16:11:46 +08:00
Ryan Savino	59e3850bfd	qemu: create no_patches.txt file for SPR-BKC-QEMU-v2.5 Patches failing without the no_patches.txt file for SPR-BKC-QEMU-v2.5. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2022-09-01 21:07:30 -05:00
Bin Liu	6de4bfd860	Merge pull request #5076 from GabyCT/topic/updatedeveloperguide docs: Update url in the Developer Guide	2022-09-02 10:01:02 +08:00
Ryan Savino	54d6d01754	qemu: fix tdx qemu tarball directories Dockerfile cannot decipher multiple conditional statements in the main RUN call. Cannot segregate statements in Dockerfile with '{}' braces without wrapping entire statement in 'bash -c' statement. Dockerfile does not support setting variables by bash command. Must set HYPERVISOR_NAME and PKGVERSION from parent script: build-base-qemu.sh Fixes: #5078 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2022-09-01 20:36:28 -05:00
Archana Shinde	f79ef1ad90	Merge pull request #5048 from amshinde/3.0.0-alpha1-branch-bump # Kata Containers 3.0.0-alpha1	2022-09-02 06:42:16 +05:30
Gabriela Cervantes	e83b821316	docs: Update url in the Developer Guide This PR updates the url for containerd in the Developer Guide. Fixes #5075 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-09-01 15:33:29 +00:00
Zhongtao Hu	84268f8716	runtime-rs: update rust runtime roadmap Update the status and plan for the Rust runtime development Fixes: #4884 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-09-01 22:53:30 +08:00
GabyCT	9bce2beebf	Merge pull request #5040 from GabyCT/topic/updatecni versions: Update cni plugins version	2022-09-01 09:31:06 -05:00
Bin Liu	69b82023a8	Merge pull request #5065 from liubin/fix/5064-specify-language-for-code-in-markdown docs: Specify language in markdown for syntax highlight	2022-09-01 16:11:23 +08:00
Bin Liu	41ec71169f	runtime-rs: split amend_spec function amend_spec does two things: - modifies the spec - checks if the pid namespace is enabled This makes it confusing, so split it into two functions. Fixes: #5062 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-01 14:44:54 +08:00
Bin Liu	749a6a2480	docs: Specify language in markdown for syntax highlight Specify language for code block in docs/Unit-Test-Advice.md for syntax highlight. Fixes: #5064 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-01 13:54:31 +08:00
Ji-Xinyou	a828292b47	runtime-rs: add unit tests for network resource Add UTs for network resource Fixes: #4923 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-09-01 10:13:09 +08:00
Eric Ernst	9997ab064a	sandbox_test: Add test to verify memory hotplug behavior Augment the mock hypervisor so that we can validate that ACPI memory hotplug is carried out as expected. We'll augment the number of memory slots in the hypervisor config each time the memory of the hypervisor is changed. In this way we can ensure that large memory hotplugs are broken up into appropriately sized pieces in the unit test. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-08-31 10:32:30 -07:00
Eric Ernst	f390c122f0	sandbox: don't hotplug too much memory at once If we're using ACPI hotplug for memory, there's a limitation on the amount of memory which can be hotplugged at a single time. During hotplug, we'll allocate memory for the memmap for each page, resulting in a 64 byte per 4KiB page allocation. As an example, hotplugging 12GiB of memory requires ~192 MiB of free memory, which is about the limit we should expect for an idle 256 MiB guest (conservative heuristic of 75% of provided memory). From experimentation, at pod creation time we can reliably add 48 times what is provided to the guest. (a factor of 48 results in using 75% of provided memory for hotplug). Using prior example of a guest with 256Mi RAM, 256 Mi * 48 = 12 Gi; 12GiB is upper end of what we should expect can be hotplugged successfully into the guest. Note: It isn't expected that we'll need to hotplug large amounts of RAM after workloads have already started -- container additions are expected to occur first in pod lifecycle. Based on this, we expect that provided memory should be freely available for hotplug. If virtio-mem is being utilized, there isn't such a limitation - we can hotplug the max allowed memory at a single time. Fixes: #4847 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-08-31 10:32:30 -07:00
Ryan Savino	566656b085	gperf: point URL to mirror site gperf download fails intermittently. Changing to mirror site will hopefully increase download reliability. Fixes: #5057 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2022-08-31 10:02:53 -05:00
Fabiano Fidêncio	08d230c940	Merge pull request #5046 from fidencio/topic/fix-regression-on-building-tdx-kernel kernel: Re-work get_tee_kernel()	2022-08-31 13:16:26 +02:00
Greg Kurz	380af44043	Merge pull request #5036 from jpecholt/whitelist-cleanup kernel: Whitelist cleanup	2022-08-31 11:08:32 +02:00
Fabiano Fidêncio	a1fdc08275	kernel: Re-work get_tee_kernel() `00aadfe20a` introduced a regression on `make cc-tdx-kernel-tarball` as we stopped passing all the needed information to the `build-kernel.sh` script, leading to requiring `yq` installed in the container used to build the kernel. This commit partially reverts the faulty one, rewriting it in a way the old behaviour is brought back, without changing the behaviour that was added by the faulty commit. Fixes: #5043 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-31 10:08:12 +02:00
Peng Tao	f1276180b1	Merge pull request #4996 from liubin/fix/4995-delete-socket-option-for-shim runtime-rs: delete socket from shim command-line options	2022-08-31 14:16:56 +08:00
Bin Liu	515bdcb138	Merge pull request #4900 from wllenyj/dragonball-ut Built-in Sandbox: add more unit tests for dragonball.	2022-08-31 14:00:07 +08:00
Eric Ernst	e0142db24f	hypervisor: Add GetTotalMemoryMB to interface It'll be useful to get the total memory provided to the guest (hotplugged + coldplugged). We'll use this information when calculating how much memory we can add at a time when utilizing ACPI hotplug. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-08-30 16:37:47 -07:00
Archana Shinde	0ab49b233e	release: Kata Containers 3.0.0-alpha1 - Initrd fixes for ubuntu systemd - kernel: Add CONFIG_CGROUP_HUGETLB=y as part of the cgroup fragments - Fix kata-deploy to work on CI context - github-actions: Auto-backporting - runtime-rs: add support for core scheduling - ci: Use versions.yaml for the libseccomp - runk: Add cli message for init command - agent: add some logs for mount operation - Use iouring for qemu block devices - logging: Replace nix::Error::EINVAL with more descriptive msgs - kata-deploy: fix threading conflicts - kernel: Ignore CONFIG_SPECULATION_MITIGATIONS for older kernels - runtime-rs: support loading kernel modules in guest vm - TDX: Get TDX working again with Cloud Hypervisor + a minor change on QEMU's code - runk: Move delete logic to libcontainer - runtime: cri-o annotations have been moved to podman - Fix depbot reported rust crates dependency security issues - UT: test_load_kernel_module needs root - enable vmx for vm factory - runk: add pause/resume commands - kernel: upgrade guest kernel support to 5.19 - Drop-in cfg files support in runtime-rs - agent: do some rollback works if case of do_create_container failed - network: Fix error message for setting hardware address on TAP interface - Upgrade to Cloud Hypervisor v26.0 - runtime: tracing: End root span at end of trace - ci: Update libseccomp version - dep: update nix dependency - Updated the link target of CRI-O - libs/test-utils: share test code by create a new crate `dc32c4622` osbuilder: fix ubuntu initrd /dev/ttyS0 hang `cc5f91dac` osbuilder: add systemd symlinks for kata-agent `c08a8631e` agent: add some logs for mount operation `0a6f0174f` kernel: Ignore CONFIG_SPECULATION_MITIGATIONS for older kernels `6cf16c4f7` agent-ctl: fix clippy error `4b57c04c3` runtime-rs: support loading kernel modules in guest vm `dc90eae17` qemu: Drop unnecessary `tdx_guest` kernel parameter `d4b67613f` clh: Use HVC console with TDX `c0cb3cd4d` clh: Avoid crashing when memory 
hotplug is not allowed `9f0a57c0e` clh: Increase API and SandboxStop timeouts for TDX `b535bac9c` runk: Add cli message for init command `c142fa254` clh: Lift the sharedFS restriction used with TDX `bdf8a57bd` runk: Move delete logic to libcontainer `a06d819b2` runtime: cri-o annotations have been moved to podman `ffd1c1ff4` agent-ctl/trace-forwarder: udpate thread_local dependency `69080d76d` agent/runk: update regex dependency `e0ec09039` runtime-rs: update async-std dependency `763ceeb7b` logging: Replace nix::Error::EINVAL with more descriptive msgs `4ee2b99e1` kata-deploy: fix threading conflicts `731d39df4` kernel: Add CONFIG_CGROUP_HUGETLB=y as part of the cgroup fragments `96d903734` github-actions: Auto-backporting `a6fbaac1b` runk: add pause/resume commands `8e201501e` kernel: fix for set_kmem_limit error `00aadfe20` kernel: SEV guest kernel upgrade to 5.19.2 `0d9d8d63e` kernel: upgrade guest kernel support to 5.19.2 `57bd3f42d` runtime-rs: plug drop-in decoding into config-loading code `87b97b699` runtime-rs: add filesystem-related part of drop-in handling `cf785a1a2` runtime-rs: add core toml::Value tree merging `92f7d6bf8` ci: Use versions.yaml for the libseccomp `f508c2909` runtime: constify splitIrqChipMachineOptions `2b0587db9` runtime: VMX is migratible in vm factory case `fa09f0ec8` runtime: remove qemuPaths `326f1cc77` agent: enrich some error code path `4f53e010b` agent: skip test_load_kernel_module if non-root `3a597c274` runtime: clh: Use the new 'payload' interface `16baecc5b` runtime: clh: Re-generate the client code `50ea07183` versions: Upgrade to Cloud Hypervisor v26.0 `f7d41e98c` kata-deploy: export CI in the build container `4f90e3c87` kata-deploy: add dockerbuild/install_yq.sh to gitignore `8ff5c10ac` network: Fix error message for setting hardware address on TAP interface `338c28295` dep: update nix dependency `78231a36e` ci: Update libseccomp version `34746496b` libs/test-utils: share test code by create a new crate `3829ab809` docs: 
Update CRI-O target link `fcc1e0c61` runtime: tracing: End root span at end of trace `c1e3b8f40` govmm: Refactor qmp functions for adding block device `598884f37` govmm: Refactor code to get rid of redundant code `00860a7e4` qmp: Pass aio backend while adding block device `e1b49d758` config: Add block aio as a supported annotation `ed0f1d0b3` config: Add "block_device_aio" as a config option for qemu `b6cd2348f` govmm: Add io_uring as AIO type `81cdaf077` govmm: Correct documentation for Linux aio. `a355812e0` runtime-rs: fixed bug on core-sched error handling `591dfa4fe` runtime-rs: add support for core scheduling `09672eb2d` agent: do some rollback works if case of do_create_container failed Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-30 12:59:10 -07:00
Derek Lee	52bbc3a4b0	cargo.lock: update crates to comply with checks Updates versions of crossbeam-channel because 0.52.0 is a yanked package (creators mark version as not for release except as a dependency for another package) Updates chrono to use >0.42.0 to avoid: https://rustsec.org/advisories/RUSTSEC-2020-0159 Updates lz4-sys. Signed-off-by: Derek Lee <derlee@redhat.com>	2022-08-30 10:08:41 -07:00
Derek Lee	aa581f4b28	cargo.toml: Add oci to src/libs workplace Adds oci under the src/libs workspace. oci shares a Cargo.lock file with the rest of src/libs but was not listed as a member of the workspace. There is no clear reason why it is not included in the workspace, so add it so that cargo-deny stops complaining. Signed-off-by: Derek Lee <derlee@redhat.com>	2022-08-30 09:30:03 -07:00
Derek Lee	7914da72c9	cargo.tomls: Added Apache 2.0 to cargo.tomls One of the checks done by cargo-deny is ensuring all crates have a valid license. As the rust programs import each other, cargo.toml files without licenses trigger the check. While I could disable this check this would be bad practice. This adds an Apache-2.0 license in the Cargo.toml files. Some of these files already had a header comment saying it is an Apache license. As the entire project itself is under an Apache-2.0 license, I assumed all individual components would also be covered under that license. Signed-off-by: Derek Lee <derlee@redhat.com>	2022-08-30 09:30:03 -07:00
Derek Lee	bed4aab7ee	github-actions: Add cargo-deny Adds cargo-deny to scan for vulnerabilities and license issues regarding rust crates. GitHub Actions does not have an obvious way to loop over each of the Cargo.toml files. To avoid hardcoding it, I worked around the problem using a composite action that first generates the cargo-deny action by finding all Cargo.toml files before calling this new generated action in the master workflow. Uses recommended deny.toml from cargo-deny repo with the following modifications: ignore = ["RUSTSEC-2020-0071"] because chrono is dependent on the version of time with the vulnerability and there is no simple workaround multiple-versions = "allow" Because of the above error and other packages, there are instances where some crates require different versions of a crate. unknown-git = "allow" I don't see a particular issue with allowing crates from other repos. An alternative would be to manually set each repo we want in an allow-git list, but I see this as more of a nuisance than it's worth. We could leave this as a warning (default), but to avoid clutter I'm going to allow it. If deny.toml needs to be edited in the future, here's the guide: https://embarkstudios.github.io/cargo-deny/index.html Fixes #3359 Signed-off-by: Derek Lee <derlee@redhat.com>	2022-08-30 09:30:03 -07:00
Gabriela Cervantes	b1a8acad57	versions: Update cni plugins version This PR updates the cni plugins version that is being used in the kata CI. Fixes #5039 Depends-on: github.com/kata-containers/tests#5088 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-08-30 16:04:45 +00:00
Joana Pecholt	a6581734c2	kernel: Whitelist cleanup This removes two options that are not needed (any longer). These are not set for any kernel so they do not need to be ignored either. Fixes #5035 Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>	2022-08-30 13:24:12 +02:00
Fabiano Fidêncio	1b92a946d6	Merge pull request #4987 from ryansavino/initrd-fixes-for-ubuntu-systemd Initrd fixes for ubuntu systemd	2022-08-30 09:16:43 +02:00
GabyCT	630eada0d3	Merge pull request #4956 from shippomx/main kernel: Add CONFIG_CGROUP_HUGETLB=y as part of the cgroup fragments	2022-08-29 14:31:46 -05:00
GabyCT	3426da66df	Merge pull request #4951 from wainersm/fix_kata-deploy-ci Fix kata-deploy to work on CI context	2022-08-29 14:30:59 -05:00
Wainer Moschetta	cd5be6d55a	Merge pull request #4775 from bookinabox/auto-backport github-actions: Auto-backporting	2022-08-29 14:08:12 -03:00
Bin Liu	11383c2c0e	Merge pull request #4797 from openanolis/runtime-rs-coresched runtime-rs: add support for core scheduling	2022-08-29 14:28:30 +08:00
Bin Liu	25f54bb999	Merge pull request #4942 from ManaSugi/fix/use-versions-yaml-for-libseccomp ci: Use versions.yaml for the libseccomp	2022-08-29 11:22:35 +08:00
Archana Shinde	c174eb809e	Merge pull request #4983 from ManaSugi/runk/add-init-msg runk: Add cli message for init command	2022-08-27 00:15:25 +05:30
Ryan Savino	dc32c4622f	osbuilder: fix ubuntu initrd /dev/ttyS0 hang Guest log is showing a hang on systemd getty start. Adding symlink for /dev/ttyS0 resolves issue. Fixes: #4932 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2022-08-26 04:59:36 -05:00
Ryan Savino	cc5f91dac7	osbuilder: add systemd symlinks for kata-agent AGENT_INIT=no (systemd) add symlinks for kata-agent service. Fixes: #4932 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2022-08-26 04:59:36 -05:00
Fupan Li	63959b0be6	Merge pull request #5011 from liubin/fix/4962-add-logs agent: add some logs for mount operation	2022-08-26 17:12:15 +08:00
Bin Liu	c08a8631e0	agent: add some logs for mount operation Some code paths lack log info. Adding more details about the storage, and logging when an error occurs, will help understand what happened. Fixes: #4962 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-08-26 14:09:56 +08:00
Archana Shinde	7d52934ec1	Merge pull request #4798 from amshinde/use-iouring-qemu Use iouring for qemu block devices	2022-08-26 04:00:24 +05:30
Wainer Moschetta	cbe5e324ae	Merge pull request #4815 from bookinabox/improve-agent-errors logging: Replace nix::Error::EINVAL with more descriptive msgs	2022-08-25 14:27:56 -03:00
Fabiano Fidêncio	1eea3d9920	Merge pull request #4965 from ryansavino/kata-deploy-threading-fix kata-deploy: fix threading conflicts	2022-08-25 19:11:52 +02:00
Fabiano Fidêncio	70cd4f1320	Merge pull request #4999 from fidencio/topic/ignore-CONFIG_SPECULATION_MITIGATIONS-for-older-kernels kernel: Ignore CONFIG_SPECULATION_MITIGATIONS for older kernels	2022-08-25 17:43:57 +02:00
Fabiano Fidêncio	0a6f0174f5	kernel: Ignore CONFIG_SPECULATION_MITIGATIONS for older kernels TDX kernel is based on a kernel version which doesn't have the CONFIG_SPECULATION_MITIGATIONS option. Having this in the allow list for missing configs avoids a breakage in the TDX CI. Fixes: #4998 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-25 10:51:13 +02:00
Bin Liu	cce99c5c73	runtime-rs: delete socket from shim command-line options The socket is not used to specify the socket address, but an ENV variable is used for runtime-rs. Fixes: #4995 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-08-25 15:32:17 +08:00
Bin Liu	a7e64b1ca9	Merge pull request #4892 from openanolis/shuoyu/runtime-rs runtime-rs: support loading kernel modules in guest vm	2022-08-25 15:01:23 +08:00
Fabiano Fidêncio	ddc94e00b0	Merge pull request #4982 from fidencio/topic/improve-cloud-hypervisor-plus-tdx-support TDX: Get TDX working again with Cloud Hypervisor + a minor change on QEMU's code	2022-08-25 08:53:10 +02:00
Bin Liu	875d946fb4	Merge pull request #4976 from ManaSugi/runk/refactor-delete-func runk: Move delete logic to libcontainer	2022-08-25 14:30:30 +08:00
Yushuo	6cf16c4f76	agent-ctl: fix clippy error Fixes: #4988 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2022-08-25 11:00:49 +08:00
Yushuo	4b57c04c33	runtime-rs: support loading kernel modules in guest vm Users can specify the kernel modules to be loaded through the agent configuration in the kata configuration file or in a pod annotation. Information about those modules is sent to the kata agent when the sandbox is created. Fixes: #4894 Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>	2022-08-25 10:38:04 +08:00
Peng Tao	aa6bcacb7d	Merge pull request #4973 from bergwolf/github/go-depbot runtime: cri-o annotations have been moved to podman	2022-08-25 10:12:06 +08:00
Peng Tao	78af76b72a	Merge pull request #4969 from bergwolf/github/depbot Fix depbot reported rust crates dependency security issues	2022-08-25 10:11:54 +08:00
Fabiano Fidêncio	dc90eae17b	qemu: Drop unnecessary `tdx_guest` kernel parameter With the current TDX kernel used with Kata Containers, `tdx_guest` is not needed, as TDX_GUEST is now a kernel configuration. With this in mind, let's just drop the kernel parameter. Fixes: #4981 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-24 20:02:43 +02:00
Fabiano Fidêncio	d4b67613f0	clh: Use HVC console with TDX As right now the TDX guest kernel doesn't support "serial" console, let's switch to using HVC in this case. Fixes: #4980 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-24 20:02:40 +02:00
Fabiano Fidêncio	c0cb3cd4d8	clh: Avoid crashing when memory hotplug is not allowed The runtime will crash when trying to resize memory when memory hotplug is not allowed. This happens because we cannot simply set the hotplug amount to zero, leading us to not set memory hotplug at all, and then later trying to access the value of a nil pointer. Fixes: #4979 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-24 20:02:22 +02:00
Fabiano Fidêncio	9f0a57c0eb	clh: Increase API and SandboxStop timeouts for TDX While doing tests using `ctr`, I've noticed that I've been hitting those timeouts more frequently than expected. Till we find the root cause of the issue (which is not in the Kata Containers), let's increase the timeouts when dealing with a Confidential Guest. Fixes: #4978 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-24 20:02:12 +02:00
Manabu Sugimoto	b535bac9c3	runk: Add cli message for init command Add cli message for init command to tell the user not to run this command directly. Fixes: #4367 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-08-25 00:32:35 +09:00
Fabiano Fidêncio	c142fa2541	clh: Lift the sharedFS restriction used with TDX When booting the TDX kernel with `tdx_disable_filter`, as it's been done for QEMU, VirtioFS can work without any issues. Whether this will be part of the upstream kernel or not is a different story, but it easily could make it there as Cloud Hypervisor relies on the VIRTIO_F_IOMMU_PLATFORM feature, which forces the guest to use the DMA API, making these devices compatible with TDX. See Sebastien Boeuf's explanation of this in the 3c973fa7ce208e7113f69424b7574b83f584885d commit: """ By using DMA API, the guest triggers the TDX codepath to share some of the guest memory, in particular the virtqueues and associated buffers so that the VMM and vhost-user backends/processes can access this memory. """ Fixes: #4977 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-24 17:14:05 +02:00
Manabu Sugimoto	bdf8a57bdb	runk: Move delete logic to libcontainer Move delete logic to `libcontainer` crate to make the code clean like other commands. Fixes: #4975 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-08-24 19:12:36 +09:00
Peng Tao	a06d819b24	runtime: cri-o annotations have been moved to podman Let's switch to depending on podman, which also simplifies the indirect dependency on kubernetes components. And it helps to avoid cri-o security issues like CVE-2022-1708 as well. Fixes: #4972 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-24 18:11:37 +08:00
Peng Tao	ffd1c1ff4f	agent-ctl/trace-forwarder: udpate thread_local dependency To bring in fix to CWE-362. Fixes: #4968 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-24 17:10:49 +08:00
Peng Tao	69080d76da	agent/runk: update regex dependency To bring in fix to CVE-2022-24713. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-24 17:02:15 +08:00
Peng Tao	e0ec09039d	runtime-rs: update async-std dependency So that we bump several indirect dependencies like crossbeam-channel, crossbeam-utils to bring in fixes to known security issues like CVE-2020-15254. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-24 16:56:29 +08:00
Bin Liu	2b5dc2ad39	Merge pull request #4705 from bergwolf/github/agent-ut-improve UT: test_load_kernel_module needs root	2022-08-24 16:22:55 +08:00
Bin Liu	6551d4f25a	Merge pull request #4051 from bergwolf/github/vmx-vm-factory enable vmx for vm factory	2022-08-24 16:22:37 +08:00
Bin Liu	ad91801240	Merge pull request #4870 from cyyzero/runk-cgroup runk: add pause/resume commands	2022-08-24 14:44:43 +08:00
Derek Lee	763ceeb7ba	logging: Replace nix::Error::EINVAL with more descriptive msgs Replaces instances of anyhow!(nix::Error::EINVAL) with other messages to make it easier to debug. Fixes #954 Signed-off-by: Derek Lee <derlee@redhat.com>	2022-08-23 13:44:46 -07:00
Ryan Savino	4ee2b99e1e	kata-deploy: fix threading conflicts Fix threading conflicts when kata-deploy 'make kata-tarball' is called. Force the creation of rootfs tarballs to happen serially instead of in parallel. Fixes: #4787 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2022-08-23 12:35:23 -05:00
Miao Xia	731d39df45	kernel: Add CONFIG_CGROUP_HUGETLB=y as part of the cgroup fragments The Kata guest OS hugetlb cgroup does not work properly because the guest kernel config option CONFIG_CGROUP_HUGETLB is not set, leading to: root@clr-b08d402cc29d44719bb582392b7b3466 ls /sys/fs/cgroup/hugetlb/ ls: cannot access '/sys/fs/cgroup/hugetlb/': No such file or directory Fixes: #4953 Signed-off-by: Miao Xia <xia.miao1@zte.com.cn>	2022-08-23 12:31:13 +02:00
Derek Lee	96d9037347	github-actions: Auto-backporting An implementation of semi-automating the backporting process. This implementation has two steps: 1. Checking whether any associated issues are marked as bugs If they do, mark with `auto-backport` label 2. On a successful merge, if there is a `auto-backport` label and there are any tags of `backport-to-BRANCHNAME`, it calls an action that cherry-picks the commits in the PR and automatically creates a PR to those branches. This action uses https://github.com/sqren/backport-github-action Fixes #3618 Signed-off-by: Derek Lee <derlee@redhat.com>	2022-08-22 16:19:09 -07:00
Chen Yiyang	a6fbaac1bd	runk: add pause/resume commands To make cgroup v1 and v2 work well, I now use `cgroups::cgroup` in `Container` to manage cgroups. `CgroupManager` in rustjail has some drawbacks. First, methods in Manager traits are not visible, so we would need to modify rustjail and make them public. Second, CgroupManager.cgroup is private too, and it can't be serialized, so we can't load/save it in the status file. One solution is adding a getter/setter in rustjail, then creating `cgroup` and setting it when loading status. In order to keep the modifications to rustjail to a minimum, I use `cgroups::cgroup` directly. Now it can work on cgroup v1 or v2, since cgroup-rs handles this. Fixes: #4364 #4821 Signed-off-by: Chen Yiyang <cyyzero@qq.com>	2022-08-22 23:11:50 +08:00
Fabiano Fidêncio	d797036b77	Merge pull request #4861 from ryansavino/upgrade-kernel-support-5.19 kernel: upgrade guest kernel support to 5.19	2022-08-22 14:57:00 +02:00
Bin Liu	8c8e97a495	Merge pull request #4772 from pmores/drop-in-cfg-files-support-rs Drop-in cfg files support in runtime-rs	2022-08-22 13:41:56 +08:00
Bin Liu	eb91ee45be	Merge pull request #4754 from liubin/fix/4749-rollback-when-creating-container-failed agent: do some rollback works if case of do_create_container failed	2022-08-22 10:44:11 +08:00
Ryan Savino	8e201501ef	kernel: fix for set_kmem_limit error Fixes: #4390 Fix in cargo cgroups-rs crate - Updated crate version to 0.2.10 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2022-08-19 13:08:14 -05:00
Ryan Savino	00aadfe20a	kernel: SEV guest kernel upgrade to 5.19.2 kernel: Update SEV guest kernel to 5.19.2 Kernel 5.19.2 has all the needed patches for running SEV, thus let's update it and stop using the version coming from confidential-containers. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2022-08-19 13:08:14 -05:00
Ryan Savino	0d9d8d63ea	kernel: upgrade guest kernel support to 5.19.2 kernel: Upgrade guest kernel support to 5.19.2 Let's update to the latest 5.19.x released kernel. CONFIG modifications necessary: fragments/common/dax.conf - CONFIG_DEV_PAGEMAP_OPS no longer configurable: https://www.kernelconfig.io/CONFIG_DEV_PAGEMAP_OPS?q=CONFIG_DEV_PAGEMAP_OPS&kernelversion=5.19.2 fragments/common/dax.conf - CONFIG_ND_BLK no longer supported: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f8669f1d6a86a6b17104ceca9340ded280307ac1 fragments/x86_64/base.conf - CONFIG_SPECULATION_MITIGATIONS is a dependency for CONFIG_RETPOLINE: https://www.kernelconfig.io/config_retpoline?q=&kernelversion=5.19.2 fragments/s390/network.conf - removed from kernel since 5.9.9: https://www.kernelconfig.io/CONFIG_PACK_STACK?q=CONFIG_PACK_STACK&kernelversion=5.19.2 Updated vmlinux path in build-kernel.sh for arch s390 Fixes #4860 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2022-08-19 13:08:13 -05:00
Fabiano Fidêncio	9806ce8615	Merge pull request #4937 from chenhengqi/fix-error-msg network: Fix error message for setting hardware address on TAP interface	2022-08-19 17:54:58 +02:00
Pavel Mores	57bd3f42d3	runtime-rs: plug drop-in decoding into config-loading code To plug drop-in support into existing config-loading code in a robust way, more specifically to create a single point where this needs to be handled, load_from_file() and load_raw_from_file() were refactored. Seeing as the original implementations of both functions were identical apart from adjust_config() calls in load_from_file(), load_from_file() was reimplemented in terms of load_raw_from_file(). Fixes #4771 Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-08-19 11:01:29 +02:00
Pavel Mores	87b97b6994	runtime-rs: add filesystem-related part of drop-in handling The central function being added here is load() which takes a path to a base config file and uses it to load the base config file itself, find the corresponding drop-in directory (get_dropin_dir_path()), iterate through its contents (update_from_dropins()) and load each drop-in in turn and merge its contents with the base file (update_from_dropin()). Also added is a test of load() which mirrors the corresponding test in the golang runtime (TestLoadDropInConfiguration() in config_test.go). Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-08-19 11:01:29 +02:00
Pavel Mores	cf785a1a23	runtime-rs: add core toml::Value tree merging This is the core functionality of merging config file fragments into the base config file. Our TOML parser crate doesn't seem to allow working at the level of TomlConfig instances like BurntSushi, used in the Golang runtime, does so we implement the required functionality at the level of toml::Value trees. Tests to verify basic requirements are included. Values set by a base config file and not touched by a subsequent drop-in should be preserved. Drop-in config file fragments should be able to change values set by the base config file and add settings not present in the base. Conversion of a merged tree into a mock TomlConfig-style structure is tested as well. Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-08-19 11:01:29 +02:00
Manabu Sugimoto	92f7d6bf8f	ci: Use versions.yaml for the libseccomp It would be nice to use `versions.yaml` for maintainability. Previously, we specified the `libseccomp` and the `gperf` versions directly in this script without using `versions.yaml` because the current snap workflow is incomplete and fails. This is because the snap CI environment does not have the kata-containers repository under ${GOPATH}. To avoid the failure, `rootfs.sh` extracts the libseccomp version and url in advance and passes them to `install_libseccomp.sh` as environment variables. Fixes: #4941 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-08-19 09:05:08 +09:00
Fabiano Fidêncio	828383bc39	Merge pull request #4933 from likebreath/0816/prepare_clh_v26.0 Upgrade to Cloud Hypervisor v26.0	2022-08-18 18:36:53 +02:00
James O. D. Hunt	6d6edb0bb3	Merge pull request #4903 from cmaf/tracing-defer-rootSpan-end runtime: tracing: End root span at end of trace	2022-08-18 08:51:41 +01:00
Peng Tao	f508c2909a	runtime: constify splitIrqChipMachineOptions A simple cleanup. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-18 10:09:20 +08:00
Peng Tao	2b0587db95	runtime: VMX is migratible in vm factory case We are not spinning up any L2 guests in vm factory, so the L1 guest migration is expected to work even with VMX. See https://www.linux-kvm.org/page/Nested_Guests Fixes: #4050 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-18 10:08:43 +08:00
Peng Tao	fa09f0ec84	runtime: remove qemuPaths It is broken that it doesn't list QemuVirt machine type. In fact we don't need it at all. Just drop it. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-18 10:06:10 +08:00
Peng Tao	326f1cc773	agent: enrich some error code path So that it is easier to find out why some function fails. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-18 10:02:12 +08:00
Peng Tao	4f53e010b4	agent: skip test_load_kernel_module if non-root We need root privilege to load a real kernel module. Fixes: #4704 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-18 10:02:12 +08:00
Bin Liu	cc4b9ac7cd	Merge pull request #4940 from ManaSugi/fix/update-libseccomp-version ci: Update libseccomp version	2022-08-18 08:36:59 +08:00
Bin Liu	c7b7bb701a	Merge pull request #4930 from bergwolf/github/depbot dep: update nix dependency	2022-08-18 08:05:14 +08:00
Bo Chen	3a597c2742	runtime: clh: Use the new 'payload' interface The new 'payload' interface now contains the 'kernel' and 'initramfs' config. Fixes: #4952 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-08-17 12:23:43 -07:00
Bo Chen	16baecc5b1	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v26.0. Note: The client code of cloud-hypervisor's (CLH) OpenAPI is automatically generated by openapi-generator [1-2]. [1] https://github.com/OpenAPITools/openapi-generator [2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md Fixes: #4952 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-08-17 12:23:12 -07:00
Bo Chen	50ea071834	versions: Upgrade to Cloud Hypervisor v26.0 Highlights from the Cloud Hypervisor release v26.0: SMBIOS Improvements via `--platform` `--platform` and the appropriate API structure has gained support for supplying OEM strings (primarily used to communicate metadata to systemd in the guest) Unified Binary MSHV and KVM Support Support for both the MSHV and KVM hypervisors can be compiled into the same binary with the detection of the hypervisor to use made at runtime. Notable Bug Fixes * The prefetchable flag is preserved on BARs for VFIO devices * PCI Express capabilities for functionality we do not support are now filtered out * GDB breakpoint support is more reliable * SIGINT and SIGTERM signals are now handled before the VM has booted * Multiple API event loop handling bug fixes * Incorrect assumptions in virtio queue numbering were addressed, allowing the virtio-fs driver in OVMF to be used * VHDX file format header fix * The same VFIO device cannot be added twice * SMBIOS tables were being incorrectly generated Deprecations Deprecated features will be removed in a subsequent release and users should plan to use alternatives. The top-level `kernel` and `initramfs` members on the `VmConfig` have been moved inside a `PayloadConfig` as the `payload` member. The OpenAPI document has been updated to reflect the change and the old API members continue to function and are mapped to the new version. The expectation is that these old versions will be removed in the v28.0 release. Removals The following functionality has been removed: The unused poll_queue parameter has been removed from --disk and equivalent. This was residual from the removal of the vhost-user-block spawning feature. Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v26.0 Fixes: #4952 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-08-17 12:20:26 -07:00
wllenyj	c75970b816	dragonball: add more unit test for config manager Added more unit tests for config manager. Fixes: #4899 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-08-17 23:46:26 +08:00
Wainer dos Santos Moschetta	f7d41e98cb	kata-deploy: export CI in the build container The clone_tests_repo() in ci/lib.sh relies on CI variable to decide whether to checkout the tests repository or not. So it is required to pass that variable down to the build container of kata-deploy, otherwise it can fail on some scenarios. Fixes #4949 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2022-08-17 10:42:49 -03:00
Wainer dos Santos Moschetta	4f90e3c87e	kata-deploy: add dockerbuild/install_yq.sh to gitignore The install_yq.sh is copied to tools/packaging/kata-deploy/local-build/dockerbuild so that it is added in the kata-deploy build image. Let's tell git to ignore that file. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2022-08-17 10:00:28 -03:00
Bin Liu	9d6d236003	Merge pull request #4869 from PrajwalBorkar/prajwal-patch Updated the link target of CRI-O	2022-08-17 17:55:40 +08:00
Hengqi Chen	8ff5c10ac4	network: Fix error message for setting hardware address on TAP interface Error out with the correct interface name and hardware address instead. Fixes: #4944 Signed-off-by: Hengqi Chen <chenhengqi@outlook.com>	2022-08-17 16:42:07 +08:00
Peng Tao	338c282950	dep: update nix dependency To fix CVE-2021-45707 that affects nix < 0.20.2. Fixes: #4929 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-17 16:06:26 +08:00
James O. D. Hunt	82ad43f9bf	Merge pull request #4928 from liubin/fix/4925-share-test-utils-for-rust libs/test-utils: share test code by create a new crate	2022-08-17 08:31:11 +01:00
Manabu Sugimoto	78231a36e4	ci: Update libseccomp version Updates the libseccomp version that is being used in the Kata CI. Fixes: #4858, #4939 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-08-17 15:39:22 +09:00
Bin Liu	8cd1e50eb6	Merge pull request #4921 from liubin/fix/2920-delete-vergen runtime-rs: delete vergen dependency	2022-08-17 10:09:12 +08:00
Bin Liu	34746496b7	libs/test-utils: share test code by create a new crate As more and more Rust code is introduced, the test utils that originally lived in the agent should be easy to share. Moving them into a new crate makes them easy to share between different crates. Fixes: #4925 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-08-17 00:12:44 +08:00
GabyCT	dd93d4ad5a	Merge pull request #4922 from bergwolf/github/release workflow: trigger release for 3.x releases	2022-08-16 10:20:33 -05:00
Peng Tao	6d6c068692	workflow: trigger release for 3.x releases So that we can push 3.x artifacts to the release page. Fixes: #4919 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-16 17:55:51 +08:00
Bin Liu	eab7c8f28f	runtime-rs: delete vergen dependency vergen is a build dependency, but it is not being used. we are processing ver/commit hash by make command, but not by vergen. Fixes: #4920 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-08-16 15:31:24 +08:00
Bin Liu	828574d27c	Merge pull request #4893 from openanolis/runtime-rs-main Runtime-rs: support persist file	2022-08-16 14:42:22 +08:00
Bin Liu	334c7b3355	Merge pull request #4916 from GabyCT/topic/fixurl docs: Update url in containerd documentation	2022-08-16 13:45:58 +08:00
Bin Liu	f9d3181533	Merge pull request #4911 from bergwolf/3.0.0-alpha0-branch-bump # Kata Containers 3.0.0-alpha0	2022-08-16 13:44:49 +08:00
Gabriela Cervantes	3e9077f6ee	docs: Update url in containerd documentation This PR updates the url that we have in our kata containerd documentation. Fixes #4915 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-08-15 19:04:29 +00:00
Bin Liu	830fb266e6	Merge pull request #4854 from openanolis/runtime-rs-delete runtime-rs: delete route model	2022-08-15 20:48:58 +08:00
Prajwal Borkar	3829ab809f	docs: Update CRI-O target link Fixes #4767 Signed-off-by: Prajwal Borkar <prajwalborkar5075@gmail.com>	2022-08-15 16:48:32 +05:30
Peng Tao	52133ef66e	release: Kata Containers 3.0.0-alpha0 - runtime-rs: fix design doc's typo - docs: use curl as default downloader for runtime-rs - runtime-rs: update Cargo.lock - Fix some GitHub actions workflow issues - versions: Update libseccomp version - runtime-rs:merge runtime rs to main - nydus: wait nydusd API server ready before mounting share fs - versions: Update TD-shim due to build breakage - agent-ctl: Add an empty [workspace] - packaging: Create no_patches.txt for the SPR-BKC-PC-v9.6.x - docs: Improve SGX documentation - runtime: explicitly mark the source of the log is from qemu.log - runtime: add unlock before return in sendReq - docs: add back host network limitation - runk: add ps sub-command - Depends-on:github.com/kata-containers/tests#4986 - runtime-rs:update rtnetlink version - runtime-rs:skip the build process when the arch is s390x - docs: Improve SGX documentation - agent: Use rtnetlink's neighbours API to add neighbors - Bump TDX dependencies (QEMU and Kernel) - OVMF / td-shim: Adjust final tarball location - libs: fix CI error for protocols - runtime-rs: merge main to runtime-rs - packaging: Add support for building TDVF - versions: Track and add support for building TD-shim - versions: Upgrade rust version - Merge Main into runtime-rs branch - agent: log RPC calls for debugging - runtime-rs: fix stop failed in azure - Add support AmdSev build of OVMF - runtime: Support for host cgroupv2 - versions: Update runc version - qemu: Add liburing to qemu build - runtime-rs: fix set share sandbox pid namespace - Docs: fix tables format error - versions: Update Firecracker version to v1.1.0 - agent: Fix stream fd's double close - container: kill all of the processes in a container when it terminated - fix network failed for kata ci - runtime-rs: handle default_vcpus greator than default_maxvcpu - agent: fix fd-double-close problem in ut test_do_write_stream - runtime-rs: add functionalities support for macvlan and vlan endpoints - Docs: add 
rust environment setup for kata 3.0 - rustjail: check result to let it return early - upgrade nydus version - support disable_guest_seccomp - cgroups: remove unnecessary get_paths() - versions: Update firecracker version - kata-monitor: fix can't monitor /run/vc/sbs - runtime-rs: fix sandbox_cgroup_only=false panic - runtime-rs: fix ctr exit failed - docs: add installation guide for kata 3.0 - runtime-rs: support functionalities of ipvlan endpoint - runtime-rs: remove the value of hypervisor path in DB config - kata-sys-util: upgrade nix version - runtime-rs: fix some bugs to make runtime-rs on aarch64 - runk: Support `exec` sub-command - runtime-rs: hypervisor part - clh: Don't crash if no network device is set by the upper layer - packaging: Rework how ${BUILD_SUFFIX} is used with the QEMU builder scripts - versions: Update Cloud Hypervisor to v25.0 - Runtime-rs merge main - kernel: Deduplicate code used for building TEE kernels - runtime-rs: Dragonball-sandbox - add virtio device feature support for aarch64 - packaging: Simplify config path handling - build: save lines for repository_owner check - kata 3.0 Architecture - Fix clh tarball build - runtime-rs: built-in Dragonball sandbox part III - virtio-blk, virtio-fs, virtio-net and VMM API support - runtime: Fix DisableSelinux config - docs: Update URL links for containerd documentation - docs: delete CRI containerd plugin statement - release: Revert kata-deploy changes after 2.5.0-rc0 release - tools/snap: simplify nproc - action: revert commit message limit to 150 bytes - runtime-rs: Dragonball sandbox - add Vcpu::configure() function for aarch64 - runtime-rs: makefile for dragonball - runtime-rs:refactor network model with netlink - runtime-rs: Merge Main into runtime-rs branch - runtime-rs: built-in Dragonball sandbox part II - vCPU manager - runtime-rs: runtime-rs merge main - runtime-rs: built-in Dragonball sandbox part I - resource and device managers `caada34f1` runtime-rs: fix design doc's typo 
`b61dda40b` docs: use curl as default downloader for runtime-rs `ca9d16e5e` runtime-rs: update Cargo.lock `99a7b4f3e` workflow: Revert "static-checks: Allow Merge commit to be >75 chars" `d14e80e9f` workflow: Revert "docs: modify move-issues-to-in-progress.yaml" `1f4b6e646` versions: Update libseccomp version `8a4e69008` versions: Update TD-shim due to build breakage `065305f4a` agent-ctl: Add an empty [workspace] `1444d7ce4` packaging: Create no_patches.txt for the SPR-BKC-PC-v9.6.x `2ae807fd2` nydus: wait nydusd API server ready before mounting share fs `c8d4ea84e` docs: Improve SGX documentation `d8ad16a34` runtime: add unlock before return in sendReq `8bbffc42c` runtime-rs:update rtnetlink version `c5452faec` docs: Improve SGX documentation `389ae9702` runtime-rs:skip the test when the arch is s390x `945e02227` runtime-rs:skip the build process when the arch is s390x `8d1cb1d51` td-shim: Adjust final tarball location `62f05d4b4` ovmf: Adjust final tarball location `9972487f6` versions: Bump Kernel TDX version `c9358155a` kernel: Sort the TDX configs alphabetically `dd397ff1b` versions: Bump QEMU TDX version `230a22905` runk: add ps sub-command `889557ecb` docs: add back host network limitation `c9b5bde30` versions: Track and build TDVF `e6a5a5106` packaging: Generate a tarball as OVMF build result `42eaf19b4` packaging: Simplify OVMF repo clone `4d33b0541` packaging: Don't hardcode "edk2" as the cloned repo's dir. 
`7247575fa` runtime-rs:fix cargo clippy `b06bc8228` versions: Track and add support for building TD-shim `86ac653ba` libs: fix CI error for protocols `81fe51ab0` agent: fix unittests for arp neighbors `845c1c03c` agent: use rtnetlink's neighbours API to add neighbors `9b1940e93` versions: update rust version `638c2c416` static-build: Add AmdSev option for OVMF builder Introduces new build of firmware needed for SEV `f0b58e38d` static-build: Add build script for OVMF `fa0b11fc5` runtime-rs: fix stdin hang in azure `5c3155f7e` runtime: Support for host cgroup v2 `4ab45e5c9` docs: Update support for host cgroupv2 `326eb2f91` versions: Update runc version `f5aa6ae46` agent: Fix stream fd's double close problem `6e149b43f` Docs: fix tables format error `85f4e7caf` runtime: explicitly mark the source of the log is from qemu.log `56d49b507` versions: Update Firecracker version to v1.1.0 `b3147411e` runtime-rs:add unit test for set share pid ns `1ef3f8eac` runtime-rs: set share sandbox pid namespace `57c556a80` runtime-rs: fix stop failed in azure `0e24f47a4` agent: log RPC calls for debugging `c825065b2` runtime-rs: fix tc filter setup failed `e0194dcb5` runtime-rs: update route destination with prefix `fa85fd584` docs: add rust environment setup for kata 3.0 `896478c92` runtime-rs: add functionalities support for macvlan and vlan endpoints `df79c8fe1` versions: Update firecracker version `912641509` agent: fix fd-double-close problem in ut test_do_write_stream `43045be8d` runtime-rs: handle default_vcpus greator than default_maxvcpu `0d7cb7eb1` agent: delete agent-type property in announce `eec9ac81e` rustjail: check result to let it return early. `402bfa0ce` nydus: upgrade nydus/nydus-snapshotter version `54f53d57e` runtime-rs: support disable_guest_seccomp `4331ef80d` Runtime-rs: add installation guide for rust-runtime `72dbd1fcb` kata-monitor: fix can't monitor /run/vc/sbs. 
`e9988f0c6` runtime-rs: fix sandbox_cgroup_only=false panic `cebbebbe8` runtime-rs: fix ctr exit failed `62182db64` runtime-rs: add unit test for ipvlan endpoint `99654ce69` runtime-rs: update dbs-xxx dependencies `f4c3adf59` runtime-rs: Add compile option file `545ae3f0e` runtime-rs: fix warning `19eca71cd` runtime-rs: remove the value of hypervisor path in DB config `d8920b00c` runtime-rs: support functionalities of ipvlan endpoint `2b01e9ba4` dragonball: fix warning `996a6b80b` kata-sys-util: upgrade nix version `f690b0aad` qemu: Add liburing to qemu build `d93e4b939` container: kill all of the processes in this container `3c989521b` dragonball: update for review `274598ae5` kata-runtime: add dragonball config check support. `1befbe673` runtime-rs: Cargo lock for fix version problem `3d6156f6e` runtime-rs: support dragonball and runtime-binary `3f6123b4d` libs: update configuration and annotations `9ae2a45b3` cgroups: remove unnecessary get_paths() `be31207f6` clh: Don't crash if no network device is set by the upper layer `051181249` packaging: Add a "-" in the dir name if $BUILD_DIR is available `dc3b6f659` versions: Update Cloud Hypervisor to v25.0 `201ff223f` packaging: Use the $BUILD_SUFFIX when renaming the qemu binary `1a25afcdf` kernel: Allow passing the URL to download the tarball `80c68b80a` kernel: Deduplicate code used for building TEE kernels `d2584991e` dragonball: fix dependency unused warning `458f6f42f` dragonball: use const string for legacy device type `939959e72` docs: add Dragonball to hypervisors `f6f96b8fe` dragonball: add legacy device support for aarch64 `7a4183980` dragonball: add device info support for aarch64 `f7ccf92dc` kata-deploy: Rely on the configured config path `386a523a0` kata-deploy: Pass the config path to CRI-O `13df57c39` build: save lines for repository_owner check `57c2d8b74` docs: Update URL links for containerd documentation `e57a1c831` build: Mark git repos as safe for build `2551924bd` docs: delete CRI containerd 
plugin statement `9cee52153` fmt: do cargo fmt and add a dependency for blk_dev `47a4142e0` fs: change vhostuser and virtio into const `e14e98bbe` cpu_topo: add handle_cpu_topology function `5d3b53ee7` downtime: add downtime support `6a1fe85f1` vfio: add vfio as TODO `5ea35ddcd` refractor: remove redundant by_id `b646d7cb3` config: remove ht_enabled `cb54ac6c6` memory: remove reserve_memory_bytes `bde6609b9` hotplug: add room for other hotplug solution `d88b1bf01` dragonball: update vsock dependency `dd003ebe0` Dragonball: change error name and fix compile error `38957fe00` UT: fix compile error in unit tests `11b3f9514` dragonball: add virtio-fs device support `948381bdb` dragonball: add virtio-net device support `3d20387a2` dragonball: add virtio-blk device support `87d38ae49` Doc: add document for Dragonball API `2bb1eeaec` docs: further questions related to upcall `026aaeecc` docs: add FAQ to the report `fffcb8165` docs: update the content of the report `42ea854eb` docs: kata 3.0 Architecture `efdb92366` build: Fix clh source build as normal user `0e40ecf38` tools/snap: simplify nproc `f59939a31` runk: Support `exec` sub-command `4d89476c9` runtime: Fix DisableSelinux config `090de2dae` dragonball: fix the clippy errors. 
`a1593322b` dragonball: add vsock api to api server `89b9ba860` dragonball: add set_vm_configuration api `95fa0c70c` dragonball: add start microvm support `5c1ccc376` dragonball: add Vmm struct `4d234f574` dragonball: refactor code layout `cfd5dae47` dragonball: add vm struct `527b73a8e` dragonball: remove unused feature in AddressSpaceMgr `3bafafec5` action: extend commit message line limit to 150 bytes `5010c643c` release: Revert kata-deploy changes after 2.5.0-rc0 release `7120afe4e` dragonball: add vcpu test function for aarch64 `648d285a2` dragonball: add vcpu support for aarch64 `7dad7c89f` dragonball: update dbs-xxx dependency `07231b2f3` runtime-rs:refactor network model with netlink `c8a905206` build: format files `242992e3d` build: put install methods in utils.mk `8a697268d` build: makefile for dragonball config `9c526292e` runtime-rs:refactor network model with netlink `71db2dd5b` hotplug: add room for future acpi hotplug mechanism `8bb00a3dc` dragonball: fix a bug when generating kernel boot args `2aedd4d12` doc: add document for vCPU, api and device `bec22ad01` dragonball: add api module `07f44c3e0` dragonball: add vcpu manager `78c971875` dragonball: add upcall support `7d1953b52` dragonball: add vcpu `468c73b3c` dragonball: add kvm context `e89e6507a` dragonball: add signal handler `b6cb2c4ae` dragonball: add metrics system `e80e0c464` dragonball: add io manager wrapper `d5ee3fc85` safe-path: fix clippy warning `93c10dfd8` runtime-rs: add crosvm license in Dragonball `dfe6de771` dragonball: add dragonball into kata README `39ff85d61` dragonball: green ci `71f24d827` dragonball: add Makefile. `a1df6d096` Doc: Update Dragonball Readme and add document for device `8619f2b3d` dragonball: add virtio vsock device manager. `52d42af63` dragonball: add device manager. `c1c1e5152` dragonball: add kernel config. `6850ef99a` dragonball: add configuration manager. `0bcb422fc` dragonball: add legacy devices manager `3c45c0715` dragonball: add console manager. 
`3d38bb300` dragonball: add address space manager. `aff604055` dragonball: add resource manager support. `8835db6b0` dragonball: initial commit `9cb15ab4c` agent: add the FSGroup support `ff7874bc2` protobuf: upgrade the protobuf version to 2.27.0 `06f398a34` runtime-rs: use withContext to evaluate lazily `fd4c26f9c` runtime-rs: support network resource `4be7185aa` runtime-rs: runtime part implement `10343b1f3` runtime-rs: enhance runtimes `9887272db` libs: enhance kata-sys-util and kata-types `3ff0db05a` runtime-rs: support rootfs volume for resource `234d7bca0` runtime-rs: support cgroup resource `75e282b4c` runtime-rs: hypervisor base define `bdfee005f` runtime-rs: service and runtime framework `4296e3069` runtime-rs: agent implements `d3da156ee` runtime-rs: uint FsType for s390x `e705ee07c` runtime-rs: update containerd-shim-protos to 0.2.0 `8c0a60e19` runtime-rs: modify the review suggestion `278f843f9` runtime-rs: shim implements for runtime-rs `641b73610` libs: enhance kata-sys-util `69ba1ae9e` trans: fix the issue of wrong swapness type `d2a9bc667` agent: agent-protocol support async `aee9633ce` libs/sys-util: provide functions to execute hooks `8509de0ae` libs/sys-util: add function to detect and update K8s emptyDir volume `6d59e8e19` libs/sys-util: introduce function to get device id `5300ea23a` libs/sys-util: implement reflink_copy() `1d5c898d7` libs/sys-util: add utilities to parse NUMA information `87887026f` libs/sys-util: add utilities to manipulate cgroup `ccd03e2ca` libs/sys-util: add wrappers for mount and fs `45a00b4f0` libs/sys-util: add kata-sys-util crate under src/libs `48c201a1a` libs/types: make the variable name easier to understand `b9b6d70aa` libs/types: modify implementation details `05ad026fc` libs/types: fix implementation details `d96716b4d` libs/types:fix styles and implementation details `6cffd943b` libs/types:return Result to handle parse error `6ae87d9d6` libs/types: use contains to make code more readable `45e5780e7` libs/types: 
fixed spelling and grammer error `2599a06a5` libs/types:use include_str! in test file `8ffff40af` libs/types:Option type to handle empty tomlconfig `626828696` libs/types: add license for test-config.rs `97d8c6c0f` docs: modify move-issues-to-in-progress.yaml `8cdd70f6c` libs/types: change method to update config by annotation `e19d04719` libs/types: implement KataConfig to wrap TomlConfig `387ffa914` libs/types: support load Kata agent configuration from file `69f10afb7` libs/types: support load Kata hypervisor configuration from file `21cc02d72` libs/types: support load Kata runtime configuration from file `5b89c1df2` libs/types: add kata-types crate under src/libs `4f62a7618` libs/logging: fix clippy warnings `6f8acb94c` libs: refine Makefile rules `7cdee4980` libs/logging: introduce a wrapper writer for logging `426f38de9` libs/logging: implement rotator for log files `392f1ecdf` libs: convert to a cargo workspace `575df4dc4` static-checks: Allow Merge commit to be >75 chars Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-15 07:23:13 +00:00
Ji-Xinyou	ff7c78e0e8	runtime-rs: static resource mgmt default to false Static resource management should default to false. If it defaults to true, later sandbox update operations, e.g. resize, will not work. Fixes: #4742 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-08-15 14:42:38 +08:00
Ji-Xinyou	00f3a6de12	runtime-rs: make static resource mgmt idiomatic Make the get value process (cpu and mem) more idiomatic. Fixes: #4742 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-08-15 11:18:35 +08:00
Zhongtao Hu	4d7f3edbaf	runtime-rs: support the functionality of cleanup Cleanup sandbox resource Fixes: #4891 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-08-13 15:56:38 +08:00
Zhongtao Hu	5aa83754e5	runtime-rs: support save to persist file and restore Support the functionality of save and restore for sandbox state Fixes:#4891 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-08-13 15:44:13 +08:00
Chelsea Mafrica	fcc1e0c617	runtime: tracing: End root span at end of trace The root span should exist for the duration of the trace. Defer ending the span until the end of the trace instead of the end of the function. Add the span to the service struct to do so. Fixes #4902 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2022-08-12 13:15:39 -07:00
GabyCT	97b7fe438a	Merge pull request #4898 from openanolis/fixdoc runtime-rs: fix design doc's typo	2022-08-12 10:06:44 -05:00
Bin Liu	2cd964ca79	Merge pull request #4881 from openanolis/runtime-rs-curl docs: use curl as default downloader for runtime-rs	2022-08-12 19:46:39 +08:00
Bin Liu	6a8e8dfc8e	Merge pull request #4876 from liubin/fix/4875-update-Cargo-lock runtime-rs: update Cargo.lock	2022-08-12 19:41:02 +08:00
Ji-Xinyou	caada34f1d	runtime-rs: fix design doc's typo Fix docs/design/architecture_3.0's typo. Both source code and png. Fixes: #4883 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-08-12 17:38:13 +08:00
Bin Liu	bfa86246f8	Merge pull request #4872 from liubin/fix/4871-github-actions-fix Fix some GitHub actions workflow issues	2022-08-11 19:26:15 +08:00
Zhongtao Hu	c280d6965b	runtime-rs: delete route model The route model is used for a specific internal scenario and is not a general requirement. Fixes:#4838 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-08-11 15:56:43 +08:00
Zhongtao Hu	b61dda40b7	docs: use curl as default downloader for runtime-rs use curl as default downloader for runtime-rs Fixes: #4879 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-08-11 15:52:13 +08:00
Fabiano Fidêncio	881c87a25c	Merge pull request #4859 from GabyCT/topic/updatelibse versions: Update libseccomp version	2022-08-11 09:34:44 +02:00
Bin Liu	ca9d16e5ea	runtime-rs: update Cargo.lock Update Cargo.lock Fixes: #4875 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-08-11 10:34:36 +08:00
Ji-Xinyou	4a54876dde	runtime-rs: support static resource management functionality Supports functionalities of static resource management, enabled by default. Fixes: #4742 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-08-11 09:46:44 +08:00
Bin Liu	99a7b4f3e1	workflow: Revert "static-checks: Allow Merge commit to be >75 chars" This reverts commit `575df4dc4d`. Fixes: #4871 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-08-11 08:59:02 +08:00
Bin Liu	d14e80e9fd	workflow: Revert "docs: modify move-issues-to-in-progress.yaml" This reverts commit `97d8c6c0fa`. Fixes: #4871 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-08-11 08:58:43 +08:00
Bin Liu	cb7f9524be	Merge pull request #4804 from openanolis/anolis/merge_runtime_rs_to_main runtime-rs:merge runtime rs to main	2022-08-11 08:40:41 +08:00
Tim Zhang	4813a3cef9	Merge pull request #4711 from liubin/fix/4710-wait-nydusd-api-server-ready nydus: wait nydusd API server ready before mounting share fs	2022-08-10 17:20:17 +08:00
Gabriela Cervantes	1f4b6e6460	versions: Update libseccomp version This PR updates the libseccomp version at the versions.yaml that is being used in the kata CI. Fixes #4858 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-08-09 14:27:59 +00:00
GabyCT	4d07c86cf1	Merge pull request #4846 from fidencio/topic/update-td-shim-due-to-build-breakage versions: Update TD-shim due to build breakage	2022-08-08 11:50:49 -05:00
Fabiano Fidêncio	b0fa44165e	Merge pull request #4844 from fidencio/topic/agent-ctl-add-an-empty-workspace agent-ctl: Add an empty [workspace]	2022-08-08 17:08:43 +02:00
Fabiano Fidêncio	a8176d0218	Merge pull request #4842 from fidencio/topic/packaging-create-no_patches.txt-for-the-SPR-BKC-PC-v9.6.x-kernel packaging: Create no_patches.txt for the SPR-BKC-PC-v9.6.x	2022-08-08 17:05:26 +02:00
Fabiano Fidêncio	8a4e690089	versions: Update TD-shim due to build breakage "We need a newer nightly 1.62 rust to deal with the change rust-lang/libc@576f778 on crate libc which breaks the compilation." This comes from a pull request raised on the TD-shim repo, https://github.com/confidential-containers/td-shim/pull/354, which fixes the issues with the commit being used with Kata Containers. Let's bump to a newer commit of TD-shim and to a newer version of the nightly toolchain as part of our versions file. Fixes: #4840 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-08 15:53:57 +02:00
Fabiano Fidêncio	8854b4de2c	Merge pull request #4836 from cmaf/sgx-update-docs-2 docs: Improve SGX documentation	2022-08-08 12:15:04 +02:00
Fabiano Fidêncio	065305f4a1	agent-ctl: Add an empty [workspace] "An empty [workspace] can be used with a package to conveniently create a workspace with the package and all of its path dependencies", according to https://doc.rust-lang.org/cargo/reference/workspaces.html This also matches the suggestion provided by Cargo itself, due to the errors faced with the Cloud Hypervisor CI: ``` 10:46:23 this may be fixable by adding `go/src/github.com/kata-containers/kata-containers/src/tools/agent-ctl` to the `workspace.members` array of the manifest located at: /tmp/jenkins/workspace/kata-containers-2-clh-PR/Cargo.toml 10:46:23 Alternatively, to keep it out of the workspace, add the package to the `workspace.exclude` array, or add an empty `[workspace]` table to the package's manifest. ``` Fixes: #4843 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-08 11:24:39 +02:00
Fabiano Fidêncio	1444d7ce42	packaging: Create no_patches.txt for the SPR-BKC-PC-v9.6.x The file was added as part of the commit that tested these changes in the CCv0 branch, but forgotten when re-writing it to the `main` branch. Fixes: #4841 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-08 11:00:23 +02:00
liubin	2ae807fd29	nydus: wait nydusd API server ready before mounting share fs If the API server is not ready, the mount call will fail, so before mounting share fs, we should wait until nydusd is started and the API server is ready. Fixes: #4710 Signed-off-by: liubin <liubin0329@gmail.com> Signed-off-by: Bin Liu <bin@hyper.sh>	2022-08-08 16:18:38 +08:00
Tim Zhang	8d4d98587f	Merge pull request #4746 from liubin/fix/4745-add-log-field runtime: explicitly mark the source of the log is from qemu.log	2022-08-08 15:21:01 +08:00
Bin Liu	9516286f6d	Merge pull request #4829 from LetFu/fix/addUnlock runtime: add unlock before return in sendReq	2022-08-08 14:42:44 +08:00
Archana Shinde	c1e3b8f40f	govmm: Refactor qmp functions for adding block device Instead of passing a bunch of arguments to qmp functions for adding block devices, use govmm BlockDevice structure to reduce these. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-05 13:16:34 -07:00
Archana Shinde	598884f374	govmm: Refactor code to get rid of redundant code Get rid of redundant return values from function. args and blockdevArgs used to return different values to maintain compatibility between qemu versions. These are exactly the same now. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-05 13:16:34 -07:00
Archana Shinde	00860a7e43	qmp: Pass aio backend while adding block device Allow govmm to pass aio backend while adding block device. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-05 13:16:34 -07:00
Archana Shinde	e1b49d7586	config: Add block aio as a supported annotation Allow Block AIO to be passed as a per pod annotation. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-05 13:16:34 -07:00
Archana Shinde	ed0f1d0b32	config: Add "block_device_aio" as a config option for qemu This configuration will allow users to choose between different I/O backends for qemu, with the default being io_uring. This will allow users to fall back to a different I/O mechanism while running on kernels older than 5.1. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-05 13:16:34 -07:00
Archana Shinde	83a919a5ea	Merge pull request #4795 from liubin/fix/4794-update-limitation docs: add back host network limitation	2022-08-05 23:00:47 +05:30
Chelsea Mafrica	c8d4ea84e3	docs: Improve SGX documentation Remove line about annotations support in CRI-O and containerd since it has been supported for a couple years. Fixes #4819 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2022-08-05 09:57:50 -07:00
Fabiano Fidêncio	e2968b177d	Merge pull request #4763 from cyyzero/runk-ps runk: add ps sub-command	2022-08-05 16:28:38 +02:00
chmod100	d8ad16a34e	runtime: add unlock before return in sendReq An unlock is required before returning, so add the missing unlock. Fixes: #4827 Signed-off-by: chmod100 <letfu@outlook.com>	2022-08-05 13:30:12 +00:00
Peng Tao	b828190158	Merge pull request #4823 from openanolis/runtime-rs-merge-main-runtime-rs Depends-on:github.com/kata-containers/tests#4986 Runtime-rs:merge main runtime rs	2022-08-05 14:42:22 +08:00
Peng Tao	f791169efc	Merge pull request #4826 from openanolis/runtime-rs-version runtime-rs:update rtnetlink version	2022-08-05 14:28:46 +08:00
Zhongtao Hu	8bbffc42cf	runtime-rs:update rtnetlink version update rtnetlink version for runtime-rs Fixes:#4824 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-08-05 11:18:09 +08:00
Zhongtao Hu	e403838131	runtime-rs: Merge remote-tracking branch 'origin/main' into runtime-rs To keep runtime-rs up to date, we will merge main into runtime-rs every week. Fixes:kata-containers#4822 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-08-05 10:49:33 +08:00
Bin Liu	931251105b	Merge pull request #4817 from openanolis/runtime-rs-s390x-fail runtime-rs:skip the build process when the arch is s390x	2022-08-05 08:23:13 +08:00
Salvador Fuentes	587c0c5e55	Merge pull request #4820 from cmaf/sgx-update-docs-1 docs: Improve SGX documentation	2022-08-04 15:59:33 -05:00
Chelsea Mafrica	c5452faec6	docs: Improve SGX documentation Update documentation with details regarding intel-device-plugins-for-kubernetes setup and dependencies. Fixes #4819 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2022-08-04 12:49:01 -07:00
GabyCT	2764bd7522	Merge pull request #4770 from justxuewei/refactor/agent/netlink-neighbor agent: Use rtnetlink's neighbours API to add neighbors	2022-08-04 12:09:30 -05:00
Zhongtao Hu	389ae97020	runtime-rs:skip the test when the arch is s390x github.com/kata-containers/tests#4986. To avoid returning an error when running the CI, we just skip the test if the arch is s390x Fixes: #4816 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-08-04 21:13:50 +08:00
Zhongtao Hu	945e02227c	runtime-rs:skip the build process when the arch is s390x github.com/kata-containers/tests#4986. To avoid returning an error when running the CI, we just skip the build process if the arch is s390x Fixes: #4816 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-08-04 21:13:40 +08:00
Archana Shinde	b6cd2348f5	govmm: Add io_uring as AIO type io_uring was introduced as a new kernel IO interface in kernel 5.1. It is designed for higher performance than the older Linux AIO API. This feature was added in qemu 5.0. Fixes #4645 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-03 10:43:12 -07:00
Archana Shinde	81cdaf0771	govmm: Correct documentation for Linux aio. The comments for "native" aio are incorrect. Correct these. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-03 10:41:50 -07:00
Fabiano Fidêncio	578121124e	Merge pull request #4805 from fidencio/topic/bump-tdx-dependencies Bump TDX dependencies (QEMU and Kernel)	2022-08-03 19:31:26 +02:00
Fabiano Fidêncio	869e408516	Merge pull request #4810 from fidencio/topic/adjust-final-tarball-location-for-tdvf-and-td-shim OVMF / td-shim: Adjust final tarball location	2022-08-03 16:55:14 +02:00
Fabiano Fidêncio	8d1cb1d513	td-shim: Adjust final tarball location Let's create the td-shim tarball in the directory where the script was called from, instead of doing it in the $DESTDIR. This aligns with the logic being used for creating / extracting the tarball content, which is already in use by the kata-deploy local build scripts. Fixes: #4809 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-03 14:58:44 +02:00
Fabiano Fidêncio	62f05d4b48	ovmf: Adjust final tarball location Let's create the OVMF tarball in the directory where the script was called from, instead of doing it in the $DESTDIR. This aligns with the logic being used for creating / extracting the tarball content, which is already in use by the kata-deploy local build scripts. Fixes: #4808 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-03 14:58:29 +02:00
Fabiano Fidêncio	9972487f6e	versions: Bump Kernel TDX version The latest kernel with TDX support should be pulled from a different repo (https://github.com/intel/linux-kernel-dcp, instead of https://github.com/intel/tdx), and the latest version to be used is SPR-BKC-PC-v9.6. With the new version being used, let's make sure we enable the INTEL_TDX_ATTESTATION config option, and all the dependencies needed to do so. Fixes: #4803 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-03 12:00:49 +02:00
Fabiano Fidêncio	c9358155a2	kernel: Sort the TDX configs alphabetically Let's just re-order the TDX configs alphabetically. No new config has been added or removed, thus no need to bump the kernel version. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-03 11:57:02 +02:00
Fabiano Fidêncio	dd397ff1bf	versions: Bump QEMU TDX version Let's use the latest tag provided in the "https://github.com/intel/qemu-dcp" repo, "SPR-BKC-QEMU-v2.5". Fixes: #4802 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-03 11:00:36 +02:00
Ji-Xinyou	a355812e05	runtime-rs: fixed bug on core-sched error handling Kernel code returns -errno, this should check negative values. Fixes: #4429 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-08-03 15:26:48 +08:00
Bin Liu	8b0e1859cb	Merge pull request #4784 from openanolis/fix-protocol-ci-err libs: fix CI error for protocols	2022-08-03 11:03:02 +08:00
Bin Liu	b337390c28	Merge pull request #4791 from openanolis/runtime-rs-merge-main-1 runtime-rs: merge main to runtime-rs	2022-08-03 11:00:54 +08:00
Chelsea Mafrica	873e75b915	Merge pull request #4773 from fidencio/topic/build-tdvf packaging: Add support for building TDVF	2022-08-02 09:14:13 -07:00
Chen Yiyang	230a229052	runk: add ps sub-command The ps command supports two formats, `json` and `table`. The `json` format just outputs the pids in the container. The `table` format uses the `ps` utility on the host to search for and output all processes in the container. Add a struct `container` to represent a spawned container. Move the `kill` implementation from kill.rs to a method of `container`. Fixes: #4361 Signed-off-by: Chen Yiyang <cyyzero@qq.com>	2022-08-02 20:45:50 +08:00
Ji-Xinyou	591dfa4fe6	runtime-rs: add support for core scheduling Linux 5.14 supports core scheduling to have better security control for SMT siblings. This PR supports that. Fixes: #4429 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-08-02 17:54:04 +08:00
Bin Liu	889557ecb1	docs: add back host network limitation Kata Containers doesn't support the host network namespace, which is a common issue for new users. The limitation was deleted from the docs; this commit adds it back. Also, Docker now supports running containers using Kata Containers, so remove Docker from the unsupported list. This commit reverts parts of #3710 Fixes: #4794 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-08-02 15:58:16 +08:00
Fabiano Fidêncio	c9b5bde30b	versions: Track and build TDVF TDVF is the firmware used by QEMU to start TDX capable VMs. Let's start tracking it as it'll become part of the Confidential Containers sooner or later. TDVF lives in the public https://github.com/tianocore/edk2-staging repo and we're using as its version tags that are consumed internally at Intel. Fixes: #4624 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-02 09:51:47 +02:00
Fabiano Fidêncio	e6a5a5106d	packaging: Generate a tarball as OVMF build result Instead of having as a result the directory where OVMF artefacts were installed, let's follow what we do with the other components and have a tarball as a result of the OVMF build. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-02 09:48:59 +02:00
Fabiano Fidêncio	42eaf19b43	packaging: Simplify OVMF repo clone Instead of cloning the repo, and then switching to a specific branch, let's take advantage of `--branch` and directly clone the specific branch / tag. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-02 09:48:59 +02:00
Fabiano Fidêncio	4d33b0541d	packaging: Don't hardcode "edk2" as the cloned repo's dir. As TDVF comes from a different repo, the edk2-staging one, we cannot simply hardcode the name. Instead, let's get the name of the directory from name of the git repo. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-02 09:48:59 +02:00
Zhongtao Hu	7247575fa2	runtime-rs:fix cargo clippy fix cargo clippy Fixes: #4791 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-08-02 13:17:37 +08:00
Zhongtao Hu	9803393f2f	runtime-rs: Merge branch 'main' into runtime-rs-merge-main-1 To keep runtime-rs up to date, we will merge main into runtime-rs every week. Fixes: #4790 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-08-02 10:53:01 +08:00
Fabiano Fidêncio	7503bdab6e	Merge pull request #4783 from fidencio/topic/build-td-shim versions: Track and add support for building TD-shim	2022-08-01 20:50:58 +02:00
Fabiano Fidêncio	b06bc82284	versions: Track and add support for building TD-shim TD-shim is a simplified TDX virtual firmware, used by Cloud Hypervisor, in order to create a TDX capable VM. TD-shim is heavily under development, and is hosted as part of the Confidential Containers project: https://github.com/confidential-containers/td-shim The version chosen for this commit, is a version that's being tested inside Intel, but we, most likely, will need to change it before we have it officially packaged as part of an official release. Fixes: #4779 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-01 16:36:12 +02:00
Bin Liu	8d9135a7ce	Merge pull request #4765 from ryansavino/ccv0-rust-upgrade versions: Upgrade rust version	2022-08-01 17:15:05 +08:00
Quanwei Zhou	86ac653ba7	libs: fix CI error for protocols Fix CI error for protocols. Fixes: #4781 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-08-01 16:26:52 +08:00
Xuewei Niu	81fe51ab0b	agent: fix unittests for arp neighbors Set an ARP address explicitly before netlink::test_add_one_arp_neighbor() running. Signed-off-by: Xuewei Niu <justxuewei@apache.org>	2022-08-01 16:19:25 +08:00
Xuewei Niu	845c1c03cf	agent: use rtnetlink's neighbours API to add neighbors Bump rtnetlink version from 0.8.0 to 0.11.0. Use rtnetlink's API to add neighbors and fix issues to adapt to the new version of rtnetlink. Fixes: #4607 Signed-off-by: Xuewei Niu <justxuewei@apache.org>	2022-08-01 13:44:07 +08:00
Bin Liu	993ae24080	Merge pull request #4777 from openanolis/runtime-rs-merge Merge Main into runtime-rs branch	2022-08-01 13:04:31 +08:00
Zhongtao Hu	adfad44efe	Merge remote-tracking branch 'origin/main' into runtime-rs-merge-tmp To keep runtime-rs up to date, we will merge main into runtime-rs every week. Fixes:#4776 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-08-01 11:12:48 +08:00
Ryan Savino	9b1940e93e	versions: update rust version Fixes #4764 versions: update rust version to fix ccv0 attestation-agent build error static-checks: kata tools, libs, and agent fixes Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2022-07-29 18:41:43 -05:00
Peng Tao	0aefab4d80	Merge pull request #4739 from liubin/fix/4738-trace-rpc-calls agent: log RPC calls for debugging	2022-07-29 14:18:23 +08:00
Peng Tao	5457deb034	Merge pull request #4741 from openanolis/fix-stop-failed-in-azure runtime-rs: fix stop failed in azure	2022-07-29 11:41:16 +08:00
Fabiano Fidêncio	54147db921	Merge pull request #4170 from Alex-Carter01/build-amdsev-ovmf Add support AmdSev build of OVMF	2022-07-28 19:42:50 +02:00
Alex Carter	638c2c4164	static-build: Add AmdSev option for OVMF builder Introduces new build of firmware needed for SEV Fixes: kata-containers#4169 Signed-off-by: Alex Carter <alex.carter@ibm.com>	2022-07-28 09:56:06 -05:00
Alex Carter	f0b58e38d2	static-build: Add build script for OVMF Introduces a build script for OVMF. Defaults to X86_64 build (x64 in OVMF) Fixes: #4169 Signed-off-by: Alex Carter <alex.carter@ibm.com>	2022-07-28 09:07:49 -05:00
Quanwei Zhou	fa0b11fc52	runtime-rs: fix stdin hang in azure Fix stdin hang in azure. Fixes: #4740 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-07-28 16:16:37 +08:00
Bin Liu	a67402cc1f	Merge pull request #4397 from yaoyinnan/3073/ftr/host-cgroupv2 runtime: Support for host cgroupv2	2022-07-28 14:30:03 +08:00
Tim Zhang	229ff29c0f	Merge pull request #4758 from GabyCT/topic/updaterunc versions: Update runc version	2022-07-28 14:12:58 +08:00
yaoyinnan	5c3155f7e2	runtime: Support for host cgroup v2 Support cgroup v2 on the host. Update vendor containerd/cgroups to add cgroup v2. Fixes: #3073 Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>	2022-07-28 10:30:45 +08:00
yaoyinnan	4ab45e5c93	docs: Update support for host cgroupv2 Currently cgroup v2 is supported. Remove the note that host cgroup v2 is not supported. Fixes: #3073 Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>	2022-07-28 10:30:44 +08:00
GabyCT	9dfd949f23	Merge pull request #4646 from amshinde/add-liburing-qemu qemu: Add liburing to qemu build	2022-07-27 15:47:49 -05:00
Gabriela Cervantes	326eb2f910	versions: Update runc version This PR updates the runc version to v1.1.0. Fixes #4757 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-07-27 16:19:11 +00:00
Bin Liu	50b0b7cc15	Merge pull request #4681 from Tim-0731-Hzt/runtime-rs-sharepid runtime-rs: fix set share sandbox pid namespace	2022-07-27 21:43:58 +08:00
Bin Liu	557229c39d	Merge pull request #4724 from yahaa/fix-docs Docs: fix tables format error	2022-07-27 21:13:29 +08:00
Bin Liu	09672eb2da	agent: do some rollback works if case of do_create_container failed In some cases do_create_container may return an error, mostly due to the `container.start(process)` call. This commit does some rollback work if this function fails. Fixes: #4749 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-07-27 10:23:46 +08:00
Archana Shinde	1b01ea53d9	Merge pull request #4735 from nubificus/feature-fc-v1.1 versions: Update Firecracker version to v1.1.0	2022-07-27 04:50:32 +05:30
Peng Tao	27c82018d1	Merge pull request #4753 from Tim-Zhang/agent-fix-stream-fd-double-close agent: Fix stream fd's double close	2022-07-27 00:54:07 +08:00
Bin Liu	6fddf031df	Merge pull request #4664 from lifupan/main container: kill all of the processes in a container when it terminated	2022-07-26 23:12:11 +08:00
Tim Zhang	f5aa6ae467	agent: Fix stream fd's double close problem The fd is closed when the Pipestream is dropped, so we should not close it again. Fixes: #4752 Signed-off-by: Tim Zhang <tim@hyper.sh>	2022-07-26 20:05:06 +08:00
yahaa	6e149b43f7	Docs: fix tables format error Fixes: #4725 Signed-off-by: yahaa <1477765176@qq.com>	2022-07-26 19:05:09 +08:00
Bin Liu	85f4e7caf6	runtime: explicitly mark the source of the log is from qemu.log In qemu.StopVM(), if debug is enabled, the shim will dump logs from qemu.log, but users don't know which logs are from qemu.log and shim itself. Adding some additional messages will help users to distinguish these logs. Fixes: #4745 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-07-26 16:08:59 +08:00
Peng Tao	129335714b	Merge pull request #4727 from openanolis/anolis-fix-network fix network failed for kata ci	2022-07-26 15:10:55 +08:00
Peng Tao	71384b60f3	Merge pull request #4713 from openanolis/adjust_default_vcpu runtime-rs: handle default_vcpus greator than default_maxvcpu	2022-07-26 15:02:34 +08:00
gntouts	56d49b5073	versions: Update Firecracker version to v1.1.0 This patch upgrades Firecracker version from v0.23.4 to v1.1.0 * Generate swagger models for v1.1.0 (from firecracker.yaml) * Replace ht_enabled param to smt (API change) * Remove NUMA-related jailer param --node 0 Fixes: #4673 Depends-on: github.com/kata-containers/tests#4968 Signed-off-by: George Ntoutsos <gntouts@nubificus.co.uk> Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2022-07-26 07:01:26 +00:00
Zhongtao Hu	b3147411e3	runtime-rs:add unit test for set share pid ns Fixes:#4680 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-07-26 14:42:00 +08:00
Zhongtao Hu	1ef3f8eac6	runtime-rs: set share sandbox pid namespace Set the share sandbox pid namespace from spec Fixes:#4680 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-07-26 14:41:59 +08:00
Quanwei Zhou	57c556a801	runtime-rs: fix stop failed in azure Fix the stop failed in azure. Fixes: #4740 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-07-26 12:16:32 +08:00
liubin	0e24f47a43	agent: log RPC calls for debugging We can log all RPC calls to the agent for debugging purposes to check which RPC is called, which can help us to understand the container lifespan. Fixes: #4738 Signed-off-by: liubin <liubin0329@gmail.com>	2022-07-26 10:32:44 +08:00
Tim Zhang	e764a726ab	Merge pull request #4715 from Tim-Zhang/fix-ut-test_do_write_stream agent: fix fd-double-close problem in ut test_do_write_stream	2022-07-25 17:34:26 +08:00
Peng Tao	3f4dd92c2d	Merge pull request #4702 from openanolis/runtime-rs-endpoint-dev runtime-rs: add functionalities support for macvlan and vlan endpoints	2022-07-25 17:04:45 +08:00
Peng Tao	a3127a03f3	Merge pull request #4721 from openanolis/install-guide-2 Docs: add rust environment setup for kata 3.0	2022-07-25 16:50:20 +08:00
Tim Zhang	427b29454a	Merge pull request #4709 from liubin/fix/4708-unwrap-error rustjail: check result to let it return early	2022-07-25 15:05:20 +08:00
Tim Zhang	0337377838	Merge pull request #4695 from liubin/4694/upgrade-nydus-version upgrade nydus version	2022-07-25 15:05:04 +08:00
Quanwei Zhou	c825065b27	runtime-rs: fix tc filter setup failed Fix a bug when using the tc filter: the protocol needs to be in network byte order. Fixes: #4726 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-07-25 11:16:33 +08:00
Quanwei Zhou	e0194dcb5e	runtime-rs: update route destination with prefix Update route destination with prefix. Fixes: #4726 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-07-25 11:16:22 +08:00
Bin Liu	534a4920b1	Merge pull request #4692 from openanolis/support_disable_guest_seccomp support disable_guest_seccomp	2022-07-25 11:08:41 +08:00
Zhongtao Hu	fa85fd584e	docs: add rust environment setup for kata 3.0 add more details for rust set up in kata 3.0 install guide Fixes: #4720 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-07-25 09:48:18 +08:00
Wainer Moschetta	0b4a91ec1a	Merge pull request #4644 from bookinabox/optimize-get-paths cgroups: remove unnecessary get_paths()	2022-07-22 17:01:01 -03:00
Ji-Xinyou	896478c92b	runtime-rs: add functionalities support for macvlan and vlan endpoints Add macvlan and vlan support to runtime-rs code and corresponding unit tests. Fixes: #4701 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-07-22 10:09:11 +08:00
GabyCT	68c265587c	Merge pull request #4718 from GabyCT/topic/updatefirecrackerversion versions: Update firecracker version	2022-07-21 14:26:57 -05:00
Gabriela Cervantes	df79c8fe1d	versions: Update firecracker version This PR updates the firecracker version that is being used in kata CI. Fixes #4717 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-07-21 16:10:29 +00:00
Tim Zhang	912641509e	agent: fix fd-double-close problem in ut test_do_write_stream The fd will be closed when struct Process is dropped, so don't close it again manually. Fixes: #4598 Signed-off-by: Tim Zhang <tim@hyper.sh>	2022-07-21 19:37:15 +08:00
Zhongtao Hu	43045be8d1	runtime-rs: handle default_vcpus greator than default_maxvcpu when the default_vcpus is greater than the default_maxvcpus, the default vcpu number should be set equal to the default_maxvcpus. Fixes: #4712 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-07-21 16:37:56 +08:00
liubin	0d7cb7eb16	agent: delete agent-type property in announce Since there is only one type of agent now, the agent-type is not needed anymore. Signed-off-by: liubin <liubin0329@gmail.com>	2022-07-21 14:53:01 +08:00
liubin	eec9ac81ef	rustjail: check result to let it return early. check the result to let it return early if there are some errors Fixes: #4708 Signed-off-by: liubin <liubin0329@gmail.com>	2022-07-21 14:51:30 +08:00
liubin	402bfa0ce3	nydus: upgrade nydus/nydus-snapshotter version Upgrade nydus/nydus-snapshotter to the latest version. Fixes: #4694 Signed-off-by: liubin <liubin0329@gmail.com>	2022-07-21 14:39:14 +08:00
Quanwei Zhou	54f53d57ef	runtime-rs: support disable_guest_seccomp support disable_guest_seccomp Fixes: #4691 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-07-21 07:46:28 +08:00
Peng Tao	6d56cdb9ac	Merge pull request #4686 from xujunjie-cover/issue4685 kata-monitor: fix can't monitor /run/vc/sbs	2022-07-19 23:40:14 +08:00
Bin Liu	540303880e	Merge pull request #4688 from quanweiZhou/fix_sandbox_cgroup_false runtime-rs: fix sandbox_cgroup_only=false panic	2022-07-19 20:38:57 +08:00
Peng Tao	7c146a5d95	Merge pull request #4684 from quanweiZhou/fix-ctr-exit-error runtime-rs: fix ctr exit failed	2022-07-19 16:02:20 +08:00
Peng Tao	08a6581673	Merge pull request #4662 from openanolis/runtime-rs-user-manaul docs: add installation guide for kata 3.0	2022-07-19 15:58:55 +08:00
Zhongtao Hu	4331ef80d0	Runtime-rs: add installation guide for rust-runtime add installation guide for rust-runtime Fixes:#4661 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-07-19 13:12:13 +08:00
Peng Tao	4c3bd6b1d1	Merge pull request #4656 from openanolis/runtime-rs-ipvlan runtime-rs: support functionalities of ipvlan endpoint	2022-07-19 11:15:31 +08:00
xujunjie-cover	72dbd1fcb4	kata-monitor: fix can't monitor /run/vc/sbs. need bind host dir /run/vc/sbs/ to kata monitor Fixes: #4685 Signed-off-by: xujunjie-cover <xujunjielxx@163.com>	2022-07-19 09:52:54 +08:00
Bin Liu	960f2a7f70	Merge pull request #4678 from Tim-0731-Hzt/runtime-rs-makefile-2 runtime-rs: remove the value of hypervisor path in DB config	2022-07-19 09:34:45 +08:00
Quanwei Zhou	e9988f0c68	runtime-rs: fix sandbox_cgroup_only=false panic When run with configuration `sandbox_cgroup_only=false`, we will call `gen_overhead_path()` as the overhead path. The `cgroup-rs` will push the path with the subsystem prefix by `PathBuf::push()`. When the path has prefix “/” it will act as root path, such as ``` let mut path = PathBuf::from("/tmp"); path.push("/etc"); assert_eq!(path, PathBuf::from("/etc")); ``` So we should not set overhead path with prefix "/". Fixes: #4687 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-07-19 08:30:34 +08:00
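The `PathBuf::push()` behaviour described in commit e9988f0c68 can be reproduced with the standard library alone. The sketch below is only an illustration: the helper `join_under` and the example paths are hypothetical and are not taken from runtime-rs or `cgroup-rs`. It shows one way to keep a child path under its base by dropping the leading "/", which is the kind of outcome the commit aims for.

```rust
use std::path::{Path, PathBuf};

/// Join `child` under `base`, treating a leading "/" in `child` as relative.
/// Hypothetical helper for illustration only; not the runtime-rs code.
fn join_under(base: &Path, child: &str) -> PathBuf {
    let mut path = base.to_path_buf();
    // Strip leading slashes so that push() appends instead of replacing.
    path.push(child.trim_start_matches('/'));
    path
}

fn main() {
    // Pushing an absolute path replaces the whole buffer.
    let mut replaced = PathBuf::from("/sys/fs/cgroup/cpu");
    replaced.push("/kata_overhead/sandbox");
    assert_eq!(replaced, PathBuf::from("/kata_overhead/sandbox"));

    // With the leading "/" removed, the child lands under the base.
    let joined = join_under(Path::new("/sys/fs/cgroup/cpu"), "/kata_overhead/sandbox");
    assert_eq!(
        joined,
        PathBuf::from("/sys/fs/cgroup/cpu/kata_overhead/sandbox")
    );
    println!("{}", joined.display());
}
```

Whether to strip the prefix at the call site or to generate the overhead path without it in the first place is a design choice; the commit message indicates the latter.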
Quanwei Zhou	cebbebbe8a	runtime-rs: fix ctr exit failed During use, there will be cases where the container is in the stop state and get another stop. In this case, the second stop needs to be ignored. Fixes: #4683 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-07-19 07:43:22 +08:00
Bin Liu	758cc47b32	Merge pull request #4671 from liubin/4670-upgrade-nix kata-sys-util: upgrade nix version	2022-07-18 23:31:07 +08:00
Bin Liu	25be4d00fd	Merge pull request #4676 from openanolis/xuejun/runtime-rs runtime-rs: fix some bugs to make runtime-rs on aarch64	2022-07-18 23:29:32 +08:00
Ji-Xinyou	62182db645	runtime-rs: add unit test for ipvlan endpoint Add unit test to check the integrity of IPVlanEndpoint::new(...) Fixes: #4655 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-07-18 15:56:06 +08:00
xuejun-xj	99654ce694	runtime-rs: update dbs-xxx dependencies Update dbs-xxx commit ID for aarch64 in runtime-rs/Cargo.toml file to add dependencies for aarch64. Fixes: #4676 Signed-off-by: xuejun-xj <jiyunxue@alibaba.linux.com>	2022-07-18 13:46:46 +08:00
xuejun-xj	f4c3adf596	runtime-rs: Add compile option file Add file aarch64-options.mk for compiling on aarch64 architectures. Fixes: #4676 Signed-off-by: xuejun-xj <jiyunxue@alibaba.linux.com>	2022-07-18 13:46:46 +08:00
xuejun-xj	545ae3f0ee	runtime-rs: fix warning Module anyhow::anyhow is only used on x86_64 architecture in crates/hypervisor/src/device/vfio.rs file. Fixes: #4676 Signed-off-by: xuejun-xj <jiyunxue@alibaba.linux.com>	2022-07-18 13:46:39 +08:00
Zhongtao Hu	19eca71cd9	runtime-rs: remove the value of hypervisor path in DB config As a built-in VMM, Dragonball does not need Path, jailer path, or ctlpath, so we don't generate those values in the Makefile. Fixes: #4677 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-07-18 13:37:51 +08:00
Ji-Xinyou	d8920b00cd	runtime-rs: support functionalities of ipvlan endpoint Add support for ipvlan endpoint Fixes: #4655 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-07-18 11:34:03 +08:00
xuejun-xj	2b01e9ba40	dragonball: fix warning Add map_err for vcpu_manager.set_reset_event_fd() function. Fixes: #4676 Signed-off-by: xuejun-xj <jiyunxue@alibaba.linux.com>	2022-07-18 09:52:13 +08:00
liubin	996a6b80bc	kata-sys-util: upgrade nix version New nix is supporting UMOUNT_NOFOLLOW, upgrade nix version to use this flag instead of the self-defined flag. Fixes: #4670 Signed-off-by: liubin <liubin0329@gmail.com>	2022-07-15 17:38:15 +08:00
Archana Shinde	f690b0aad0	qemu: Add liburing to qemu build io_uring is a Linux API for asynchronous I/O; support for it was introduced in qemu 5.0. It is designed for better performance than the older aio API. We could leverage this in order to get better storage performance. We should be adding liburing-dev to the qemu build to leverage this feature. However, the liburing-dev package is not available in ubuntu 20.04; it is available in 22.04. Upgrading the ubuntu version in the dockerfile to 22.04 is causing issues in the static qemu build related to libpmem. So instead we are building liburing from source until those build issues are solved. Fixes: #4645 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-07-14 19:21:47 -07:00
Fupan Li	d93e4b939d	container: kill all of the processes in this container When a container terminates, we should make sure there are no processes left after destroying the container. Before this commit, kata-agent depended on the kernel's pidns to destroy all of the processes in a container after process 1 exited in the container. This is true for containers using a separate pidns, but in the case of a shared pidns within the sandbox, the container exit wouldn't trigger termination of the pidns, and some daemon processes would be left in the container, which wasn't expected. Fixes: #4663 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2022-07-14 16:39:49 +08:00
Bin Liu	575b5eb5f5	Merge pull request #4506 from cyyzero/runk-exec runk: Support `exec` sub-command	2022-07-14 14:22:24 +08:00
Bin Liu	9f49f7adca	Merge pull request #4493 from openanolis/runtime-rs-dev runtime-rs: hypervisor part	2022-07-14 13:49:34 +08:00
Quanwei Zhou	3c989521b1	dragonball: update for review update for review Fixes: #3785 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-07-14 10:43:59 +08:00
wllenyj	274598ae56	kata-runtime: add dragonball config check support. add dragonball config check support. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-07-14 10:43:50 +08:00
Chao Wu	1befbe6738	runtime-rs: Cargo lock for fix version problem Cargo lock for fix version problem Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-14 08:49:39 +08:00
Quanwei Zhou	3d6156f6ec	runtime-rs: support dragonball and runtime-binary Fixes: #3785 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com> Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-07-14 08:49:30 +08:00
Zhongtao Hu	3f6123b4dd	libs: update configuration and annotations 1. support annotation for runtime.name, hypervisor_name, agent_name. 2. fix parse memory from annotation Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-07-14 08:49:17 +08:00
Derek Lee	9ae2a45b38	cgroups: remove unnecessary get_paths() Change get_mounts to get paths from a borrowed argument rather than calling get_paths a second time. Fixes #3768 Signed-off-by: Derek Lee <derlee@redhat.com>	2022-07-13 09:17:14 -07:00
Bin Liu	0cc20f014d	Merge pull request #4647 from fidencio/topic/fix-clh-crash-when-booting-up-with-no-network-device clh: Don't crash if no network device is set by the upper layer	2022-07-13 21:28:46 +08:00
Fabiano Fidêncio	418a03a128	Merge pull request #4639 from fidencio/topic/packaging-rework-qemu-build-suffix packaging: Rework how ${BUILD_SUFFIX} is used with the QEMU builder scripts	2022-07-13 15:03:19 +02:00
Fabiano Fidêncio	be31207f6e	clh: Don't crash if no network device is set by the upper layer `ctr` doesn't set a network device when creating the sandbox, which leads to Cloud Hypervisor's driver crashing, see the log below: ``` panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x55641c23b248] goroutine 32 [running]: github.com/kata-containers/kata-containers/src/runtime/virtcontainers.glob..func1(0xc000397900) /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/clh.go:163 +0x128 github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(cloudHypervisor).vmAddNetPut(...) /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/clh.go:1348 github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(cloudHypervisor).bootVM(0xc000397900, {0x55641c76dfc0, 0xc000454ae0}) /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/clh.go:1378 +0x5a2 github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(cloudHypervisor).StartVM(0xc000397900, {0x55641c76dff8, 0xc00044c240}, 0x55641b8016fd) /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/clh.go:659 +0x7ee github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(Sandbox).startVM.func2() /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/sandbox.go:1219 +0x190 github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(LinuxNetwork).Run.func1({0xc0004a8910, 0x3b}) /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/network_linux.go:319 +0x1b github.com/kata-containers/kata-containers/src/runtime/virtcontainers.doNetNS({0xc000048440, 0xc00044c240}, 0xc0005d5b38) 
/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/network_linux.go:1045 +0x163 github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(LinuxNetwork).Run(0xc000150c80, {0x55641c76dff8, 0xc00044c240}, 0xc00014e4e0) /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/network_linux.go:318 +0x105 github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(Sandbox).startVM(0xc000107d40, {0x55641c76dff8, 0xc0005529f0}) /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/sandbox.go:1205 +0x65f github.com/kata-containers/kata-containers/src/runtime/virtcontainers.createSandboxFromConfig({_, _}, {{0x0, 0x0, 0x0}, {0xc000385a00, 0x1, 0x1}, {0x55641d033260, 0x0, ...}, ...}, ...) /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/api.go:91 +0x346 github.com/kata-containers/kata-containers/src/runtime/virtcontainers.CreateSandbox({_, _}, {{0x0, 0x0, 0x0}, {0xc000385a00, 0x1, 0x1}, {0x55641d033260, 0x0, ...}, ...}, ...) /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/api.go:51 +0x150 github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(VCImpl).CreateSandbox(_, {_, _}, {{0x0, 0x0, 0x0}, {0xc000385a00, 0x1, 0x1}, {0x55641d033260, ...}, ...}) /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/implementation.go:35 +0x74 github.com/kata-containers/kata-containers/src/runtime/pkg/katautils.CreateSandbox({_, _}, {_, _}, {{0xc0004806c0, 0x9}, 0xc000140110, 0xc00000f7a0, {0x0, 0x0}, ...}, ...) 
/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/pkg/katautils/create.go:175 +0x8b6 github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2.create({0x55641c76dff8, 0xc0004129f0}, 0xc00034a000, 0xc00036a000) /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2/create.go:147 +0xdea github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2.(service).Create.func2() /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2/service.go:401 +0x32 created by github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2.(service).Create /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2/service.go:400 +0x534 ``` This bug has been introduced as part of the https://github.com/kata-containers/kata-containers/pull/4312 PR, which changed how we add the network device. In order to avoid the crash, let's simply check whether we have a device to be added before iterating the list of network devices. Fixes: #4618 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-07-13 10:40:21 +02:00
Peng Tao	39974fbacc	Merge pull request #4642 from fidencio/topic/clh-bump-to-v25.0-release versions: Update Cloud Hypervisor to v25.0	2022-07-13 16:08:01 +08:00
Fabiano Fidêncio	051181249c	packaging: Add a "-" in the dir name if $BUILD_DIR is available Currently $BUILD_DIR will be used to create a directory as: /opt/kata/share/kata-qemu${BUILD_DIR} It means that when passing a BUILD_DIR, like "foo", a name would be built like /opt/kata/share/kata-qemufoo We should, instead, be building it as /opt/kata/share/kata-qemu-foo. Fixes: #4638 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-07-12 21:27:41 +02:00
Fabiano Fidêncio	dc3b6f6592	versions: Update Cloud Hypervisor to v25.0 Cloud Hypervisor v25.0 has been released on July 7th, 2022, and brings the following changes: ch-remote Improvements The ch-remote command has gained support for creating the VM from a JSON config and support for booting and deleting the VM from the VMM. VM "Coredump" Support Under the guest_debug feature flag it is now possible to extract the memory of the guest for use in debugging with e.g. the crash utility. (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4012) Notable Bug Fixes * Always restore console mode on exit (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4249, https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4248) * Restore vCPUs in numerical order which fixes aarch64 snapshot/restore (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4244) * Don't try and configure IFF_RUNNING on TAP devices (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4279) * Propagate configured queue size through to vhost-user backend (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4286) * Always Program vCPU CPUID before running the vCPU to fix running on Linux 5.16 (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4156) * Enable ACPI MADT "Online Capable" flag for hotpluggable vCPUs to fix newer Linux guest Removals The following functionality has been removed: * The mergeable option from the virtio-pmem support has been removed (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/3968) * The dax option from the virtio-fs support has been removed (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/3889) Fixes: #4641 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-07-12 14:47:58 +00:00
Fabiano Fidêncio	201ff223f6	packaging: Use the $BUILD_SUFFIX when renaming the qemu binary Instead of always naming the binary as "-experimental", let's take advantage of the $BUILD_SUFFIX that's already passed and correctly name the binary according to it. Fixes: #4638 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-07-12 15:09:31 +02:00
Bin Liu	f3335c99ce	Merge pull request #4614 from Tim-0731-Hzt/runtime-rs-merge-main Runtime-rs merge main	2022-07-12 19:25:11 +08:00
Bin Liu	9f0e4bb775	Merge pull request #4628 from fidencio/topic/rework-tee-kernel-builds kernel: Deduplicate code used for building TEE kernels	2022-07-12 17:25:04 +08:00
Bin Liu	b424cf3c90	Merge pull request #4544 from openanolis/anolis/virtio_device_aarch64 runtime-rs: Dragonball-sandbox - add virtio device feature support for aarch64	2022-07-12 12:39:31 +08:00
Fabiano Fidêncio	cda1919a0a	Merge pull request #4609 from fidencio/topic/kata-deploy-simplify-config-path-handling packaging: Simplify config path handling	2022-07-11 23:48:54 +02:00
Fabiano Fidêncio	1a25afcdf5	kernel: Allow passing the URL to download the tarball Passing the URL to be used to download the kernel tarball is useful in various scenarios, mainly when doing a downstream build, thus let's add this new option. This new option also works around a known issue of the Dockerfile used to build the kernel not having `yq` installed. Fixes: #4629 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-07-11 14:23:49 +02:00
snir911	0024b8d10a	Merge pull request #4617 from Yuan-Zhuo/main build: save lines for repository_owner check	2022-07-11 15:04:35 +03:00
Fabiano Fidêncio	80c68b80a8	kernel: Deduplicate code used for building TEE kernels There's no need to have the entire function for building SEV / TDX duplicated. Let's remove those functions and create a `get_tee_kernel()` which takes the TEE as the argument. Fixes: #4627 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-07-11 13:25:17 +02:00
xuejun-xj	d2584991eb	dragonball: fix dependency unused warning Fix the warning "unused import: `dbs_arch::gic::Error as GICError`" and "unused import: `dbs_arch::gic::GICDevice`" in file src/vm/mod.rs when compiling. Fixes: #4544 Signed-off-by: xuejun-xj <jiyunxue@alibaba.linux.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com>	2022-07-11 17:55:04 +08:00
xuejun-xj	458f6f42f6	dragonball: use const string for legacy device type As string "com1", "com2" and "rtc" are used in two files (device_manager/mod.rs and device_manager/legacy.rs), we use public const variables COM1, COM2 and RTC to replace them respectively. Fixes: #4544 Signed-off-by: xuejun-xj <jiyunxue@alibaba.linux.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com>	2022-07-11 17:46:10 +08:00
James O. D. Hunt	58b0fc4794	Merge pull request #4192 from Tim-0731-Hzt/runtime-rs kata 3.0 Architecture	2022-07-11 09:34:17 +01:00
Zhongtao Hu	0826a2157d	Merge remote-tracking branch 'origin/main' into runtime-rs-1 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-07-11 09:47:23 +08:00
Zhongtao Hu	939959e726	docs: add Dragonball to hypervisors Fixes:#4193 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-07-11 09:38:17 +08:00
xuejun-xj	f6f96b8fee	dragonball: add legacy device support for aarch64 Implement RTC device for aarch64. Fixes: #4544 Signed-off-by: xuejun-xj <jiyunxue@alibaba.linux.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com>	2022-07-10 17:35:30 +08:00
xuejun-xj	7a4183980e	dragonball: add device info support for aarch64 Implement generate_virtio_device_info() and get_virtio_mmio_device_info() functions to support the mmio_device_info member, which is used by FDT. Fixes: #4544 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com>	2022-07-10 17:09:59 +08:00
Fabiano Fidêncio	46fd7ce025	Merge pull request #4595 from amshinde/fix-clh-tarball-build Fix clh tarball build	2022-07-08 20:15:30 +02:00
Peng Tao	30da3fb954	Merge pull request #4515 from openanolis/anolis/dragonball-3 runtime-rs: built-in Dragonball sandbox part III - virtio-blk, virtio-fs, virtio-net and VMM API support	2022-07-08 23:14:01 +08:00
Fabiano Fidêncio	f7ccf92dc8	kata-deploy: Rely on the configured config path Instead of passing a `KATA_CONF_FILE` environment variable, let's rely on the configured (in the container engine) config path, as both containerd and CRI-O support it, and we're using this for both of them. Fixes: #4608 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-07-08 15:02:26 +02:00
Fabiano Fidêncio	33360f1710	Merge pull request #4600 from ManaSugi/fix/selinux-hypervisor-config runtime: Fix DisableSelinux config	2022-07-08 13:05:25 +02:00
Fabiano Fidêncio	386a523a05	kata-deploy: Pass the config path to CRI-O As we're already doing for containerd, let's also pass the configuration path to CRI-O, as all the supported CRI-O versions do support this configuration option. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-07-08 12:36:47 +02:00
Yuan-Zhuo	13df57c393	build: save lines for repository_owner check repository_owner check in docs-url-alive-check.yaml now is specified for each step, it can be in job level to save lines. Fixes: #4611 Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>	2022-07-08 10:40:30 +08:00
Bin Liu	f36bc8bc52	Merge pull request #4616 from GabyCT/topic/updatecontainerddoc docs: Update URL links for containerd documentation	2022-07-08 08:49:06 +08:00
Gabriela Cervantes	57c2d8b749	docs: Update URL links for containerd documentation This PR updates some url links related with containerd documentation. Fixes #4615 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-07-07 21:48:18 +00:00
Archana Shinde	e57a1c831e	build: Mark git repos as safe for build This is not an issue when the build is run as a non-privileged user. Marking these as safe in case the build is run as root or some other user. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-07-07 12:11:00 -07:00
GabyCT	ee3f5558ae	Merge pull request #4606 from liubin/fix/4605-delete-cri-containerd-plugin docs: delete CRI containerd plugin statement	2022-07-07 09:35:36 -05:00
Fabiano Fidêncio	c09634dbc7	Merge pull request #4592 from fidencio/revert-kata-deploy-changes-after-2.5.0-rc0-release release: Revert kata-deploy changes after 2.5.0-rc0 release	2022-07-07 08:59:43 +02:00
liubin	2551924bda	docs: delete CRI containerd plugin statement There is no independent CRI containerd plugin for new containerd, the related documentation should be updated too. Fixes: #4605 Signed-off-by: liubin <liubin0329@gmail.com>	2022-07-07 12:06:25 +08:00
Bin Liu	bee7915932	Merge pull request #4533 from bookinabox/simplify-nproc tools/snap: simplify nproc	2022-07-07 11:38:29 +08:00
Chao Wu	9cee52153b	fmt: do cargo fmt and add a dependency for blk_dev fmt: do cargo fmt and add a dependency for blk_dev Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
Chao Wu	47a4142e0d	fs: change vhostuser and virtio into const change fs mode vhostuser and virtio into const. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
Chao Wu	e14e98bbeb	cpu_topo: add handle_cpu_topology function add handle_cpu_topology function to make it easier to understand the set_vm_configuration function. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
Chao Wu	5d3b53ee7b	downtime: add downtime support add downtime support in `resume_all_vcpus_with_downtime` Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
Chao Wu	6a1fe85f10	vfio: add vfio as TODO We add vfio as TODO in this commit and create a github issue for this. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
Chao Wu	5ea35ddcdc	refractor: remove redundant by_id remove redundant by_id in get_vm_by_id_mut and get_vm_by_id. They are optimized to get_vm_mut and get_vm. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
Chao Wu	b646d7cb37	config: remove ht_enabled Since cpu topology could tell whether hyper thread is enabled or not, we removed ht_enabled config from VmConfigInfo Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
Chao Wu	cb54ac6c6e	memory: remove reserve_memory_bytes This is currently an unsupported feature and we will remove it from the current code. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
Chao Wu	bde6609b93	hotplug: add room for other hotplug solution Add room in the code for other hotplug solution without upcall Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
wllenyj	d88b1bf01c	dragonball: update vsock dependency 1. fix vsock device init failed 2. fix VsockDeviceConfigInfo not found Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
Chao Wu	dd003ebe0e	Dragonball: change error name and fix compile error Change error name from `StartMicrovm` to `StartMicroVm`, `StartMicrovmError` to `StartMicroVmError`. Besides, we fix a compile error in config_manager. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
Chao Wu	38957fe00b	UT: fix compile error in unit tests fix compile error in unit tests for DummyConfigInfo. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
wllenyj	11b3f95140	dragonball: add virtio-fs device support Virtio-fs devices are supported. Fixes: #4257 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
wllenyj	948381bdbe	dragonball: add virtio-net device support Virtio-net devices are supported. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
wllenyj	3d20387a25	dragonball: add virtio-blk device support Virtio-blk devices are supported. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-07-07 10:32:35 +08:00
Chao Wu	87d38ae49f	Doc: add document for Dragonball API add detailed explanation for Dragonball API Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-07 10:32:26 +08:00
Zhongtao Hu	2bb1eeaecc	docs: further questions related to upcall add questions and answers for upcall Fixes:#4193 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-07-07 09:52:50 +08:00
Zhongtao Hu	026aaeeccc	docs: add FAQ to the report 1. provide answers for questions that will be frequently asked 2. format the document Fixes:#4193 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-07 09:52:50 +08:00
Christophe de Dinechin	fffcb81652	docs: update the content of the report 1. Explain why the current situation is a problem. 2. We are beyond a simple introduction now, it's a real proposal. 3. Explain why you think it is solid, and fix a grammatical error. 4. The Rust rationale does not really belong to the initial paragraph. Also, I rephrased it to highlight the contrast with Go and the Kata community's past experience switching to Rust for the agent. Fixes:#4193 Signed-off-by: Christophe de Dinechin <christophe@dinechin.org>	2022-07-07 09:52:46 +08:00
Zhongtao Hu	42ea854eb6	docs: kata 3.0 Architecture An introduction for kata 3.0 architecture design Fixes:#4193 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com> Signed-off-by: Christophe de Dinechin <christophe@dinechin.org>	2022-07-07 09:47:07 +08:00
Archana Shinde	efdb92366b	build: Fix clh source build as normal user While running make as non-privileged user, the make errors out with the following message: "INFO: Build cloud-hypervisor enabling the following features: tdx Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/images/create?fromImage=cloudhypervisor%2Fdev&tag=20220524-0": dial unix /var/run/docker.sock: connect: permission denied" Even though the user may be part of docker group, the clh build from source does a docker in docker build. It is necessary for the user of the nested container to be part of docker build for the build to succeed. Fixes #4594 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-07-06 18:28:00 -07:00
Derek Lee	0e40ecf383	tools/snap: simplify nproc Replaces calls of nproc with nproc ${CI:+--ignore 1} to run nproc with one less processing unit than the maximum to prevent DOS-ing the local machine. If process is being run in a container (determined via whether $CI is null), all processing units available will be used. Fixes #3967 Signed-off-by: Derek Lee <derlee@redhat.com>	2022-07-06 15:04:08 -07:00
Chen Yiyang	f59939a31f	runk: Support `exec` sub-command `exec` will execute a command inside a container which exists and is not frozen or stopped. Inside means that the new process shares namespaces and cgroup with the container init process. Command can be specified by `--process` parameter to read from a file, or from other parameters such as arg, env, etc. In order to be compatible with `create`/`run` commands, I refactor libcontainer. `Container` in builder.rs is divided into `InitContainer` and `ActivatedContainer`. `InitContainer` is used for `create`/`run` command. It will load spec from given bundle path. `ActivatedContainer` is used by `exec` command, and will read the container's status file, which stores the spec and `CreateOpt` for creating the rustjail::LinuxContainer. Adapt the spec by replacing the process with given options and updating the namespaces with some paths to join the container. I also rename the `ContainerContext` as `ContainerLauncher`, which is only used to spawn process now. It uses the `LinuxContainer` in rustjail as the runner. For `create`/`run`, the `launch` method will create a new container and run the first process. For `exec`, the `launch` method will spawn a process which joins a container. Fixes #4363 Signed-off-by: Chen Yiyang <cyyzero@qq.com>	2022-07-06 21:11:30 +08:00
Bin Liu	be68cf0712	Merge pull request #4597 from bergwolf/github/action action: revert commit message limit to 150 bytes	2022-07-06 17:13:15 +08:00
Manabu Sugimoto	4d89476c91	runtime: Fix DisableSelinux config Enable Kata runtime to handle `disable_selinux` flag properly in order to be able to change the status by the runtime configuration whether the runtime applies the SELinux label to VMM process. Fixes: #4599 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-07-06 15:50:28 +09:00
Fabiano Fidêncio	ac91fb7a12	Merge pull request #4591 from fidencio/2.5.0-rc0-branch-bump # Kata Containers 2.5.0-rc0	2022-07-06 08:24:14 +02:00
wllenyj	090de2dae2	dragonball: fix the clippy errors. fix clippy errors and do fmt in this PR. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-07-06 11:29:49 +08:00
wllenyj	a1593322bd	dragonball: add vsock api to api server Enables vsock to use the api for device configuration. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-07-06 11:29:49 +08:00
wllenyj	89b9ba8603	dragonball: add set_vm_configuration api Sets the virtual machine configuration. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-07-06 11:29:49 +08:00
wllenyj	95fa0c70c3	dragonball: add start microvm support We add microvm start related support in this pull request. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-06 11:29:49 +08:00
wllenyj	5c1ccc376b	dragonball: add Vmm struct The Vmm struct is global coordinator to manage API servers, virtual machines etc. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-07-06 11:29:49 +08:00
Jiang Liu	4d234f5742	dragonball: refactor code layout Refactored some code layout. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2022-07-06 11:29:49 +08:00
wllenyj	cfd5dae47c	dragonball: add vm struct The vm struct manages resources and controls states of a virtual machine instance. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-07-06 11:29:46 +08:00
wllenyj	527b73a8e5	dragonball: remove unused feature in AddressSpaceMgr log_dirty_pages is useless now and will be redesigned to support live migration in the future. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-07-06 11:28:32 +08:00
Peng Tao	3bafafec58	action: extend commit message line limit to 150 bytes So that we can add more info there and few people use such small terminals nowadays. Fixes: #4596 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-07-06 11:19:08 +08:00
Fabiano Fidêncio	5010c643c4	release: Revert kata-deploy changes after 2.5.0-rc0 release As 2.5.0-rc0 has been released, let's switch the kata-deploy / kata-cleanup tags back to "latest", and re-add the kata-deploy-stable and the kata-cleanup-stable files. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2022-07-05 22:23:49 +02:00
Fabiano Fidêncio	2d29791c19	release: Kata Containers 2.5.0-rc0 - Drop in cfg files support - agent: enhance get handled signal - oci: fix serde skip serializing condition - agent: Run OCI poststart hooks after a container is launched - agent: Replace some libc functions with nix ones - runtime: overwrite mount type to bind for bind mounts - build: Set safe.directory for runtime repo - ci/cd: update check-commit-message - Set safe.directory against tests repository - runtime: delete Console from Cmd type - Add `default_maxmemory` config option - shim: set a non-zero return code if the wait process call failed. - Refactor how hypervisor config validation is handled - packaging: Remove unused kata docker configure script - kata-with-k8s: Add cgroupDriver for containerd - shim: support shim v2 logging plugin - device package cleanup/refactor - versions: Update kernel to latest LTS version 5.15.48 - agent: Allow BUILD_TYPE=debug - Fix clippy warnings and update agent's vendored code - block: Leverage multiqueue for virtio-block - kernel: Add CONFIG_EFI=y as part of the TDX fragments - runtime: Add heuristic to get the right value(s) for mem-reserve - runtime: enable sandbox feature on qemu - snap: fix snap build on ppc64le - packaging: Remove unused publish kata image script - rootfs: Fix chronyd.service failing on boot - tracing: Remove whitespace from root span - workflow: Removing man-db, workflow kept failing - docs: Update outdated URLs and keep them available - runtime: fix error when trying to parse sandbox sizing annotations - snap: Fix debug cli option - deps: Resolve dependabot bumps of containerd, crossbeam-utils, regex - Allow Cloud Hypervisor to run under the `container_kvm_t` - docs: Update containerd url link - agent: refactor reading file timing for debugging - safe-path: fix clippy warning - kernel building: efi_secret module - runtime: Switch to using the rust version of virtiofsd (all arches but powerpc) - shim: change the log level for GetOOMEvent 
call failures - docs: Add more kata monitor details - Allow io.katacontainers.config.hypervisor.enable_iommu annotation by … - versions: Bump virtiofsd to v1.3.0 - docs: Add storage limits to arch doc - docs: Update source for cri-tools - tools: Enable extra detail on error - docs: Add agent-ctl examples section `f4eea832a` release: Adapt kata-deploy for 2.5.0-rc0 `0ddb34a38` oci: fix serde skip serializing condition `fbb2e9bce` agent: Replace some libc functions with nix ones `acd3302be` agent: Run OCI poststart hooks after a container is launched `1f363a386` runtime: overwrite mount type to bind for bind mounts `4e48509ed` build: Set safe.directory for runtime repo `48ccd4233` ci: Set safe.directory against tests repository `2a4fbd6d8` agent: enhance get handled signal `433816cca` ci/cd: update check-commit-message `a5a25ed13` runtime: delete Console from Cmd type `96553e8bd` runtime: Add documentation of drop-in config file fragments `c656457e9` runtime: Add tests of drop-in config file decoding `99f5ca80f` runtime: Plug drop-in decoding into decodeConfig() `0f9856c46` runtime: Scan drop-in directory, read files and decode them `2c1efcc69` runtime: Add helpers to copy fields between tomlConfig instances `20f11877b` runtime: Add framework to manipulate config structs via reflection `ab5f1c956` shim: set a non-zero return code if the wait process call failed. 
`e5be5cb08` runtime: device: cleanup outdated comments `5f936f268` virtcontainers: config validation is host specific `323271403` virtcontainers: Remove unused function `0939f5181` config: Expose default_maxmemory `58ff2bd5c` clh,qemu: Adapt to using default_maxmemory `1a78c3df2` packaging: Remove unused kata docker configure script `afdc96042` hypervisor: Add default_maxmemory configuration `4e30e11b3` shim: support shim v2 logging plugin `bdf5e5229` virtcontainers: validate hypervisor config outside of hypervisor itself `469e09854` katautils: don't do validation when loading hypervisor config `e32bf5331` device: deduplicate state structures `f97d9b45c` runtime: device/persist: drop persist dependency from device pkgs `f9e96c650` runtime: device: move to top level package `3880e0c07` agent: refactor reading file timing for debugging `c70d3a2c3` agent: Update the dependencies `612fd79ba` random: Fix "nonminimal-bool" clippy warning `d4417f210` netlink: Fix "or-fun-call" clippy warnings `93874cb3b` packaging: Restrict kernel patches applied to top-level dir `07b1367c2` versions: Update kernel to latest LTS version 5.15.48 `1b7d36fdb` agent: Allow BUILD_TYPE=debug `9ff10c083` kernel: Add CONFIG_EFI=y as part of the TDX fragments `e227b4c40` block: Leverage multiqueue for virtio-block `e7e7dc9df` runtime: Add heuristic to get the right value(s) for mem-reserve `c7dd10e5e` packaging: Remove unused publish kata image script `0bbbe7068` snap: fix snap build on ppc64le `ef925d40c` runtime: enable sandbox feature on qemu `28995301b` tracing: Remove whitespace from root span `9941588c0` workflow: Removing man-db, workflow kept failing `90a7763ac` snap: Fix debug cli option `a305bafee` docs: Update outdated URLs and keep them available `bee770343` docs: Update containerd url link `ac5dbd859` clh: Improve logging related to the net dev addition `0b75522e1` network: Set queues to 1 to ensure we get the network fds `93b61e0f0` network: Add FFI_NO_PI to the netlink flags 
`bf3ddc125` clh: Pass the tuntap fds down to Cloud Hypervisor `55ed32e92` clh: Take care of the VmAdNetdPut request ourselves `01fe09a4e` clh: Hotplug the network devices `2e0753833` clh: Expose VmAddNetPut `1ef0b7ded` runtime: Switch to using the rust version of virtiofsd (all but power) `bb26bd73b` safe-path: fix clippy warning `1a5ba31cb` agent: refactor reading file timing for debugging `721ca72a6` runtime: fix error when trying to parse sandbox sizing annotations `9773838c0` virtiofsd: export env vars needed for building it `b0e090f40` versions: Bump virtiofsd to v1.3.0 `db5048d52` kernel: build efi_secret module for SEV `1b845978f` docs: Add storage limits to arch doc `412441308` docs: Add more kata monitor details `eff4e1017` shim: change the log level for GetOOMEvent call failures `5d7fb7b7b` build(deps): bump github.com/containerd/containerd in /src/runtime `d0ca2fcbb` build(deps): bump crossbeam-utils in /src/tools/trace-forwarder `a60dcff4d` build(deps): bump regex from 1.5.4 to 1.5.6 in /src/tools/agent-ctl `dbf50672e` build(deps): bump crossbeam-utils in /src/tools/agent-ctl `8e2847bd5` build(deps): bump crossbeam-utils from 0.8.6 to 0.8.8 in /src/libs `e9ada165f` build(deps): bump regex from 1.5.4 to 1.5.5 in /src/agent `adad9cef1` build(deps): bump crossbeam-utils from 0.8.5 to 0.8.8 in /src/agent `34bcef884` docs: Add agent-ctl examples section `815157bf0` docs: Remove erroneous whitespace `f5099620f` tools: Enable extra detail on error `8f10e13e0` config: Allow enable_iommu pod annotation by default `7ae11cad6` docs: Update source for cri-tools `0e2459d13` docs: Add cgroupDriver for containerd `1b7fd19ac` rootfs: Fix chronyd.service failing on boot Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2022-07-05 22:23:05 +02:00
Fabiano Fidêncio	f4eea832a1	release: Adapt kata-deploy for 2.5.0-rc0 kata-deploy files must be adapted to a new release. The cases where it happens are when the release goes from -> to: * main -> stable: * kata-deploy-stable / kata-cleanup-stable: are removed * stable -> stable: * kata-deploy / kata-cleanup: bump the release to the new one. There are no changes when doing an alpha release, as the files on the "main" branch always point to the "latest" and "stable" tags. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2022-07-05 22:23:05 +02:00
Fabiano Fidêncio	071dd4c790	Merge pull request #4109 from pmores/drop-in-cfg-files-support Drop in cfg files support	2022-07-05 22:21:24 +02:00
Peng Tao	514b4e7235	Merge pull request #4543 from openanolis/anolis/add_vcpu_configure_aarch64 runtime-rs: Dragonball sandbox - add Vcpu::configure() function for aarch64	2022-07-05 17:47:40 +08:00
Bin Liu	d9e868f44e	Merge pull request #4479 from quanweiZhou/enhance-get-handled-signal agent: enhance get handled signal	2022-07-05 15:18:21 +08:00
Bin Liu	b33ad7e57a	Merge pull request #4574 from jelipo/fix-serde-serializing oci: fix serde skip serializing condition	2022-07-05 13:51:43 +08:00
Bin Liu	0189738283	Merge pull request #4576 from ManaSugi/fix/oci-poststart-hook agent: Run OCI poststart hooks after a container is launched	2022-07-05 11:08:49 +08:00
Peng Tao	cd2d8c6fe2	Merge pull request #4580 from ManaSugi/fix/replace-libc-with-nix agent: Replace some libc functions with nix ones	2022-07-05 10:53:42 +08:00
Peng Tao	a1de394e51	Merge pull request #4550 from liubin/fix/4548-overwrite-mount-type-for-bind-mount runtime: overwrite mount type to bind for bind mounts	2022-07-04 19:56:26 +08:00
Peng Tao	44ec9684d8	Merge pull request #4573 from amshinde/unsafe-repo-runtime-shimv2 build: Set safe.directory for runtime repo	2022-07-04 19:51:00 +08:00
haining.cao	0ddb34a38d	oci: fix serde skip serializing condition There is an extra space on the serde serialization condition. Fixes: #4578 Signed-off-by: haining.cao <haining.cao@daocloud.io>	2022-07-04 16:16:04 +08:00
xuejun-xj	7120afe4ed	dragonball: add vcpu test function for aarch64 add create_vcpu() function in vcpu test unit for aarch64 Fixes: #4445 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com>	2022-07-04 15:23:43 +08:00
xuejun-xj	648d285a24	dragonball: add vcpu support for aarch64 add configure() function for aarch64 vcpu Fixes: #4543 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com>	2022-07-04 15:23:37 +08:00
xuejun-xj	7dad7c89f3	dragonball: update dbs-xxx dependency change to up-to-date commit ID Fixes: #4543 Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com>	2022-07-04 15:23:11 +08:00
Manabu Sugimoto	fbb2e9bce9	agent: Replace some libc functions with nix ones Replace `libc::setgroups()`, `libc::fchown()`, and `libc::sethostname()` functions with nix crate ones for safety and maintainability. Fixes: #4579 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-07-04 14:49:38 +09:00
Manabu Sugimoto	acd3302bef	agent: Run OCI poststart hooks after a container is launched The OCI `poststart` hooks must be called after the user-specified process is executed but before the `start` operation returns, in accordance with the OCI runtime spec. Fixes: #4575 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-07-03 18:03:51 +09:00
GabyCT	635fa543a3	Merge pull request #4560 from bookinabox/update-commit-message-check ci/cd: update check-commit-message	2022-07-01 11:30:03 -05:00
James O. D. Hunt	59cab9e835	Merge pull request #4380 from Tim-0731-Hzt/rund/makefile runtime-rs: makefile for dragonball	2022-07-01 09:12:38 +01:00
liubin	1f363a386c	runtime: overwrite mount type to bind for bind mounts Some clients like nerdctl may pass a mount type of none for volumes/bind mounts, which causes the container start to fail. Following runc, overwrite the mount type to bind and ignore the input value. Fixes: #4548 Signed-off-by: liubin <liubin0329@gmail.com>	2022-07-01 12:13:01 +08:00
Archana Shinde	4e48509ed9	build: Set safe.directory for runtime repo While doing a docker build for shim-v2, we see this: ``` fatal: unsafe repository ('/home/${user}/go/src/github.com/kata-containers/kata-containers' is owned by someone else) To add an exception for this directory, call: git config --global --add safe.directory /home/${user}/go/src/github.com/kata-containers/kata-containers ``` This is because the docker container build is run as root while the runtime repo is checked out as normal user. Unlike this error causing the rootfs build to error out, the error here does not really cause `make shim-v2-tarball` to fail. However, it's good to get rid of this error message showing during the make process. Fixes: #4572 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-06-30 20:52:44 -07:00
Bin Liu	18093251ec	Merge pull request #4527 from Tim-0731-Hzt/rund-new/netlink runtime-rs:refactor network model with netlink	2022-07-01 11:12:54 +08:00
Archana Shinde	c29038a2e2	Merge pull request #4562 from ManaSugi/git-safe-repo Set safe.directory against tests repository	2022-06-30 16:13:15 -07:00
GabyCT	02a51e75a7	Merge pull request #4554 from liubin/fix/delete-not-used-console-from-container-config runtime: delete Console from Cmd type	2022-06-30 11:40:07 -05:00
Fabiano Fidêncio	aa561b49f5	Merge pull request #4540 from fidencio/topic/default_maxmemory Add `default_maxmemory` config option	2022-06-30 12:08:15 +02:00
Manabu Sugimoto	48ccd42339	ci: Set safe.directory against tests repository Set `safe.directory` against `kata-containers/tests` repository before checkout because the user in the docker container is root, but the `tests` repository on the host machine is usually owned by the normal user. This works when we already have the `tests` repository which is not owned by root on the host machine and try to create a rootfs using Docker (`USE_DOCKER=true`). Fixes: #4561 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-06-30 17:36:29 +09:00
quanweiZhou	2a4fbd6d8c	agent: enhance get handled signal For runC, send the signal to the init process directly. For kata, we try to send `SIGKILL` instead of `SIGTERM` when the process has not installed a handler for `SIGTERM`. The `is_signal_handled` function determines whether the container process handles a given signal, but currently it only checks caught signals (SigCgt). When the container process ignores (SigIgn) or blocks (SigBlk) `SIGTERM`, the signal should also not be converted to `SIGKILL`. For example, when using terminationGracePeriodSeconds, k8s sends `SIGTERM` first and then `SIGKILL`; if the container ignores `SIGTERM`, we should send `SIGTERM`, not `SIGKILL`, to the container. Fixes: #4478 Signed-off-by: quanweiZhou <quanweiZhou@linux.alibaba.com>	2022-06-30 14:44:46 +08:00
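The decision described in this entry can be sketched as follows. This is a minimal illustration in Python, not the agent's Rust code: it assumes the masks are read from the `SigBlk`, `SigIgn` and `SigCgt` lines of `/proc/<pid>/status`, where signal number n is bit n-1 of the hexadecimal mask. The helper names `_mask` and `signal_to_send` are hypothetical, and this `is_signal_handled` only mirrors the name used in the message.

```python
SIGTERM = 15
SIGKILL = 9


def _mask(status_text, field):
    """Return the hex bitmask stored in one field of a /proc/<pid>/status dump."""
    for line in status_text.splitlines():
        if line.startswith(field + ":"):
            return int(line.split(":", 1)[1].strip(), 16)
    return 0  # field missing: treat as an empty mask


def is_signal_handled(status_text, signum):
    """True if the process catches, ignores, or blocks the signal."""
    bit = 1 << (signum - 1)  # signal n is stored in bit n-1
    return any(_mask(status_text, f) & bit for f in ("SigCgt", "SigIgn", "SigBlk"))


def signal_to_send(status_text, requested):
    """Convert SIGTERM to SIGKILL only when the process does nothing with SIGTERM."""
    if requested == SIGTERM and not is_signal_handled(status_text, SIGTERM):
        return SIGKILL
    return requested
```

With this shape, a process that ignores `SIGTERM` still receives `SIGTERM`, so a later `SIGKILL` from Kubernetes after the grace period behaves as the user expects.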
Derek Lee	433816cca2	ci/cd: update check-commit-message Recently added check-commit-message to the tests repository. Minor changes were also made to action. For consistency's sake, copied changes over to here as well. tests - https://github.com/kata-containers/tests/pull/4878 Minor Changes: 1. Body length check is now 75 and consistent with guidelines 2. Lines without spaces are not counted in body length check Fixes #4559 Signed-off-by: Derek Lee <derlee@redhat.com>	2022-06-29 16:55:43 -07:00
GabyCT	2a94261df5	Merge pull request #4549 from liubin/fix/4419-set-status-if-wait-process-failed shim: set a non-zero return code if the wait process call failed.	2022-06-29 17:04:53 -05:00
Fabiano Fidêncio	1e12d56512	Merge pull request #4469 from egernst/config-validation-refactor Refactor how hypervisor config validation is handled	2022-06-29 14:42:11 +02:00
liubin	a5a25ed13d	runtime: delete Console from Cmd type There is much code related to this property, but it is not used anymore. Fixes: #4553 Signed-off-by: liubin <liubin0329@gmail.com>	2022-06-29 17:36:32 +08:00
Pavel Mores	96553e8bd2	runtime: Add documentation of drop-in config file fragments Added user manual for the drop-in config file fragments feature. Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-06-29 10:56:53 +02:00
Pavel Mores	c656457e90	runtime: Add tests of drop-in config file decoding The tests ensure that interactions between drop-ins and the base configuration.toml and among drop-ins themselves work as intended, basically that files are evaluated in the correct order (base file first, then drop-ins in alphabetical order) and the last one to set a specific key wins. Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-06-29 09:54:39 +02:00
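The evaluation order these tests check (base file first, then drop-ins in alphabetical order, last writer of a key wins) can be sketched as below. This is an illustrative Python model over already-parsed dictionaries, not the runtime's Go implementation; `deep_merge` and `apply_dropins` are hypothetical names, the file names and keys in the usage note are examples only, and Python's `sorted` (code point order) stands in for "alphabetical".

```python
import copy


def deep_merge(target, overlay):
    """Merge overlay into target in place; nested tables merge key by key."""
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            deep_merge(target[key], value)
        else:
            target[key] = copy.deepcopy(value)
    return target


def apply_dropins(base, dropins):
    """Apply drop-ins (file name -> parsed content) on top of base, sorted by name."""
    merged = copy.deepcopy(base)
    for name in sorted(dropins):
        deep_merge(merged, dropins[name])
    return merged
```

For example, if `10-a.toml` and `20-b.toml` both set the same key under `hypervisor.qemu`, the value from `20-b.toml` is kept, while keys set only in the base file are preserved.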
Pavel Mores	99f5ca80fc	runtime: Plug drop-in decoding into decodeConfig() Fixes #4108 Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-06-29 09:54:38 +02:00
Pavel Mores	0f9856c465	runtime: Scan drop-in directory, read files and decode them updateFromDropIn() uses the infrastructure built by previous commits to ensure no contents of 'tomlConfig' are lost during decoding. To do this, we preserve the current contents of our tomlConfig in a clone and decode a drop-in into the original. At this point, the original instance is updated but its Agent and/or Hypervisor fields are potentially damaged. To merge, we update the clone's Agent/Hypervisor from the original instance. Now the clone has the desired Agent/Hypervisor and the original instance has the rest, so to finish, we just need to move the clone's Agent/Hypervisor to the original. Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-06-29 09:54:38 +02:00
Pavel Mores	2c1efcc697	runtime: Add helpers to copy fields between tomlConfig instances These functions take a TOML key - an array of individual components, e.g. ["agent" "kata" "enable_tracing"], as returned by BurntSushi - and two 'tomlConfig' instances. They copy the value of the struct field identified by the key from the source instance to the target one if necessary. This is only done if the TOML key points to structures stored in maps by 'tomlConfig', i.e. 'hypervisor' and 'agent'. Nothing needs to be done in other cases. Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-06-29 09:54:38 +02:00
Pavel Mores	20f11877be	runtime: Add framework to manipulate config structs via reflection For 'tomlConfig' substructures stored in Golang maps - 'hypervisor' and 'agent' - BurntSushi doesn't preserve their previous contents as it does for substructures stored directly (e.g. 'runtime'). We use reflection to work around this. This commit adds three primitive operations to work with struct fields identified by their `toml:"..."` tags - one to get a field value, one to set a field value and one to assign a source struct field value to the corresponding field of a target. Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-06-29 09:54:38 +02:00
liubin	ab5f1c9564	shim: set a non-zero return code if the wait process call failed. Return code is an int32 type, so if an error occurred, the default value may be zero, and this value will be treated as a normal exit code. Setting the return code to 255 lets the caller (for example Kubernetes) know that there are some problems with the pod/container. Fixes: #4419 Signed-off-by: liubin <liubin0329@gmail.com>	2022-06-29 12:33:32 +08:00
Zhongtao Hu	07231b2f3f	runtime-rs:refactor network model with netlink add unit test for tcfilter Fixes: #4289 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-29 11:38:23 +08:00
Zhongtao Hu	c8a9052063	build: format files add Enter at the end of the file Fixes: #4379 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-29 11:19:10 +08:00
Zhongtao Hu	242992e3de	build: put install methods in utils.mk put install methods in utils.mk to avoid duplication Fixes: #4379 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-29 11:19:10 +08:00
Zhongtao Hu	8a697268d0	build: makefile for dragonball config use makefile to generate dragonball config file Fixes: #4379 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-29 11:19:07 +08:00
Zhongtao Hu	9c526292e7	runtime-rs:refactor network model with netlink refactor tcfilter with netlink Fixes: #4289 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-29 11:03:29 +08:00
Eric Ernst	e5be5cb086	runtime: device: cleanup outdated comments Prior device config move didn't update the comments. Let's address this, and make sure comments match the new path... Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-06-28 18:22:28 -07:00
Eric Ernst	5f936f268f	virtcontainers: config validation is host specific Ideally this config validation would be in a separate package (katautils?), but that would introduce circular dependency since we'd call it from vc, and it depends on vc types (which, shouldn't be vc, but probably a hypervisor package instead). Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-06-28 18:22:28 -07:00
Fabiano Fidêncio	323271403e	virtcontainers: Remove unused function While working on the previous commits, some of the functions become non-used. Let's simply remove them. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-28 21:19:24 +02:00
Fabiano Fidêncio	0939f5181b	config: Expose default_maxmemory Expose the newly added `default_maxmemory` to the project's Makefile and to the configuration files. Fixes: #4516 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-28 21:19:24 +02:00
Fabiano Fidêncio	58ff2bd5c9	clh,qemu: Adapt to using default_maxmemory Let's adapt Cloud Hypervisor's and QEMU's code to properly behave to the newly added `default_maxmemory` config. While implementing this, a change of behaviour (or a bug fix, depending on how you see it) has been introduced: if a pod requests more memory than the amount available in the host, instead of failing to start the pod, we simply hotplug the maximum amount of memory available, better mimicking the runc behaviour. Fixes: #4516 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-28 21:19:24 +02:00
Fabiano Fidêncio	ad055235a5	Merge pull request #4547 from GabyCT/topic/removeunuseddocker packaging: Remove unused kata docker configure script	2022-06-28 20:09:15 +02:00
GabyCT	b2c0387993	Merge pull request #4130 from surajssd/add-cgroup-driver-info kata-with-k8s: Add cgroupDriver for containerd	2022-06-28 10:30:18 -05:00
GabyCT	12c1b9e6d6	Merge pull request #4536 from Tim-0731-Hzt/runtime-rs-kata-main runtime-rs: Merge Main into runtime-rs branch	2022-06-28 10:27:35 -05:00
Gabriela Cervantes	1a78c3df2e	packaging: Remove unused kata docker configure script This PR removes an unused kata configure docker script which was used in packaging for kata 1.x but is no longer used in kata 2.x Fixes #4546 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-06-28 15:10:39 +00:00
Zhongtao Hu	f3907aa127	runtime-rs:Merge remote-tracking branch 'origin/main' into runtime-rs-newv Fixes:#4536 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-28 20:58:40 +08:00
Bin Liu	badbbcd8be	Merge pull request #4400 from openanolis/anolis/dragonball-2 runtime-rs: built-in Dragonball sandbox part II - vCPU manager	2022-06-28 20:41:36 +08:00
Tim Zhang	916ffb75d7	Merge pull request #4432 from liubin/fix/4420-binary-log shim: support shim v2 logging plugin	2022-06-28 16:29:07 +08:00
Fabiano Fidêncio	afdc960424	hypervisor: Add default_maxmemory configuration Let's add a `default_maxmemory` configuration, which allows the admins to set the maximum amount of memory to be used by a VM, considering the initial amount + whatever ends up being hotplugged via the pod limits. By default this value is 0 (zero), and it means that the whole physical RAM is the limit. Fixes: #4516 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-28 08:32:15 +02:00
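A small sketch of the semantics described in this entry, where 0 means "the whole physical RAM is the limit". It is written in Python for illustration only and is not the runtime's Go code; the function names, the MiB units, and the clamping of an over-large setting or hotplug request are assumptions.

```python
def effective_max_memory_mib(default_maxmemory_mib, host_memory_mib):
    """Resolve the VM memory ceiling; 0 means the host's physical RAM is the limit."""
    if default_maxmemory_mib == 0 or default_maxmemory_mib > host_memory_mib:
        # Assumption: a setting above physical RAM is capped at physical RAM.
        return host_memory_mib
    return default_maxmemory_mib


def hotplug_size_mib(requested_mib, current_mib, max_mib):
    """Clamp a hotplug request so initial + hotplugged memory stays within max_mib."""
    room = max(max_mib - current_mib, 0)
    return min(max(requested_mib, 0), room)
```

Under these assumptions, a VM started with 2048 MiB and a 4096 MiB ceiling can hotplug at most 2048 MiB more, however large the pod limit is.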
Bin Liu	4e30e11b31	shim: support shim v2 logging plugin Now kata shim only supports stdout/stderr of fifo from containerd/CRI-O, but shim v2 supports logging plugins, and nerdctl by default uses the binary schema for logs. This commit adds the other types of log plugins: - file - binary In case of binary, kata shim will receive a stdout/stderr like: binary:///nerdctl?_NERDCTL_INTERNAL_LOGGING=/var/lib/nerdctl/1935db59 That means the nerdctl process will handle the logs (stdout/stderr) Fixes: #4420 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-06-28 13:54:22 +08:00
Eric Ernst	bdf5e5229b	virtcontainers: validate hypervisor config outside of hypervisor itself Depending on the user of it, the hypervisor from hypervisor interface could have a differing view on what is valid or not. To help decouple, let's instead check the hypervisor config validity as part of the sandbox creation, rather than as part of the CreateVM call within the hypervisor interface implementation. Fixes: #4251 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-06-27 11:53:41 -07:00
Eric Ernst	469e098543	katautils: don't do validation when loading hypervisor config Policy for what's valid/invalid within the config varies by VMM, host, and by silicon architecture. Let's keep katautils simple for just translating a toml to the hypervisor config structure, and leave validation to virtcontainers. Without this change, we're doing duplicate validation. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-06-27 10:13:26 -07:00
Chao Wu	71db2dd5b8	hotplug: add room for future acpi hotplug mechanism In order to support ACPI hotplug in the future with the cooperative work from the Kata community, we add ACPI feature and dbs-upcall feature to add room for ACPI hotplug. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-27 21:52:36 +08:00
Zizheng Bian	8bb00a3dc8	dragonball: fix a bug when generating kernel boot args We should refuse to generate boot args when hotplugging, not cold starting. Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>	2022-06-27 18:12:50 +08:00
Chao Wu	2aedd4d12a	doc: add document for vCPU, api and device Create the document for vCPU and api. Add some detail in the device document. Fixes: #4257 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-27 18:12:50 +08:00
wllenyj	bec22ad01f	dragonball: add api module It is used to define the vmm communication interface. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-06-27 18:12:50 +08:00
wllenyj	07f44c3e0a	dragonball: add vcpu manager Manage vcpu related operations. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-06-27 18:12:48 +08:00
wllenyj	78c9718752	dragonball: add upcall support Upcall is a direct communication tool between VMM and guest developed upon vsock. It is used to implement device hotplug. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>	2022-06-27 17:04:47 +08:00
wllenyj	7d1953b52e	dragonball: add vcpu Virtual CPU manager for virtual machines. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-06-27 17:04:42 +08:00
wllenyj	468c73b3cb	dragonball: add kvm context KVM operation context for virtual machines. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-06-27 16:02:06 +08:00
Bin Liu	27b1bb5ed9	Merge pull request #4467 from egernst/device-pkg device package cleanup/refactor	2022-06-27 14:40:53 +08:00
Eric Ernst	e32bf53318	device: deduplicate state structures Before, we maintained almost identical structures between our persist API and what we keep for our devices, with the persist API being a slight subset of device structures. Let's deduplicate this, now that persist is importing device package. Json unmarshal of prior persist structure will work fine, since it was an exact subset of fields. Fixes: #4468 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-06-26 21:31:29 -07:00
Eric Ernst	f97d9b45c8	runtime: device/persist: drop persist dependency from device pkgs Rather than have device package depend on persist, let's define the (almost duplicate) structures within device itself, and have the Kata Container's persist pkg import these. This'll help avoid unnecessary dependencies within our core packages. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-06-26 21:31:29 -07:00
Eric Ernst	f9e96c6506	runtime: device: move to top level package Let's move device package to runtime/pkg instead of being buried under virtcontainers. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-06-26 21:31:29 -07:00
Bin Liu	3880e0c077	agent: refactor reading file timing for debugging The original code reads the mountstats file and returns its content in the error, but by that time the file may have changed. We should instead return the file content that was parsed line by line, to check why there is no fstype option. Fixes: #4246 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-06-26 21:27:43 -07:00
Archana Shinde	2488a0f6c0	Merge pull request #4439 from amshinde/update-kernel-to-5.15.46 versions: Update kernel to latest LTS version 5.15.48	2022-06-24 11:03:32 -07:00
Fabiano Fidêncio	083ca5f217	Merge pull request #4505 from yoheiueda/agent-debug-build agent: Allow BUILD_TYPE=debug	2022-06-24 14:04:23 +02:00
Fabiano Fidêncio	03fca8b459	Merge pull request #4526 from fidencio/topic/fix-clippy-warnings-and-update-agent-vendored-code Fix clippy warnings and update agent's vendored code	2022-06-24 14:02:28 +02:00
Fabiano Fidêncio	c70d3a2c35	agent: Update the dependencies Let's run a `cargo update` and ensure the deps are up-to-date before we cut the "-rc0" release. Fixes: #4525 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-24 11:37:25 +02:00
Fabiano Fidêncio	612fd79bae	random: Fix "nonminimal-bool" clippy warning The error shown below was caught during a dependency bump in the CCv0 branch, but we better fix it here first. ``` error: this boolean expression can be simplified --> src/random.rs:85:21 \| 85 \| assert!(!ret.is_ok()); \| ^^^^^^^^^^^^ help: try: `ret.is_err()` \| = note: `-D clippy::nonminimal-bool` implied by `-D warnings` = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#nonminimal_bool error: this boolean expression can be simplified --> src/random.rs:93:17 \| 93 \| assert!(!ret.is_ok()); \| ^^^^^^^^^^^^ help: try: `ret.is_err()` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#nonminimal_bool ``` Fixes: #4523 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-24 11:37:05 +02:00
Fabiano Fidêncio	d4417f210e	netlink: Fix "or-fun-call" clippy warnings The error shown below was caught during a dependency bump in the CCv0 branch, but we better fix it here first. ``` error: use of `ok_or` followed by a function call --> src/netlink.rs:526:14 \| 526 \| .ok_or(anyhow!(nix::Error::EINVAL))?; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try this: `ok_or_else(\|\| anyhow!(nix::Error::EINVAL))` \| = note: `-D clippy::or-fun-call` implied by `-D warnings` = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#or_fun_call error: use of `ok_or` followed by a function call --> src/netlink.rs:615:49 \| 615 \| let v = u8::from_str_radix(split.next().ok_or(anyhow!(nix::Error::EINVAL))?, 16)?; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try this: `ok_or_else(\|\| anyhow!(nix::Error::EINVAL))` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#or_fun_call ``` Fixes: #4523 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-24 11:37:01 +02:00
Archana Shinde	93874cb3bb	packaging: Restrict kernel patches applied to top-level dir The apply_patches.sh script applies all patches in the patches directory, as well as subdirectories. This means if there is a sub-dir called "experimental" under a major kernel version directory, experimental patches would be applied to the default kernel supported by Kata. We did not come across this issue earlier as typically the experimental kernel version was different from the default kernel. With both the default kernel and the arm-experimental kernel having the same major kernel version (5.15.x) at this time, trying to update the kernel patch version revealed that arm-experimental patches were being applied to the default kernel. Restricting the patches to be applied to the top level directory will solve the issue. The apply_patches script should ignore any sub-directories meant for experimental patches. Fixes #4520 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-06-23 10:43:52 -07:00
Archana Shinde	07b1367c2b	versions: Update kernel to latest LTS version 5.15.48 This brings in a few security fixes. Removing arm patches related to virtio-mem that are no longer required as they have been merged. Fixes #4438 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-06-23 10:43:52 -07:00
Fabiano Fidêncio	133528dd14	Merge pull request #4503 from amshinde/multi-queue-block block: Leverage multiqueue for virtio-block	2022-06-23 12:17:11 +02:00
Fabiano Fidêncio	f186a52b16	Merge pull request #4511 from fidencio/topic/add-config-efi-to-the-tdx-kernel kernel: Add CONFIG_EFI=y as part of the TDX fragments	2022-06-23 12:15:30 +02:00
Yohei Ueda	1b7d36fdb0	agent: Allow BUILD_TYPE=debug The cargo command creates debug build binaries, when the --release option is not specified. Specifying --debug option causes an error. This patch specifies --release option when BUILD_TYPE=release, and does not specify any build type option when BUILD_TYPE=debug. Fixes #4504 Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>	2022-06-23 13:54:32 +09:00
Fabiano Fidêncio	9ff10c0830	kernel: Add CONFIG_EFI=y as part of the TDX fragments Otherwise `./build-kernel.sh -x tdx setup` will fail with the following error: ``` $ ./build-kernel.sh -x tdx setup INFO: Config version: 92 INFO: Kernel version: tdx-guest-v5.15-4 INFO: kernel path does not exist, will download kernel INFO: Apply patches from /home/ffidenci/go/src/github.com/kata-containers/kata-containers/tools/packaging/kernel/patches/tdx-guest-v5.15-4.x INFO: Found 0 patches INFO: Enabling config for 'tdx' confidential guest protection INFO: Constructing config from fragments: /home/ffidenci/go/src/github.com/kata-containers/kata-containers/tools/packaging/kernel/configs/fragments/x86_64/.config WARNING: unmet direct dependencies detected for UNACCEPTED_MEMORY Depends on [n]: EFI [=n] && EFI_STUB [=n] Selected by [y]: - INTEL_TDX_GUEST [=y] && HYPERVISOR_GUEST [=y] && X86_64 [=y] && CPU_SUP_INTEL [=y] && PARAVIRT [=y] && SECURITY [=y] && X86_X2APIC[=y] INFO: Some CONFIG elements failed to make the final .config: INFO: Value requested for CONFIG_EFI_STUB not in final .config INFO: Generated config file can be found in /home/ffidenci/go/src/github.com/kata-containers/kata-containers/tools/packaging/kernel/configs/fragments/x86_64/.config ERROR: Failed to construct requested .config file ERROR: failed to find default config ``` Fixes: #4510 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-22 15:21:30 +02:00
Fabiano Fidêncio	78e27de6c3	Merge pull request #4358 from zvonkok/memreserve runtime: Add heuristic to get the right value(s) for mem-reserve	2022-06-22 13:41:23 +02:00
Archana Shinde	e227b4c404	block: Leverage multiqueue for virtio-block Similar to network, we can use multiple queues for virtio-block devices. This can help improve storage performance. This commit changes the number of queues for block devices to the number of cpus for cloud-hypervisor and qemu. Today the default number of cpus a VM starts with is 1. Hence the queues used will be 1. This change will help improve performance when the default cold-plugged cpus is greater than one by changing this in the config file. This may also help when we use the sandboxing feature with k8s that passes down the sum of the resources required down to Kata. Fixes #4502 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-06-21 12:38:53 -07:00
Eric Ernst	72049350ae	Merge pull request #4288 from fengwang666/enable-qemu-sandbox runtime: enable sandbox feature on qemu	2022-06-21 09:22:26 -07:00
GabyCT	8eac22ac53	Merge pull request #4495 from Amulyam24/snap-fix snap: fix snap build on ppc64le	2022-06-21 09:21:23 -05:00
Zvonko Kaiser	e7e7dc9dfe	runtime: Add heuristic to get the right value(s) for mem-reserve Fixes: #2938 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2022-06-21 03:44:28 -07:00
Bin Liu	e422730c7f	Merge pull request #4497 from GabyCT/topic/removeunusedref packaging: Remove unused publish kata image script	2022-06-21 17:46:45 +08:00
James O. D. Hunt	e11fcf7d3c	Merge pull request #4168 from Champ-Goblem/patch/fix-chronyd-failure-on-boot rootfs: Fix chronyd.service failing on boot	2022-06-21 09:43:13 +01:00
Gabriela Cervantes	c7dd10e5ed	packaging: Remove unused publish kata image script This PR removes the unused publish kata image script, which was used on kata 1.x when we had OBS packages, which are no longer used on kata 2.x Fixes #4496 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-06-20 14:43:39 +00:00
Amulyam24	0bbbe70687	snap: fix snap build on ppc64le Fixes the syntax error while building rustdeps. Fixes: #4494 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2022-06-20 19:26:27 +05:30
Fabiano Fidêncio	6fd40085ef	Merge pull request #4484 from cmaf/tracing-update-rootspan-name tracing: Remove whitespace from root span	2022-06-20 08:37:45 +02:00
Fupan Li	98f041ed8e	Merge pull request #4486 from openanolis/runtime-rs-merge-main runtime-rs: runtime-rs merge main	2022-06-20 13:52:14 +08:00
Bin Liu	2c1b68d6e4	Merge pull request #4481 from zvonkok/fix-action workflow: Removing man-db, workflow kept failing	2022-06-20 11:10:48 +08:00
Chao Wu	86123f49f2	Merge branch 'main' into runtime-rs In order to keep up to date with main, we will update runtime-rs every week. Fixes: #4485 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-20 10:01:58 +08:00
Liang Zhou	ef925d40ce	runtime: enable sandbox feature on qemu Enabling "-sandbox on" in qemu can introduce another protection layer on the host, to make the secure container more secure. The option is disabled by default because this feature may introduce some performance cost, even though the user can enable /proc/sys/net/core/bpf_jit_enable to reduce the impact. Fixes: #2266 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-06-17 15:30:46 -07:00
Chelsea Mafrica	28995301b3	tracing: Remove whitespace from root span Remove space from root span name to follow camel casing of other tracing span names in the runtime and to make parsing easier in testing. Fixes #4483 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2022-06-17 12:07:37 -07:00
Zvonko Kaiser	9941588c00	workflow: Removing man-db, workflow kept failing Fixes: #4480 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2022-06-17 04:55:12 -07:00
wllenyj	e89e6507a4	dragonball: add signal handler Used to register dragonball's signal handler. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-06-16 17:31:58 +08:00
Fabiano Fidêncio	f30fe86dc1	Merge pull request #4456 from Bevisy/fixIssue4454 docs: Update outdated URLs and keep them available	2022-06-16 10:26:24 +02:00
Bin Liu	553ec46115	Merge pull request #4436 from alex-matei/fix/sandbox-mem-overflow runtime: fix error when trying to parse sandbox sizing annotations	2022-06-16 11:18:24 +08:00
James O. D. Hunt	0d33b28802	Merge pull request #4459 from jodh-intel/snap-fix-cli-options snap: Fix debug cli option	2022-06-15 17:10:15 +01:00
James O. D. Hunt	9766a285a4	Merge pull request #4422 from snir911/dependabot_bumps deps: Resolve dependabot bumps of containerd, crossbeam-utils, regex	2022-06-15 15:57:53 +01:00
James O. D. Hunt	90a7763ac6	snap: Fix debug cli option `snap`/`snapcraft` seems to have changed recently. Since `snap` auto-updates all `snap` packages and since we use the `snapcraft` `snap` for building snaps, this is impacting all our CI jobs which now show: ``` Installing Snapcraft for Linux… snapcraft 7.0.4 from Canonical* installed Run snapcraft -d snap --destructive-mode Usage: snapcraft [options] command [args]... Try 'snapcraft pack -h' for help. Error: unrecognized arguments: -d Error: Process completed with exit code 1. ``` Move the debug option to make it a sub-command (long) option to resolve this issue. Fixes: #4457. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-06-15 10:00:56 +01:00
James O. D. Hunt	d06dd8fcdc	Merge pull request #4312 from fidencio/topic/pass-the-tuntap-fd-to-clh Allow Cloud Hypervisor to run under the `container_kvm_t`	2022-06-15 09:37:49 +01:00
Binbin Zhang	a305bafeef	docs: Update outdated URLs and keep them available By comparing the content of the old url and the new url, ensure that their content is consistent and does not contain ambiguities Fixes: #4454 Signed-off-by: Binbin Zhang <binbin36520@gmail.com>	2022-06-15 16:34:28 +08:00
Archana Shinde	185360cb9a	Merge pull request #4452 from GabyCT/topic/updatedeveloperguide docs: Update containerd url link	2022-06-14 16:13:35 -07:00
Chelsea Mafrica	db2a4d6cdf	Merge pull request #4441 from liubin/fix/refactor-reading-mountstat-log agent: refactor reading file timing for debugging	2022-06-14 14:18:14 -07:00
Gabriela Cervantes	bee7703436	docs: Update containerd url link This PR updates the containerd url link in the Developer Guide. Fixes #4451 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-06-14 15:35:03 +00:00
Fabiano Fidêncio	ac5dbd8598	clh: Improve logging related to the net dev addition Let's improve the log so we make it clear that we're only actually adding the net device to the Cloud Hypervisor configuration when calling our own version of VmAddNetPut(). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-14 10:53:09 +00:00
Fabiano Fidêncio	0b75522e1f	network: Set queues to 1 to ensure we get the network fds We want to have the file descriptors of the opened tuntap device to pass them down to the VMMs, so the VMMs don't have to explicitly open a new tuntap device themselves, as the `container_kvm_t` label does not allow such a thing. With this change we ensure that what's currently done when using QEMU as the hypervisor, can be easily replicated with other VMMs, even if they don't support multiqueue. As a side effect of this, we need to close the received file descriptors in the code of the VMMs which are not going to use them. Fixes: #3533 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-14 10:53:09 +00:00
Fabiano Fidêncio	93b61e0f07	network: Add FFI_NO_PI to the netlink flags Adding FFI_NO_PI to the netlink flags causes no harm to the supported and tested hypervisors as when opening the device by its name Cloud Hypervisor[0], Firecracker[1], and QEMU[2] do set the flag already. However, when receiving the file descriptor of an opened tuntap device Cloud Hypervisor is not able to set the flag, leaving the guest without connectivity. To avoid such an issue, let's simply add the FFI_NO_PI flag to the netlink flags and ensure, from our side, that the VMMs don't have to set it on their side when dealing with an already opened tuntap device. Note that there's a PR opened[3] just for testing that this change doesn't cause any breakage. [0]: `e52175c2ab/net_util/src/tap.rs (L129)` [1]: `b6d6f71213/src/devices/src/virtio/net/tap.rs (L126)` [2]: `3757b0d08b/net/tap-linux.c (L54)` [3]: https://github.com/kata-containers/kata-containers/pull/4292 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-14 10:53:09 +00:00
Fabiano Fidêncio	bf3ddc125d	clh: Pass the tuntap fds down to Cloud Hypervisor This is basically a no-op right now, as: * netPair.TapInterface.VMFds is nil * the tap name is still passed to Cloud Hypervisor, which is the Cloud Hypervisor's first choice when opening a tap device. In the very near future we'll stop passing the tap name to Cloud Hypervisor, and start passing the file descriptors of the opened tap instead. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-14 10:53:09 +00:00
Fabiano Fidêncio	55ed32e924	clh: Take care of the VmAddNetPut request ourselves Knowing that VmAddNetPut works as expected, let's switch to manually building the request and writing it to the appropriate socket. Doing this gives us more flexibility to, later on, pass the file descriptor of the tuntap device to Cloud Hypervisor, as openAPI doesn't support such an operation (it has no notion of SCM Rights). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-14 10:53:09 +00:00
Fabiano Fidêncio	01fe09a4ee	clh: Hotplug the network devices Instead of creating the VM with the network device already plugged in, let's actually add the network device after the VM is created, but before the VM is actually booted. Although it looks like there is no functional difference between what was done in the past and what this commit introduces, this will be used to work around a limitation in OpenAPI when it comes to passing down the network device's file descriptor to Cloud Hypervisor, so Cloud Hypervisor can use it instead of opening the device by its name on the VMM side. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-14 10:51:02 +00:00
Fabiano Fidêncio	2e07538334	clh: Expose VmAddNetPut VmAddNetPut is the API provided by the Cloud Hypervisor client (auto generated) code to hotplug a new network device to the VM. Let's expose it now as it'll be used as part of this series, mostly to guide the reviewer through the process of what we have to do, as later on, spoiler alert, it'll end up being removed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-14 10:27:30 +00:00
Bin Liu	c84a425250	Merge pull request #4442 from openanolis/anolis/fix_safepath_clippy safe-path: fix clippy warning	2022-06-14 14:02:42 +08:00
Chelsea Mafrica	1d5448fbca	Merge pull request #4180 from Alex-Carter01/build-kernel-efi-secret kernel building: efi_secret module	2022-06-13 13:34:06 -07:00
Fabiano Fidêncio	a80eb33cd6	Merge pull request #4308 from fidencio/topic/virtiofsd-switch-to-using-the-rust-version-on-all-arches runtime: Switch to using the rust version of virtiofsd (all arches but powerpc)	2022-06-13 13:45:51 +02:00
Bin Liu	81acfc1286	Merge pull request #4425 from liubin/fix/4376-change-log-level-of-getoomevent shim: change the log level for GetOOMEvent call failures	2022-06-13 17:53:11 +08:00
James O. D. Hunt	9b93db0220	Merge pull request #4417 from jodh-intel/docs-monitor-considerations docs: Add more kata monitor details	2022-06-13 10:51:52 +01:00
Fabiano Fidêncio	1ef0b7ded0	runtime: Switch to using the rust version of virtiofsd (all but power) So far this has been done for x86_64. Now that the support for building and testing has been added for all arches, let's do the second part of the switch. We're still not done yet for powerpc, as a virtiofsd crash on the rust version has been found by the maintainer. Fixes: #4258, #4260 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-13 10:41:26 +02:00
wllenyj	b6cb2c4ae3	dragonball: add metrics system metrics system is added for collecting Dragonball metrics to analyze the system. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-06-13 13:51:51 +08:00
wllenyj	e80e0c4645	dragonball: add io manager wrapper Wrapper over IoManager to support device hotplug. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: jingshan <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-06-13 13:51:46 +08:00
Chao Wu	bb26bd73b1	safe-path: fix clippy warning fix clippy warnings in safe-path lib to make clippy happy. fixes: #4443 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-13 13:38:37 +08:00
Bin Liu	1a5ba31cb0	agent: refactor reading file timing for debugging The original code reads the mountstats file and returns its content in the error, but by that time the file may have changed. We should instead return the file content that was parsed line by line, to check why there is no fstype option. Fixes: #4246 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-06-13 10:56:51 +08:00
Bin Liu	f23d7092e3	Merge pull request #4265 from openanolis/anolis/dragonball-1 runtime-rs: built-in Dragonball sandbox part I - resource and device managers	2022-06-12 12:17:57 +08:00
Chao Wu	d5ee3fc856	safe-path: fix clippy warning fix clippy warnings in safe-path lib to make clippy happy. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-12 10:24:05 +08:00
Alexandru Matei	721ca72a64	runtime: fix error when trying to parse sandbox sizing annotations Changed bitsize for parsing functions to 64-bit in order to avoid parsing errors. Fixes #4435 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-06-11 18:51:10 +03:00
Chao Wu	93c10dfd86	runtime-rs: add crosvm license in Dragonball add THIRD-PARTY file to add license for crosvm. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-11 17:24:58 +08:00
Chao Wu	dfe6de7714	dragonball: add dragonball into kata README add dragonball description into kata README to help introduce dragonball sandbox. Fixes: #4257 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-11 17:24:56 +08:00
wllenyj	39ff85d610	dragonball: green ci Revert this patch, after dragonball-sandbox is ready. And all subsequent implementations are submitted. Fixes: #4257 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-06-11 17:24:17 +08:00
wllenyj	71f24d8271	dragonball: add Makefile. Currently supported: build, clippy, check, format, test, clean Fixes: #4257 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-06-11 17:24:17 +08:00
Chao Wu	a1df6d0969	Doc: Update Dragonball Readme and add document for device Update Dragonball Readme to fix style problem and add github issue for TODOs. Add document for devices in dragonball. This is the document for the current dragonball device status and we'll keep updating it when we introduce more devices in later pull requests. Fixes: #4257 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-11 17:24:17 +08:00
wllenyj	8619f2b3d6	dragonball: add virtio vsock device manager. Added VsockDeviceMgr struct to manage all vsock devices. Fixes: #4257 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-11 17:23:56 +08:00
wllenyj	52d42af636	dragonball: add device manager. Device manager to manage IO devices for a virtual machine. And added DeviceManagerTx to provide operation transaction for device management, added DeviceManagerContext to operation context for device management. Fixes: #4257 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-11 17:23:56 +08:00
wllenyj	c1c1e5152a	dragonball: add kernel config. It is used for holding guest kernel configuration information. Fixes: #4257 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-11 17:23:46 +08:00
wllenyj	6850ef99ae	dragonball: add configuration manager. It is used for managing a group of configuration information. Fixes: #4257 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-11 17:23:39 +08:00
wllenyj	0bcb422fcb	dragonball: add legacy devices manager The legacy devices manager is used for managing legacy devices. Fixes: #4257 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-11 17:23:33 +08:00
wllenyj	3c45c0715f	dragonball: add console manager. Console manager to manage frontend and backend console devices. A virtual console is composed of two parts: a frontend in the virtual machine and a backend in the host OS. A frontend may be a serial port, virtio-console etc., a backend may be stdio or a Unix domain socket. The manager connects the frontend with the backend. Fixes: #4257 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-11 17:23:27 +08:00
wllenyj	3d38bb3005	dragonball: add address space manager. Address space abstraction to manage virtual machine's physical address space. The AddressSpaceMgr Struct to manage address space. Fixes: #4257 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-11 17:21:41 +08:00
wllenyj	aff6040555	dragonball: add resource manager support. Resource manager manages all resources of a virtual machine instance. Fixes: #4257 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-11 17:21:41 +08:00
wllenyj	8835db6b0f	dragonball: initial commit The dragonball crate initial commit that includes dragonball README and basic code structure. Fixes: #4257 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2022-06-11 17:21:41 +08:00
Fupan Li	9cb15ab4c5	agent: add the FSGroup support Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2022-06-11 11:30:51 +08:00
Fupan Li	ff7874bc23	protobuf: upgrade the protobuf version to 2.27.0 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2022-06-11 10:05:52 +08:00
Archana Shinde	aefe11b9ba	Merge pull request #4331 from dgibson/config-enable-iommu-annotation Allow io.katacontainers.config.hypervisor.enable_iommu annotation by …	2022-06-10 17:43:27 -07:00
Chelsea Mafrica	7deb87dcbc	Merge pull request #4434 from fidencio/topic/bump-virtiofsd-release versions: Bump virtiofsd to v1.3.0	2022-06-10 12:08:33 -07:00
GabyCT	f811c8b60e	Merge pull request #4431 from jodh-intel/docs-arch-storage-limits docs: Add storage limits to arch doc	2022-06-10 11:52:45 -05:00
Zhongtao Hu	06f398a34f	runtime-rs: use withContext to evaluate lazily Fixes: #4129 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 22:03:13 +08:00
Quanwei Zhou	fd4c26f9c1	runtime-rs: support network resource Fixes: #3785 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-06-10 22:02:58 +08:00
Tim Zhang	4be7185aa4	runtime-rs: runtime part implement Fixes: #3785 Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-06-10 22:01:12 +08:00
Zhongtao Hu	10343b1f3d	runtime-rs: enhance runtimes 1. support oom event 2. use ContainerProcess to store container_id and exec_id 3. support stats Fixes: #3785 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 22:01:05 +08:00
Quanwei Zhou	9887272db9	libs: enhance kata-sys-util and kata-types Fixes: #3785 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-06-10 21:59:47 +08:00
Quanwei Zhou	3ff0db05a7	runtime-rs: support rootfs volume for resource Fixes: #3785 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-06-10 19:58:01 +08:00
Tim Zhang	234d7bca04	runtime-rs: support cgroup resource Fixes: #3785 Signed-off-by: Tim Zhang <tim@hyper.sh>	2022-06-10 19:57:53 +08:00
Quanwei Zhou	75e282b4c1	runtime-rs: hypervisor base define Responsible for VM manager, such as Qemu, Dragonball Fixes: #3785 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-06-10 19:57:45 +08:00
Quanwei Zhou	bdfee005fa	runtime-rs: service and runtime framework 1. service: Responsible for processing services, such as task service, image service 2. Responsible for implementing different runtimes, such as Virt-container, Linux-container, Wasm-container Fixes: #3785 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-06-10 19:57:36 +08:00
Quanwei Zhou	4296e3069f	runtime-rs: agent implements Responsible for communicating with the agent, such as kata-agent in the VM Fixes: #3785 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-06-10 19:57:29 +08:00
Jakob Naucke	d3da156eea	runtime-rs: uint FsType for s390x statfs type on s390x should be c_uint, not __fsword_t Fixes: #3888 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-06-10 19:57:23 +08:00
quanwei.zqw	e705ee07c5	runtime-rs: update containerd-shim-protos to 0.2.0 Fixes: #3866 Signed-off-by: quanwei.zqw <quanwei.zqw@alibaba-inc.com>	2022-06-10 19:57:14 +08:00
quanwei.zqw	8c0a60e191	runtime-rs: modify the review suggestion Fixes: #3876 Signed-off-by: quanwei.zqw <quanwei.zqw@alibaba-inc.com>	2022-06-10 19:57:07 +08:00
Zack	278f843f92	runtime-rs: shim implements for runtime-rs Responsible for processing shim related commands: start, delete. This patch is extracted from Alibaba Cloud's internal repository runD Thanks to all contributors! Fixes: #3785 Signed-off-by: acetang <aceapril@126.com> Signed-off-by: Bin Liu <bin@hyper.sh> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Signed-off-by: Fupan Li <lifupan@gmail.com> Signed-off-by: gexuyang <gexuyang@linux.alibaba.com> Signed-off-by: Helin Guo <helinguo@linux.alibaba.com> Signed-off-by: He Rongguang <herongguang@linux.alibaba.com> Signed-off-by: Hui Zhu <teawater@gmail.com> Signed-off-by: Issac Hai <hjwissac@linux.alibaba.com> Signed-off-by: Jiahuan Chao <jhchao@linux.alibaba.com> Signed-off-by: lichenglong9 <lichenglong9@163.com> Signed-off-by: mengze <mengze@linux.alibaba.com> Signed-off-by: Qingyuan Hou <qingyuan.hou@linux.alibaba.com> Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com> Signed-off-by: shiqiangzhang <shiyu.zsq@linux.alibaba.com> Signed-off-by: Simon Guo <wei.guo.simon@linux.alibaba.com> Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: wanglei01 <wllenyj@linux.alibaba.com> Signed-off-by: Wei Yang <wei.yang1@linux.alibaba.com> Signed-off-by: yanlei <yl.on.the.way@gmail.com> Signed-off-by: Yiqun Leng <yqleng@linux.alibaba.com> Signed-off-by: yuchang.xu <yuchang.xu@linux.alibaba.com> Signed-off-by: Yves Chan <lingfu@linux.alibaba.com> Signed-off-by: Zack <zmlcc@linux.alibaba.com> Signed-off-by: Zhiheng Tao <zhihengtao@linux.alibaba.com> Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>	2022-06-10 19:56:59 +08:00
Quanwei Zhou	641b736106	libs: enhance kata-sys-util 1. move verify_cid from agent to libs/kata-sys-util 2. enhance kata-sys-util/k8s Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-06-10 19:55:39 +08:00
Fupan Li	69ba1ae9e4	trans: fix the issue of wrong swappiness type Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2022-06-10 19:46:25 +08:00
Quanwei Zhou	d2a9bc6674	agent: agent-protocol support async 1. support async. 2. update ttrpc and protobuf update ttrpc to 0.6.0 update protobuf to 2.23.0 3. support trans from oci Fixes: #3746 Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-06-10 19:36:55 +08:00
Fabiano Fidêncio	9773838c01	virtiofsd: export env vars needed for building it @jongwu mentioned on a PR[0] that env vars should be exported to ensure that virtiofsd is statically built for non-x86_64 architectures. [0]: https://github.com/kata-containers/kata-containers/pull/4308#issuecomment-1137125592 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-10 13:27:02 +02:00
Liu Jiang	aee9633ced	libs/sys-util: provide functions to execute hooks Provide functions to execute OCI hooks. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Bin Liu <bin@hyper.sh> Signed-off-by: Huamin Tang <huamin.thm@alibaba-inc.com> Signed-off-by: Lei Wang <wllenyj@linux.alibaba.com> Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-06-10 19:24:30 +08:00
Liu Jiang	8509de0aea	libs/sys-util: add function to detect and update K8s emptyDir volume Add function to detect and update K8s emptyDir volume. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Qingyuan Hou <qingyuan.hou@linux.alibaba.com>	2022-06-10 19:15:59 +08:00
Liu Jiang	6d59e8e197	libs/sys-util: introduce function to get device id Introduce get_devid() to get major/minor number of a block device. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>	2022-06-10 19:15:28 +08:00
Liu Jiang	5300ea23ad	libs/sys-util: implement reflink_copy() Implement reflink_copy() to copy file by reflink, and fallback to normal file copy. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>	2022-06-10 19:15:20 +08:00
Liu Jiang	1d5c898d7f	libs/sys-util: add utilities to parse NUMA information Add utilities to parse NUMA information. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Qingyuan Hou <qingyuan.hou@linux.alibaba.com> Signed-off-by: Simon Guo <wei.guo.simon@linux.alibaba.com>	2022-06-10 19:15:12 +08:00
Liu Jiang	87887026f6	libs/sys-util: add utilities to manipulate cgroup Add utilities to manipulate cgroup, currently only v1 is supported. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: He Rongguang <herongguang@linux.alibaba.com> Signed-off-by: Jiahuan Chao <jhchao@linux.alibaba.com> Signed-off-by: Qingyuan Hou <qingyuan.hou@linux.alibaba.com> Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com> Signed-off-by: Tim Zhang <tim@hyper.sh>	2022-06-10 19:14:59 +08:00
Fabiano Fidêncio	b0e090f40b	versions: Bump virtiofsd to v1.3.0 Changes since v1.2.0: !123 Update rust-vmm dependencies (main) ← (update-deps) !121 implement std::error::Error trait (main) ← (fix-impl-error) !120 Show the nofile hard limit value in the warning me... (main) ← (fix-rlimit-warn) !119 Do not create tmpdir and bind mount /proc/self/fd ... (main) ← (remove-tmp-dir-for-proc) !116 Disable killpriv_v2 by default (main) ← (no-killpriv-default) The one that affected Kata Containers the most was !119, as virtiofsd would get denied when SELinux was set to run on enforcing mode. Fixes: #4433 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-10 13:14:58 +02:00
Liu Jiang	ccd03e2cae	libs/sys-util: add wrappers for mount and fs Add some wrappers for mount and fs syscall. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Bin Liu <bin@hyper.sh> Signed-off-by: Fupan Li <lifupan@gmail.com> Signed-off-by: Huamin Tang <huamin.thm@alibaba-inc.com> Signed-off-by: Lei Wang <wllenyj@linux.alibaba.com> Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>	2022-06-10 19:14:06 +08:00
Liu Jiang	45a00b4f02	libs/sys-util: add kata-sys-util crate under src/libs The kata-sys-util crate is a collection of modules that provides helpers and utilities used by multiple Kata Containers components. Fixes: #3305 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>	2022-06-10 19:10:40 +08:00
Zhongtao Hu	48c201a1ac	libs/types: make the variable name easier to understand 1. modify default values for hypervisor 2. change the variable name 3. check the min memory limit Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 19:01:31 +08:00
Zhongtao Hu	b9b6d70aae	libs/types: modify implementation details 1. fix nit problems 2. use generic type when parsing different type Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 19:01:24 +08:00
Zhongtao Hu	05ad026fc0	libs/types: fix implementation details use ok_or_else to handle get_mut(hypervisor) as a substitute for unwrap Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 19:01:17 +08:00
Zhongtao Hu	d96716b4d2	libs/types:fix styles and implementation details 1. Some Nit problems are fixed 2. Make the code more readable 3. Modify some implementation details Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 19:01:09 +08:00
Zhongtao Hu	6cffd943be	libs/types:return Result to handle parse error If there is a parse error when we are trying to get the annotations, we will return Result<Option<type>> to handle that. Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 19:00:58 +08:00
Zhongtao Hu	6ae87d9d66	libs/types: use contains to make code more readable use contains when validating hypervisor block_device_driver Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 19:00:50 +08:00
Zhongtao Hu	45e5780e7c	libs/types: fixed spelling and grammar error fixed spelling and grammar errors in some files Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 19:00:43 +08:00
Zhongtao Hu	2599a06a56	libs/types:use include_str! in test file use include_str! to load toml file to string fmt Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 18:28:14 +08:00
Zhongtao Hu	8ffff40af4	libs/types:Option type to handle empty tomlconfig loading from empty string is only used to identify that the config is not initialized yet, so Option<TomlConfig> is a better option Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 18:28:05 +08:00
Zhongtao Hu	626828696d	libs/types: add license for test-config.rs add SPDX license identifier: Apache-2.0 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 18:27:57 +08:00
Zhongtao Hu	97d8c6c0fa	docs: modify move-issues-to-in-progress.yaml change issue backlog to runtime-rs Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 18:27:49 +08:00
Liu Jiang	8cdd70f6c2	libs/types: change method to update config by annotation Some annotations are used to override hypervisor configurations, which is dangerous. We must be careful when overriding hypervisor configuration by annotations, to avoid security flaws. There are two existing mechanisms to prevent attacks by annotations: 1) config.hypervisor.enable_annotations defines the allowed annotation keys for config.hypervisor. 2) config.hypervisor.xxxx_paths defines allowed values for specific keys. The access methods for config.hypervisor.xxx enforce the permission checks for the above rules. To update the config, traverse the annotation hashmap and check whether each key is enabled in hypervisor. If it is enabled: for path related annotations, check whether the value is valid before updating the config; for cpu and memory related annotations, check whether the value is within the limits for DB and qemu before updating the config. If it is not enabled, there are three possibilities: an agent related annotation, a runtime related annotation, or a hypervisor related annotation that is not enabled. The function handles agent and runtime annotations first; whatever is left is an invalid hypervisor annotation, and an error message is returned. add more edge case tests for updating config clean up unused functions, delete unused files and fix warnings Fixes: #3523 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>	2022-06-10 18:27:36 +08:00
Liu Jiang	e19d04719f	libs/types: implement KataConfig to wrap TomlConfig The TomlConfig structure is a parsed form of the Kata configuration file, but it's a little inconvenient to access the configuration information directly. So introduce a wrapper, KataConfig, to easily access that information. Two singletons of KataConfig are provided: - KATA_DEFAULT_CONFIG: the original version directly loaded from Kata configuration file. - KATA_ACTIVE_CONFIG: the active version is the KATA_DEFAULT_CONFIG patched by annotations. So the recommended way to use these two singletons: - Load TomlConfig from configuration file and set it as the default one. - Clone the default one and patch it with values from annotations. - Use the default one for permission checks, such as to check for allowed annotation keys/values. - The patched version may be set as the active one or passed to clients. - The clients directly access information from the active/passed one, and do not need to check annotations for overrides. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>	2022-06-10 18:26:48 +08:00
Liu Jiang	387ffa914e	libs/types: support load Kata agent configuration from file Add structures to load Kata agent configuration from configuration files. Also define a mechanism for vendor to extend the Kata configuration structure. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>	2022-06-10 18:26:37 +08:00
Liu Jiang	69f10afb71	libs/types: support load Kata hypervisor configuration from file Add structures to load Kata hypervisor configuration from configuration files. Also define mechanisms: 1) for hypervisors to handle the configuration info. 2) for vendor to extend the Kata configuration structure. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 18:25:24 +08:00
Liu Jiang	21cc02d724	libs/types: support load Kata runtime configuration from file Add structures to load Kata runtime configuration from configuration files. Also define a mechanism for vendor to extend the Kata configuration structure. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-06-10 18:25:24 +08:00
Liu Jiang	5b89c1df2f	libs/types: add kata-types crate under src/libs Add kata-types crate to host constants and data types shared by multiple Kata Containers components. Fixes: #3305 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Fupan Li <lifupan@gmail.com> Signed-off-by: Huamin Tang <huamin.thm@alibaba-inc.com> Signed-off-by: Lei Wang <wllenyj@linux.alibaba.com> Signed-off-by: yanlei <yl.on.the.way@gmail.com>	2022-06-10 18:25:24 +08:00
Liu Jiang	4f62a7618c	libs/logging: fix clippy warnings Fix clippy warnings of libs/logging. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>	2022-06-10 18:25:24 +08:00
Liu Jiang	6f8acb94c2	libs: refine Makefile rules Refine Makefile rules to better support the KATA ci env. Fixes: #3536 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>	2022-06-10 18:25:24 +08:00
Liu Jiang	7cdee4980c	libs/logging: introduce a wrapper writer for logging Introduce a wrapper writer `LogWriter` which converts every line written to it into a log record. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Wei Yang <wei.yang1@linux.alibaba.com> Signed-off-by: yanlei <yl.on.the.way@gmail.com>	2022-06-10 18:25:24 +08:00
Liu Jiang	426f38de94	libs/logging: implement rotator for log files Add FileRotator to rotate log files. The FileRotator structure may be used as writer for create_logger() and limits the storage space occupied by log files. Fixes: #3304 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Wei Yang <wei.yang1@linux.alibaba.com> Signed-off-by: yanlei <yl.on.the.way@gmail.com>	2022-06-10 18:25:24 +08:00
Liu Jiang	392f1ecdf5	libs: convert to a cargo workspace Convert libs into a Cargo workspace, so all libraries could share the build infrastructure. Fixes #3282 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>	2022-06-10 18:25:24 +08:00
Liu Jiang	575df4dc4d	static-checks: Allow Merge commit to be >75 chars Some generated merge commit messages are >75 chars Allow these to not trigger the subject line length failure Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2022-06-10 18:25:24 +08:00
Alex Carter	db5048d52c	kernel: build efi_secret module for SEV Add kernel fork for sev to kernel builder with efi_secret. Additionally, install efi_secret module for sev. Fixes: #4179 Signed-off-by: Alex Carter <alex.carter@ibm.com>	2022-06-09 12:28:43 -05:00
Snir Sheriber	7676cde0c5	workflow: trigger test-kata-deploy with pull_request event that changes VERSION (i.e. a release PR) Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-06-09 18:17:47 +03:00
Snir Sheriber	f10827357e	workflow: require PR num input on test-kata-deploy workflow_dispatch this will require to set a PR number when triggering the test-kata-deploy workflow manually also make sure user variables are set correctly when workflow_dispatch is used Fixes: #4349 Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-06-09 18:14:43 +03:00
James O. D. Hunt	1b845978f9	docs: Add storage limits to arch doc Updated the architecture document to explain that if you wish to constrain the amount of disk space a container uses, you need to use an existing facility such as `quota(1)`s or device mapper limits. Fixes: #4430. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-06-09 10:52:17 +01:00
James O. D. Hunt	412441308b	docs: Add more kata monitor details Add more detail to the `kata-monitor` doc to allow an admin to make a more informed decision about where and how to run the daemon. Fixes: #4416. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-06-09 09:20:11 +01:00
Bin Liu	ae911d0cd3	Merge pull request #4378 from cmaf/update-containerd-docs-critools docs: Update source for cri-tools	2022-06-09 15:12:37 +08:00
Bin Liu	05022975c8	Merge pull request #4413 from jodh-intel/tools-full-err-output tools: Enable extra detail on error	2022-06-09 13:52:08 +08:00
Chelsea Mafrica	aaa74e8a2b	Merge pull request #4415 from jodh-intel/agent-ctl-doc-examples docs: Add agent-ctl examples section	2022-06-08 09:51:30 -07:00
snir911	a57515bdae	Merge pull request #4384 from snir911/2.5.0-alpha2-branch-bump # Kata Containers 2.5.0-alpha2	2022-06-08 19:32:57 +03:00
Eric Ernst	4ebf9d38b9	Merge pull request #4310 from egernst/core-sched shim: add support for core scheduling	2022-06-08 17:42:45 +02:00
Bin Liu	eff4e1017d	shim: change the log level for GetOOMEvent call failures GetOOMEvent is a blocking call that will fail if the container exits; in this case, it's not an error or warning. Changing the log level for logs when the GetOOMEvent call fails will reduce log noise in a large cluster that has pods being created and deleted frequently. Fixes: #4376 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-06-08 22:17:24 +08:00
Snir Sheriber	eb24e97150	release: Kata Containers 2.5.0-alpha2 - docs: Update storage documentation link - rustjail: get home dir using nix crate - runk: Support `list` sub-command - docs: Update vGPU use-case - runtime: ignore ESRCH error from stop container - docs: Update configuration reference for snap documentation - workflows: add workflow_dispatch triggering to test-kata-deploy - snap: Use helper script and cleanup - feature: add ability to interact with IPTables within the guest - agent: return mount file content if parse mountinfo failed - docs: Update Intel QAT documentation links - osbuilder: add iptables package - runk: Return error when tty is used without console socket - runk: Add Podman guide in README - agent: Pass standard I/O to container launched by runk - agent, runk: Enable test for the agent built with standard-oci-runtime feature - runk: Handle rootfs path in config.json properly - Update containerd docs - clh: Update to v24.0 - snap: Build and package rust version of virtiofsd - runk: merge oci-kata-agent into runk - virtiofsd: static build virtiofsd from rust code for non-x86 - Fix issues with direct-volume stats feature - runtime: fix incorrect Action function for direct-volume stats - runtime: Adding the correct detection of mediated PCIe devices - runtime: remove duplicate 'types' import - runtime: sync docstrings with function names - qemu: allow using legacy serial device for the console - docs: Remove clear containers reference in README - runtime: do not check for EOF error in console watcher - kernel: Remove nemu.conf from packaging - tools: delete unused param from get_from_kata_deps callers - agent: Fix is_signal_handled failing parsing str to u64 - Improve Go unit test script - packaging: Add kernel config option for SGX in Gramine - ci: Don't run Docs URL Alive Check workflow on forks - tools: Add QEMU patches for SGX numa support - docs: Update runc containerd runtime - Build and distribute the rust version of virtiofsd - doc: 
Update log parser link - Move the kata-log-parser from the tests repo - versions: Upgrade to Cloud Hypervisor v23.1 - agent: Add a macro to skip a loop easier - runk: use custom Kill command to support --all option - agent: add test coverage for functions find_process and online_resources `fe3c1d9cd` docs: Update storage documentation link `9d27c1fce` agent: ignore ESRCH error when destroying containers `9726f56fd` runtime: force stop container after the container process exits `168f325c4` docs: Update configuration reference for snap documentation `38a318820` runk: Support `list` sub-command `b9fc24ff3` docs: update release process github token instructions `c1476a174` docs: update release process with latest workflow triggering `002f2cd10` snap: Use helper script and cleanup `2e04833fb` docs: Update Intel QAT documentation links `8b57bf97a` workflows: add workflow_dispatch triggering to test-kata-deploy `6d0ff901a` docs: Update vGPU use-case `9b108d993` docs: Improve snap formatting `894f661cc` docs: Add warning to snap build `d759f6c3e` snap: Fix CH architecture check `590381574` agent: Pass standard I/O to container launched by runk `af2ef3f7a` agent-ctl: introduce handle for iptables get/set `65f0cef16` kata-runtime: add iptables CLI to test http endpoint `3201ad083` shim-client: ensure we check resp status for Put/Post `0706fb28a` kata-runtime: shmgmt: make url usage consistent `2a09378dd` shim-client: add support for DoPut `640173cfc` shim-mgmt: Add endpoint handler for interacting with iptables `0136be22c` virtcontainers: plumb iptable set/get from sandbox to agent `bd50d463b` agent: iptables: get/set handling for iptables `7c4049aab` osbuilder: add iptables package `03176a9e0` proto: update generated code based on proto update `38ebbc705` proto: update to add set/get iptables `78d45b434` agent: return mount file content if parse mountinfo failed `c7b3941c9` runk: Enable test for the agent built with standard-oci-runtime feature `6dbce7c3d` agent: Remove 
unused import in console test `6ecea84bc` rustjail: get home dir using nix crate `648b8d0ae` runk: Return error when tty is used without console socket `5205efd9b` runk: Add Podman guide in README `d862ca059` runk: Handle rootfs path in config.json properly `56591804b` docs: Improve snap build instructions `cb2b30970` snap: Build using destructive mode `60823abb9` docs: Move snap README `fff832874` clh: Update to v24.0 `49361749e` snap: Build and package rust version of virtiofsd `27d903b76` snap: Put the yq binary in the staging bin directory `d7b4ce049` snap: Remove unused variable `43de5440e` snap: Fix unbound variable error `c9b291509` snap: Fix whitespace `122a85e22` agent: remove bin oci-kata-agent `35619b45a` runk: merge oci-kata-agent into runk `10c13d719` qemu: remove virtiofsd option in qemu config `d20bc5a4d` virtiofsd: build rust based virtiofsd from source for non-x86_64 `c95ba63c0` docs: Remove information related to Kata 1.x `34b80382b` docs: Get rid of note related to networking. 
`dfad5728a` docs: Mention --cni flag while invoking ctr `8e7c5975c` agent: fix direct-assigned volume stats `4428ceae1` runtime: direct-volume stats use correct name `ffdc065b4` runtime: direct-volume stats update to use GET parameter `f29595318` runtime: fix incorrect Action function for direct-volume stats `7a5ccd126` runtime: sync docstrings with function names `ce2e521a0` runtime: remove duplicate 'types' import `834f93ce8` docs: fix annotations example `f4994e486` runtime: allow annotation configuration to use_legacy_serial `24a2b0f6a` docs: Remove clear containers reference in README `abad33eba` kernel: Remove nemu.conf from packaging `e87eb13c4` tools: delete unused param from get_from_kata_deps callers `8052fe62f` runtime: do not check for EOF error in console watcher `c67b9d297` qemu: allow using legacy serial device for the console `44814dce1` qemu: treat console kernel params within appendConsole `4f586d2a9` packaging: Add kernel config option for SGX in Gramine `4b437d91f` agent: Fix is_signal_handled failing parsing str to u64 `88fb9b72e` docs: Update runc containerd runtime `d1f2852d8` tools: Stop building virtiofsd with qemu (for x86_64) `c39852e83` runtime: Use ${LIBEXEC}/virtiofsd as the default virtiofsd path `b4b9068cb` tools: Add QEMU patches for SGX numa support `a475956ab` workflows: Add support for building virtiofsd `71f59f3a7` local-build: Add support for building virtiofsd `c7ac55b6d` dockerbuild: Install unzip `8e2042d05` tools: add script to pull virtiofsd `dbedea508` versions: Add virtiofsd entry `e73b70baf` runtime: Don't run unit tests verbose by default `f24a6e761` runtime: Consolidate flags setting in unit tests script `cf465feb0` runtime: Don't change test behaviour based on $CI or $KATA_DEV_MODE `34c4ac599` runtime: Remove redundant subcommands from go-test.sh `0aff5aaa3` runtime: Simplify package listing in go-test.sh `557c4cfd0` runtime: Don't chmod coverage files in Go tests `04c8b52e0` runtime: Remove HTML coverage option from 
go-test.sh `7f7691442` runtime: Add coverage.txt.tmp to gitignore `13c257700` runtime: Move go testing script locally `421064680` doc: Update log parser link `271933fec` log-parser: fix some of the documentation `c7dacb121` log-parser: move the kata-log-parser from the tests repo `82ea01828` versions: Upgrade to Cloud Hypervisor v23.1 `2a1d39414` runtime: Adding the correct detection of mediated PCIe devices `7bc4ab68c` ci: Don't run Docs URL Alive Check workflow on forks `475e3bf38` agent: add test coverage for functions find_process and online_resources `383be2203` agent: Add a macro to skip a loop easier `97d7b1845` runk: use custom Kill command to support --all option Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-06-08 11:56:30 +03:00
dependabot[bot]	5d7fb7b7b0	build(deps): bump github.com/containerd/containerd in /src/runtime Bumps [github.com/containerd/containerd](https://github.com/containerd/containerd) from 1.6.1 to 1.6.6. - [Release notes](https://github.com/containerd/containerd/releases) - [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md) - [Commits](https://github.com/containerd/containerd/compare/v1.6.1...v1.6.6) --- updated-dependencies: - dependency-name: github.com/containerd/containerd dependency-type: direct:production ... Fixes: #4421 Signed-off-by: dependabot[bot] <support@github.com>	2022-06-08 10:54:46 +03:00
dependabot[bot]	d0ca2fcbbc	build(deps): bump crossbeam-utils in /src/tools/trace-forwarder Bumps [crossbeam-utils](https://github.com/crossbeam-rs/crossbeam) from 0.8.5 to 0.8.8. - [Release notes](https://github.com/crossbeam-rs/crossbeam/releases) - [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md) - [Commits](https://github.com/crossbeam-rs/crossbeam/compare/crossbeam-utils-0.8.5...crossbeam-utils-0.8.8) --- updated-dependencies: - dependency-name: crossbeam-utils dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2022-06-08 10:47:58 +03:00
dependabot[bot]	a60dcff4d8	build(deps): bump regex from 1.5.4 to 1.5.6 in /src/tools/agent-ctl Bumps [regex](https://github.com/rust-lang/regex) from 1.5.4 to 1.5.6. - [Release notes](https://github.com/rust-lang/regex/releases) - [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md) - [Commits](https://github.com/rust-lang/regex/compare/1.5.4...1.5.6) --- updated-dependencies: - dependency-name: regex dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2022-06-08 10:47:58 +03:00
dependabot[bot]	dbf50672e1	build(deps): bump crossbeam-utils in /src/tools/agent-ctl Bumps [crossbeam-utils](https://github.com/crossbeam-rs/crossbeam) from 0.8.5 to 0.8.8. - [Release notes](https://github.com/crossbeam-rs/crossbeam/releases) - [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md) - [Commits](https://github.com/crossbeam-rs/crossbeam/compare/crossbeam-utils-0.8.5...crossbeam-utils-0.8.8) --- updated-dependencies: - dependency-name: crossbeam-utils dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2022-06-08 10:47:58 +03:00
dependabot[bot]	8e2847bd52	build(deps): bump crossbeam-utils from 0.8.6 to 0.8.8 in /src/libs Bumps [crossbeam-utils](https://github.com/crossbeam-rs/crossbeam) from 0.8.6 to 0.8.8. - [Release notes](https://github.com/crossbeam-rs/crossbeam/releases) - [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md) - [Commits](https://github.com/crossbeam-rs/crossbeam/compare/crossbeam-utils-0.8.6...crossbeam-utils-0.8.8) --- updated-dependencies: - dependency-name: crossbeam-utils dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2022-06-08 10:47:58 +03:00
dependabot[bot]	e9ada165ff	build(deps): bump regex from 1.5.4 to 1.5.5 in /src/agent Bumps [regex](https://github.com/rust-lang/regex) from 1.5.4 to 1.5.5. - [Release notes](https://github.com/rust-lang/regex/releases) - [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md) - [Commits](https://github.com/rust-lang/regex/compare/1.5.4...1.5.5) --- updated-dependencies: - dependency-name: regex dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2022-06-08 10:47:58 +03:00
dependabot[bot]	adad9cef18	build(deps): bump crossbeam-utils from 0.8.5 to 0.8.8 in /src/agent Bumps [crossbeam-utils](https://github.com/crossbeam-rs/crossbeam) from 0.8.5 to 0.8.8. - [Release notes](https://github.com/crossbeam-rs/crossbeam/releases) - [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md) - [Commits](https://github.com/crossbeam-rs/crossbeam/compare/crossbeam-utils-0.8.5...crossbeam-utils-0.8.8) --- updated-dependencies: - dependency-name: crossbeam-utils dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2022-06-08 10:47:58 +03:00
James O. D. Hunt	34bcef8846	docs: Add agent-ctl examples section Add a new `Examples` section to the `agent-ctl` docs giving some examples of how to use the tool with QEMU and stand-alone. Fixes: #4414. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-06-08 08:39:38 +01:00
James O. D. Hunt	815157bf02	docs: Remove erroneous whitespace Deleted an extra blank line. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-06-08 08:39:38 +01:00
GabyCT	5bd81ba232	Merge pull request #4399 from GabyCT/topic/updatestoragedoc docs: Update storage documentation link	2022-06-07 09:13:45 -05:00
James O. D. Hunt	f5099620f1	tools: Enable extra detail on error The `agent-ctl` and `trace-forwarder` tools make use of `anyhow::Context` to provide additional call site information on error. However, previously neither tool was using the "alternate debug" format to display the error, meaning full error output was not displayed. Fixes: #4411. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-06-07 14:00:29 +01:00
Gabriela Cervantes	fe3c1d9cdd	docs: Update storage documentation link This PR updates the storage documentation link for the devicemapper snapshotter. Fixes #4398 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-06-06 14:48:34 +00:00
Bin Liu	a238d8c6bd	Merge pull request #4300 from justxuewei/fix/rustjail/home-env rustjail: get home dir using nix crate	2022-06-06 11:03:46 +08:00
Bin Liu	f981190621	Merge pull request #4383 from cyyzero/runk-list runk: Support `list` sub-command	2022-06-06 10:25:33 +08:00
Bin Liu	f7b22eb777	Merge pull request #4344 from zvonkok/vgpu-documentation docs: Update vGPU use-case	2022-06-06 10:25:05 +08:00
David Gibson	8f10e13e07	config: Allow enable_iommu pod annotation by default Since #902 the `io.katacontainers.config.hypervisor` pod annotations have only been permitted if explicitly allowed in the global configuration. The default global configuration allows no such annotations. That's important because several of those annotations would cause Kata to execute arbitrary binaries, and so were wildly unsafe. However, this is inconvenient for the `io.katacontainers.config.hypervisor.enable_iommu` annotation specifically, which controls whether the sandbox VM includes a vIOMMU. A guest side vIOMMU is necessary to implement VFIO passthrough devices with `vfio_mode = vfio`, so enabling that mode of operation currently requires a global configuration change, and can't just be enabled per-pod. Unlike some of the other hypervisor annotations, the `enable_iommu` annotation is quite safe. By default the vIOMMU is not present, so allowing a user to override it for a pod only improves their facilities for isolation. Even if the global default were changed to enable the vIOMMU, that doesn't compel the guest kernel to use it, so allowing a user to disable the vIOMMU doesn't materially affect isolation either. Therefore, allow the io.katacontainers.config.hypervisor.enable_iommu annotation to work in the default configurations. fixes #4330 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-06-04 13:02:05 +10:00
Eric Ernst	430da47215	Merge pull request #4360 from fengwang666/shim-leak runtime: ignore ESRCH error from stop container	2022-06-02 12:42:19 -07:00
GabyCT	9c9e5984ba	Merge pull request #4342 from GabyCT/topic/updatesnapdoc docs: Update configuration reference for snap documentation	2022-06-02 14:00:22 -05:00
Feng Wang	9d27c1fced	agent: ignore ESRCH error when destroying containers destroy() method should ignore the ESRCH error from signal::kill and continue the operation as ESRCH is often considered harmless. Fixes: #4359 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-06-02 08:19:48 -07:00
Feng Wang	9726f56fdc	runtime: force stop container after the container process exits Set the stop container force flag to true so that the container state is always set to “StateStopped” after the container wait goroutine is finished. This is necessary for the following delete container step to succeed. Fixes: #4359 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-06-02 08:17:08 -07:00
Gabriela Cervantes	168f325c43	docs: Update configuration reference for snap documentation This PR updates the url link for the kata containers configuration for the general snap documentation. Fixes #4341 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-06-02 14:55:06 +00:00
Chen Yiyang	38a3188206	runk: Support `list` sub-command Support list sub-command. It will traverse the root directory, parse status file and print basic information of containers. Behavior and print format consistent with runc. To handle race with runk delete or system user modify, the loop will continue to traverse when errors are encountered. Fixes: #4362 Signed-off-by: Chen Yiyang <cyyzero@qq.com>	2022-06-02 18:24:51 +08:00
snir911	a0805742d6	Merge pull request #4350 from snir911/fix_workflow workflows: add workflow_dispatch triggering to test-kata-deploy	2022-06-02 13:19:13 +03:00
Fabiano Fidêncio	24182d72d9	Merge pull request #4322 from jodh-intel/snap-cleanup snap: Use helper script and cleanup	2022-06-02 11:47:02 +02:00
Peng Tao	295a01f9b1	Merge pull request #4159 from egernst/topic/iptables feature: add ability to interact with IPTables within the guest	2022-06-02 11:19:41 +08:00
Tim Zhang	b8e98b175c	Merge pull request #4355 from liubin/fix/add-debug-info-for-parse-mount-error agent: return mount file content if parse mountinfo failed	2022-06-02 10:31:46 +08:00
GabyCT	e8d0be364f	Merge pull request #4375 from GabyCT/topic/updateqat docs: Update Intel QAT documentation links	2022-06-01 15:52:02 -05:00
Chelsea Mafrica	7ae11cad67	docs: Update source for cri-tools Kubernetes-incubator was previously deprecated in favor of kubernetes-sigs. Fixes #4377 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2022-06-01 12:48:48 -07:00
Chelsea Mafrica	25b1317ead	Merge pull request #4357 from egernst/iptables-pkg osbuilder: add iptables package	2022-06-01 09:28:38 -07:00
Snir Sheriber	b9fc24ff3a	docs: update release process github token instructions and fix the gpg generating key url Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-06-01 19:08:41 +03:00
Snir Sheriber	c1476a174b	docs: update release process with latest workflow triggering instructions Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-06-01 19:08:25 +03:00
James O. D. Hunt	002f2cd109	snap: Use helper script and cleanup Move the common shell code to a helper script that is sourced by all parts. Add extra quoting to some variables in the snap config file and simplify. Fixes: #4304. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-06-01 16:09:29 +01:00
Gabriela Cervantes	2e04833fb9	docs: Update Intel QAT documentation links This PR updates some Intel QAT documentation url links. Fixes #4374 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-06-01 14:41:00 +00:00
Snir Sheriber	8b57bf97ab	workflows: add workflow_dispatch triggering to test-kata-deploy This will allow to trigger the test-kata-deploy workflow manually from any branch instead of using always the one that is defined on main See: https://github.blog/changelog/2020-07-06-github-actions-manual-triggers-with-workflow_dispatch/ Fixes: #4349 Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-06-01 16:21:01 +03:00
Zvonko Kaiser	6d0ff901ab	docs: Update vGPU use-case Now that #4213 is merged we need updated documentation for vGPU time-sliced or vGPU MIG-backed. Fixes: #4343 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2022-06-01 05:58:46 -07:00
James O. D. Hunt	9b108d9937	docs: Improve snap formatting Improve the snap docs by using more consistent formatting and proper shell code in the shell example. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-06-01 12:00:40 +01:00
James O. D. Hunt	894f661cc4	docs: Add warning to snap build Since we must build with `--destructive-mode`, add a warning that the host environment could change the behaviour of the build, depending on the packages installed on the system. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-06-01 12:00:40 +01:00
James O. D. Hunt	d759f6c3e5	snap: Fix CH architecture check Correct the `cloud-hypervisor` part architecture check to use `x86_64`, not `x64_64`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-06-01 12:00:38 +01:00
Bin Liu	3e2817f7b5	Merge pull request #4325 from ManaSugi/runk/error-terminal runk: Return error when tty is used without console socket	2022-06-01 13:58:38 +08:00
Bin Liu	a9a3074828	Merge pull request #4339 from ManaSugi/runk/add-podman-instruction runk: Add Podman guide in README	2022-06-01 11:05:42 +08:00
Bin Liu	9f81c2dbf0	Merge pull request #4328 from ManaSugi/runk/output-stdout agent: Pass standard I/O to container launched by runk	2022-06-01 11:00:26 +08:00
Manabu Sugimoto	5903815746	agent: Pass standard I/O to container launched by runk The `kata-agent` passes its standard I/O file descriptors through to the container process that will be launched by `runk` without manipulation or modification so that the container process can handle its I/O operations. Fixes: #4327 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-06-01 10:19:57 +09:00
Bin Liu	9658c6218e	Merge pull request #4353 from ManaSugi/runk/enable-agent-unit-tests agent, runk: Enable test for the agent built with standard-oci-runtime feature	2022-06-01 07:39:01 +08:00
Eric Ernst	d2df1209a5	docs: describe kata handling for core-scheduling Add initial documentation for core-scheduling. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 16:17:00 -07:00
Michael Crosby	22b6a94a84	shim: add support for core scheduling In linux 5.14 and hopefully some backports, core scheduling allows processes to be co scheduled within the same domain on SMT enabled systems. Containerd impl sets the core sched domain when launching a shim. This allows a clean way for each shim(container/pod) to be in its own domain and any additional containers, (v2 pods) be launched with the same domain as well as any exec'd process added to the container. kernel docs: https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/core-scheduling.html For Kata specifically, we will look for SCHED_CORE environment variable to be set to indicate we should create a new schedule core domain. This is equivalent to the containerd shim's PR: `e48bbe8394` Fixes: #4309 Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Michael Crosby <michael@thepasture.io>	2022-05-31 10:10:40 -07:00
Eric Ernst	af2ef3f7a5	agent-ctl: introduce handle for iptables get/set Add support for the updated agent API for iptables Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:27:58 -07:00
Eric Ernst	65f0cef16c	kata-runtime: add iptables CLI to test http endpoint While end users can connect directly to the shim, let's provide a way to easily get/set iptables from kata-runtime itself. Fixes: #4080 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:27:58 -07:00
Eric Ernst	3201ad0830	shim-client: ensure we check resp status for Put/Post Without this, potential errors are silently dropped. Let's ensure we return the error code as well as potential data from the response. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:27:58 -07:00
Eric Ernst	0706fb28ac	kata-runtime: shmgmt: make url usage consistent Before, we had a mix of slash, etc. Unfortunately, when cleaning URL paths, serve mux seems to mangle the request method, resulting in each request being a GET (instead of PUT or POST). Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:27:58 -07:00
Eric Ernst	2a09378dd9	shim-client: add support for DoPut While at it, make sure we check for nil in DoPost Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:27:58 -07:00
Eric Ernst	640173cfc2	shim-mgmt: Add endpoint handler for interacting with iptables Add two endpoints: ip6tables, iptables. Each url handler supports GET and PUT operations. PUT expects the requests' data to be []bytes, and to contain iptable information in format to be consumed by iptables-restore. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:27:58 -07:00
Eric Ernst	0136be22ca	virtcontainers: plumb iptable set/get from sandbox to agent Introduce get/set iptable handling. We add a sandbox API for getting and setting the IPTables within the guest. This routes it from sandbox interface, through kata-agent, ultimately making requests to the guest agent. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:27:58 -07:00
Eric Ernst	bd50d463b2	agent: iptables: get/set handling for iptables Initial support for getting and setting iptables in the guest. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:27:58 -07:00
Eric Ernst	7c4049aabb	osbuilder: add iptables package Since we are introducing an agent API for interacting with guest iptables, let's ensure that our example rootfs' have iptables-save/restore installed. Fixes: #4356 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:21:02 -07:00
Eric Ernst	03176a9e09	proto: update generated code based on proto update Update the generated agent.pb.go code based on proto update. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 08:45:59 -07:00
Eric Ernst	38ebbc705b	proto: update to add set/get iptables Update the agent protocol definition to introduce support for setting and getting iptables from the guest. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 08:45:59 -07:00
Bin Liu	78d45b434f	agent: return mount file content if parse mountinfo failed Include mount file content in error message when parsing mountinfo failed for debug. Fixes: #4246, #4103 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-05-31 23:36:14 +08:00
Manabu Sugimoto	c7b3941c96	runk: Enable test for the agent built with standard-oci-runtime feature This enables tests for the kata-agent for runk that is built with standard-oci-runtime feature in CI. Fixes: #4351 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-05-31 21:54:28 +09:00
Manabu Sugimoto	6dbce7c3de	agent: Remove unused import in console test Remove some unused imports in console test module used by runk's test. Fixes: #4351 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-05-31 21:54:02 +09:00
Xuewei Niu	6ecea84bc5	rustjail: get home dir using nix crate Get user's home dir using `nix::unistd` crate instead of `utils` crate, and remove useless code from agent. Fixes: #4209 Signed-off-by: Xuewei Niu <justxuewei@apache.org>	2022-05-31 15:04:33 +08:00
Manabu Sugimoto	648b8d0aec	runk: Return error when tty is used without console socket runk always launches containers with detached mode, so users have to use a console socket with run or create operation when a terminal is used. If users set `terminal` to `true` in `config.json` and try to launch a container without specifying a console socket, runk returns an error with a message early. Fixes: #4324 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-05-31 09:55:39 +09:00
James O. D. Hunt	96c8df40b5	Merge pull request #4335 from ManaSugi/runk/fix-invalid-rootfs runk: Handle rootfs path in config.json properly	2022-05-30 14:03:58 +01:00
Manabu Sugimoto	5205efd9b4	runk: Add Podman guide in README runk can launch containers using Podman, so add the guide in README. Fixes: #4338 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-05-30 19:06:46 +09:00
James O. D. Hunt	d157f9b71e	Merge pull request #3871 from amshinde/update-containerd-docs Update containerd docs	2022-05-30 08:38:07 +01:00
Manabu Sugimoto	d862ca0590	runk: Handle rootfs path in config.json properly This commit enables runk to handle `root.path` in `config.json` properly even if the path is specified by a relative path that includes the single (`.`) or the double (`..`) dots. For example, with a bundle at `/to/bundle` and a rootfs directly under `/to/bundle` such as `/to/bundle/{bin,dev,etc,home,...}`, the `root.path` value can be either `/to/bundle` or just `.`. This behavior conforms to OCI runtime spec. Accordingly, a bundle path managed by runk's status file (`status.json`) always is statically stored as a canonical path. Previously, a bundle path has been got by `oci_state()` of rustjail's API that returns the path as the parent directory path of a rootfs (`root.path`). In case of the kata-agent, this works properly because the kata containers assume that the rootfs path is always `/to/bundle/rootfs`. However in case of standard OCI runtimes, a rootfs can be placed anywhere under a bundle, so the rootfs path doesn't always have to be at a `/to/bundle/rootfs`. Fixes: #4334 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-05-30 14:41:26 +09:00
snir911	d50937435d	Merge pull request #4318 from fidencio/topic/update-clh-to-v24.0 clh: Update to v24.0	2022-05-29 15:06:17 +03:00
James O. D. Hunt	56591804b3	docs: Improve snap build instructions Make it clearer how to build the snap package manually. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-05-26 15:56:36 +01:00
James O. D. Hunt	cb2b30970d	snap: Build using destructive mode Destructive mode is required to build the Kata Containers snap. See: ``` .github/workflows/snap-release.yaml .github/workflows/snap.yaml ``` Hence, update the last file that we forgot to update with `--destructive-mode`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-05-26 15:56:36 +01:00
James O. D. Hunt	60823abb9c	docs: Move snap README Move the snap README to a subdirectory to resolve the warning given by `snapcraft` (folded and reformatted slightly for clarity): ``` The 'snap' directory is meant specifically for snapcraft, but it contains the following non-snapcraft-related paths, which is unsupported and will cause unexpected behavior: - README.md If you must store these files within the 'snap' directory, move them to 'snap/local', which is ignored by snapcraft. ``` Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-05-26 15:56:36 +01:00
James O. D. Hunt	4134beee39	Merge pull request #4301 from jodh-intel/snap-package-rust-virtiofsd snap: Build and package rust version of virtiofsd	2022-05-26 15:55:06 +01:00
Fabiano Fidêncio	fff832874e	clh: Update to v24.0 This release has been tracked through the v24.0 project. virtio-iommu specification describes how a device can be attached by default to a bypass domain. This feature is particularly helpful for booting a VM with guest software which doesn't support virtio-iommu but still need to access the device. Now that Cloud Hypervisor supports this feature, it can boot a VM with Rust Hypervisor Firmware or OVMF even if the virtio-block device exposing the disk image is placed behind a virtual IOMMU. Multiple checks have been added to the code to prevent devices with identical identifiers from being created, and therefore avoid unexpected behaviors at boot or whenever a device was hot plugged into the VM. Sparse mmap support has been added to both VFIO and vfio-user devices. This allows the device regions that are not fully mappable to be partially mapped. And the more a device region can be mapped into the guest address space, the fewer VM exits will be generated when this device is accessed. This directly impacts the performance related to this device. A new serial_number option has been added to --platform, allowing a user to set a specific serial number for the platform. This number is exposed to the guest through the SMBIOS. 
* Fix loading RAW firmware (#4072) * Reject compressed QCOW images (#4055) * Reject virtio-mem resize if device is not activated (#4003) * Fix potential mmap leaks from VFIO/vfio-user MMIO regions (#4069) * Fix algorithm finding HOB memory resources (#3983) * Refactor interrupt handling (#4083) * Load kernel asynchronously (#4022) * Only create ACPI memory manager DSDT when resizable (#4013) Deprecated features will be removed in a subsequent release and users should plan to use alternatives * The mergeable option from the virtio-pmem support has been deprecated (#3968) * The dax option from the virtio-fs support has been deprecated (#3889) Fixes: #4317 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-05-26 08:51:18 +00:00
James O. D. Hunt	49361749ed	snap: Build and package rust version of virtiofsd Update the snap config file to build the rust version of `virtiofsd` for x86_64, but build QEMU's C version for other platforms. Fixes: #4261. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-05-25 17:04:05 +01:00
James O. D. Hunt	27d903b76a	snap: Put the yq binary in the staging bin directory Rather than putting the `yq` binary in the staging directory itself, put it in the `bin/` sub-directory. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-05-25 09:40:09 +01:00
James O. D. Hunt	d7b4ce049e	snap: Remove unused variable Remove the unused `kata_url` variable and use the value in the `website` YAML metadata instead. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-05-25 09:40:09 +01:00
James O. D. Hunt	43de5440e5	snap: Fix unbound variable error Don't assume `GITHUB_REF` is set. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-05-25 09:40:09 +01:00
James O. D. Hunt	c9b291509d	snap: Fix whitespace Remove trailing space. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-05-25 09:40:09 +01:00
Fupan Li	62d1ed0651	Merge pull request #4290 from Tim-Zhang/remove-oci-kata-agent runk: merge oci-kata-agent into runk	2022-05-25 11:31:25 +08:00
Fabiano Fidêncio	8a2b82ff51	Merge pull request #4276 from jongwu/build_rust_virtiofsd virtiofsd: static build virtiofsd from rust code for non-x86	2022-05-24 14:57:21 +02:00
Eric Ernst	6d00701ec9	Merge pull request #4298 from yibozhuang/fix-direct-volume Fix issues with direct-volume stats feature	2022-05-23 15:23:51 -07:00
Tim Zhang	122a85e222	agent: remove bin oci-kata-agent Fixes: #4291 Signed-off-by: Tim Zhang <tim@hyper.sh>	2022-05-23 16:55:16 +08:00
Tim Zhang	35619b45aa	runk: merge oci-kata-agent into runk Merge two bins into one. Fixes: #4291 Signed-off-by: Tim Zhang <tim@hyper.sh>	2022-05-23 16:54:09 +08:00
Fabiano Fidêncio	b9315af092	Merge pull request #4294 from yibozhuang/direct-volume-stats runtime: fix incorrect Action function for direct-volume stats	2022-05-23 10:22:29 +02:00
Jianyong Wu	10c13d719a	qemu: remove virtiofsd option in qemu config As virtiofsd will be built base on rust, "virtiofsd" option is no longer needed in qemu. Fixes: #4258 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-05-23 12:57:59 +08:00
Jianyong Wu	d20bc5a4d2	virtiofsd: build rust based virtiofsd from source for non-x86_64 Based on @fidencio's opinion, On Arm: static build virtiofsd using musl lib; on ppc64 & s390: static build virtiofsd using gnu lib; Fixes: #4258 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-05-23 12:57:59 +08:00
Archana Shinde	c95ba63c0c	docs: Remove information related to Kata 1.x Since Kata 2.x does not support runtime cli, remove information related to it. Update the configuration snippet accordingly. Fixes #3870 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-05-21 07:19:28 +05:30
Archana Shinde	34b80382b6	docs: Get rid of note related to networking. One may want to use standalone containerd without k8s and still have network enabled for the container. Getting rid of note due to inaccuracy. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-05-21 07:19:28 +05:30
Archana Shinde	dfad5728a7	docs: Mention --cni flag while invoking ctr Specify that the `--cni` flag needs to be passed to the `ctr` tool while starting a container in order to have networking enabled for the container. This flag allows containerd to call into the configured network plugin which in turn creates a network interface for the container. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-05-21 07:19:28 +05:30
Yibo Zhuang	8e7c5975c6	agent: fix direct-assigned volume stats The current implementation of walking the disks to match with the requested volume path in agent doesn't work because the volume path provided by the shim to the agent is the mount path within the guest and not the device name. The current logic is trying to match the device name to the volume path which will never match. This change will simplify the get_volume_capacity_stats and get_volume_inode_stats to just call statfs and get the bytes and inodes usage of the volume path directly. Fixes: #4297 Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>	2022-05-20 18:43:27 -07:00
Yibo Zhuang	4428ceae16	runtime: direct-volume stats use correct name Today the shim does a translation when doing direct-volume stats where it takes the source and returns the mount path within the guest. The source for a direct-assigned volume is actually the device path on the host and not the publish volume path. This change will perform a lookup of the mount info during direct-volume stats to ensure that the device path is provided to the shim for querying the volume stats. Fixes: #4297 Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>	2022-05-20 18:42:47 -07:00
Yibo Zhuang	ffdc065b4c	runtime: direct-volume stats update to use GET parameter The go default http mux AFAIK doesn’t support pattern routing so right now client is padding the url for direct-volume stats with a subpath of the volume path and this will always result in 404 not found returned by the shim. This change will update the shim to take the volume path as a GET query parameter instead of a subpath. If the parameter is missing or empty, then return 400 BadRequest to the client. Fixes: #4297 Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>	2022-05-20 18:41:51 -07:00
Yibo Zhuang	f295953183	runtime: fix incorrect Action function for direct-volume stats The action function expects a function that returns error but the current direct-volume stats Action returns (string, error) which is invalid. This change fixes the format and print out the stats from the command instead. Fixes: #4293 Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>	2022-05-20 14:55:00 -07:00
Peng Tao	2c238c8504	Merge pull request #4213 from zvonkok/vfio runtime: Adding the correct detection of mediated PCIe devices	2022-05-20 15:00:23 +08:00
Fabiano Fidêncio	811ac6a8ce	Merge pull request #4282 from r4f4/runtime-dedup-types-import runtime: remove duplicate 'types' import	2022-05-19 22:15:36 +02:00
Chelsea Mafrica	d8be0f8e9f	Merge pull request #4281 from r4f4/runtime-qemu-comments runtime: sync docstrings with function names	2022-05-19 09:17:38 -07:00
Rafael Fonseca	7a5ccd1264	runtime: sync docstrings with function names The functions were renamed but their docstrings were not. Fixes #4006 Signed-off-by: Rafael Fonseca <r4f4rfs@gmail.com>	2022-05-19 14:31:47 +02:00
Greg Kurz	fa61bd43ee	Merge pull request #4238 from snir911/wip/legacy_console qemu: allow using legacy serial device for the console	2022-05-19 14:30:59 +02:00
Rafael Fonseca	ce2e521a0f	runtime: remove duplicate 'types' import Fallout of `09f7962ff` Fixes #4285 Signed-off-by: Rafael Fonseca <r4f4rfs@gmail.com>	2022-05-19 13:49:47 +02:00
Snir Sheriber	834f93ce8a	docs: fix annotations example annotation value should always be quoted, regardless of its type Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-05-19 09:52:30 +03:00
GabyCT	d7aded7238	Merge pull request #4279 from GabyCT/topic/updateosbuilderreadme docs: Remove clear containers reference in README	2022-05-18 14:26:56 -05:00
Snir Sheriber	f4994e486b	runtime: allow annotation configuration to use_legacy_serial and update the docs and test Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-05-18 18:58:21 +03:00
Gabriela Cervantes	24a2b0f6a2	docs: Remove clear containers reference in README This PR removes the clear containers reference as this is no longer being used and is deprecated at the rootfs builder README. Fixes #4278 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-05-18 14:53:17 +00:00
Fabiano Fidêncio	c88a48be21	Merge pull request #4271 from r4f4/runtime-err-check-fix runtime: do not check for EOF error in console watcher	2022-05-18 09:49:48 +02:00
GabyCT	9458cc0053	Merge pull request #4273 from GabyCT/topic/removenemuconf kernel: Remove nemu.conf from packaging	2022-05-17 16:06:45 -05:00
Greg Kurz	42c64b3d2c	Merge pull request #4269 from r4f4/remove-unused-param-get_kata_deps tools: delete unused param from get_from_kata_deps callers	2022-05-17 18:54:47 +02:00
Gabriela Cervantes	abad33eba0	kernel: Remove nemu.conf from packaging This PR removes the nemu.conf as we are no longer using NEMU from the kernel configurations. Fixes #4272 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-05-17 16:23:17 +00:00
Chelsea Mafrica	04bd8f16f0	Merge pull request #4252 from Champ-Goblem/patch/fix-is-signal-handled agent: Fix is_signal_handled failing parsing str to u64	2022-05-17 08:31:48 -07:00
GabyCT	12f0ab120a	Merge pull request #4191 from dgibson/go-test-script Improve Go unit test script	2022-05-17 10:27:04 -05:00
Rafael Fonseca	e87eb13c4f	tools: delete unused param from get_from_kata_deps callers The param was deleted by `a09e58fa80`, so update the callers not to use it. Fixes #4245 Signed-off-by: Rafael Fonseca <r4f4rfs@gmail.com>	2022-05-17 15:18:41 +02:00
Rafael Fonseca	8052fe62fa	runtime: do not check for EOF error in console watcher The documentation of the bufio package explicitly says "Err returns the first non-EOF error that was encountered by the Scanner." When io.EOF happens, `Err()` will return `nil` and `Scan()` will return `false`. Fixes #4079 Signed-off-by: Rafael Fonseca <r4f4rfs@gmail.com>	2022-05-17 15:14:33 +02:00
Fabiano Fidêncio	5d43718494	Merge pull request #4267 from cmaf/packaging-config-add-numa packaging: Add kernel config option for SGX in Gramine	2022-05-17 13:10:24 +02:00
Snir Sheriber	c67b9d2975	qemu: allow using legacy serial device for the console This allows to get guest early boot logs which are usually missed when virtconsole is used. - It utilizes previous work on the govmm side: https://github.com/kata-containers/govmm/pull/203 - unit test added Fixes: #4237 Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-05-17 12:06:11 +03:00
Snir Sheriber	44814dce19	qemu: treat console kernel params within appendConsole as it is tightly coupled with the appended console device additionally have it tested Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-05-17 12:05:31 +03:00
Fupan Li	856c8e81f1	Merge pull request #4220 from liubin/fix/4219 ci: Don't run Docs URL Alive Check workflow on forks	2022-05-17 12:19:55 +08:00
Chelsea Mafrica	4f586d2a91	packaging: Add kernel config option for SGX in Gramine For the Gramine Shielded Containers guest kernel, CONFIG_NUMA must be enabled. Fixes #4266 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2022-05-16 16:58:26 -07:00
Champ-Goblem	4b437d91f0	agent: Fix is_signal_handled failing parsing str to u64 In the is_signal_handled function, when parsing the hex string returned from `/proc/<pid>/status` the space/tab character after the colon is not removed. This patch trims the result of SigCgt so that all whitespace characters are removed. It also extends the existing test cases to check for this scenario. Fixes: #4250 Signed-off-by: Champ-Goblem <cameron@northflank.com>	2022-05-16 20:34:26 +02:00
Fabiano Fidêncio	6ffdebd202	Merge pull request #4255 from cmaf/tools-patch-qemu-sgx-numa tools: Add QEMU patches for SGX numa support	2022-05-16 18:10:41 +02:00
Chelsea Mafrica	ee9ee77388	Merge pull request #4264 from GabyCT/topic/updatecontainerdrunt docs: Update runc containerd runtime	2022-05-16 08:56:26 -07:00
Gabriela Cervantes	88fb9b72e2	docs: Update runc containerd runtime As we are using a containerd version > 1.4 we need to update the runc containerd runtime. Fixes #4263 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-05-16 14:33:48 +00:00
Suraj Deshmukh	0e2459d13e	docs: Add cgroupDriver for containerd This commit updates the "Run Kata Containers with Kubernetes" to include cgroupDriver configuration via "KubeletConfiguration". Without this setting kubeadm defaults to systemd cgroupDriver. Containerd with Kata cannot spawn containers with systemd cgroup driver. Fixes: #4262 Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>	2022-05-16 17:32:57 +05:30
Fabiano Fidêncio	d1f2852d8b	tools: Stop building virtiofsd with qemu (for x86_64) As we finally can move to using the rust virtiofs daemon, let's stop building and packaging the C version of the virtiofsd for x86_64. Fixes: #4249 Depends-on: github.com/kata-containers/tests#4785 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-05-16 09:30:24 +02:00
Fabiano Fidêncio	c39852e83f	runtime: Use ${LIBEXEC}/virtiofsd as the default virtiofsd path As now we build and ship the rust version of virtiofsd, which is not tied to QEMU, we need to update its default location to match with where we're installing this binary. Fixes: #4249 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-05-16 09:30:24 +02:00
Chelsea Mafrica	b4b9068cb7	tools: Add QEMU patches for SGX numa support There are a few patches for SGX numa support in QEMU added after the 6.2.0 release. Add them for SGX support in Kata. Fixes #4254 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2022-05-13 16:34:57 -07:00
Fabiano Fidêncio	b780be99d7	Merge pull request #4233 from fidencio/topic/virtiofsd-switch-to-the-rust-version Build and distribute the rust version of virtiofsd	2022-05-13 19:38:01 +02:00
Fabiano Fidêncio	a475956abd	workflows: Add support for building virtiofsd As already done for the other assets we rely on, let's build (well, pull in this very specific case) the virtiofsd binary, as we're relying on its standalone rust version from now on. Fixes: #4234 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-05-13 11:37:36 +02:00
Fabiano Fidêncio	71f59f3a7b	local-build: Add support for building virtiofsd As done for the other binaries we release, let's add support for "building" (or pulling down) the static binary we ship as part of the kata-containers static tarball (the same one used by kata-deploy). Right now the virtiofsd is installed in /opt/kata/libexec/virtiofsd, a different path than the virtiofsd that comes with QEMU. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-05-13 11:37:36 +02:00
Fabiano Fidêncio	c7ac55b6d7	dockerbuild: Install unzip As virtiofsd comes in the `zip` format, let's install unzip in the containers and then be able to access the virtiofsd binary. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-05-13 11:37:36 +02:00
Fabiano Fidêncio	8e2042d055	tools: add script to pull virtiofsd Right now this is very much x86_64 specific, but I'd like to count on the maintainers of the other architectures to expand it. Also, the name as it's now may be misleading, as we're actually only pulling the binary that's statically built using `musl` and released as part of virtiofsd official releases. But we'll need to build it for the other architectures, thus I'm following the naming of the scripts used by the other components. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-05-13 11:37:21 +02:00
Fabiano Fidêncio	dbedea5086	versions: Add virtiofsd entry As we're switching to using the rust version of the virtiofsd, let's give it its own entry in the versions.yaml file, as it's no longer part of QEMU. It's important to mention that GitLab doesn't provide a well formed URL for the releases. Instead, it adds there a hash, leading us to have to add the specific link for the tarball. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-05-13 11:23:39 +02:00
David Gibson	e73b70baff	runtime: Don't run unit tests verbose by default go-test.sh by default adds the -v option to 'go test' meaning that output will be printed from all the passing tests as well as any failing ones. This results in a lot of output in which it's often difficult to locate the failing tests you're interested in. So, remove -v from the default flags. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:22:31 +10:00
David Gibson	f24a6e761f	runtime: Consolidate flags setting in unit tests script One of the responsibilities of the go-test.sh script is setting up the default flags for 'go test'. This is constructed across several different places in the script using several unneeded intermediate variables though. Consolidate all the flag construction into one place. fixes #4190 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:22:29 +10:00
David Gibson	cf465feb02	runtime: Don't change test behaviour based on $CI or $KATA_DEV_MODE go-test.sh changes behaviour based on both the $CI and $KATA_DEV_MODE variables, but not in a way that makes a lot of sense. If either one is set it uses the test_coverage path, instead of the test_local path. That collects coverage information, as the name suggests, but it also means it runs the tests twice as root and non-root, which is very non-obvious. It's not clear what use case the test_local path is for at all. Developer local builds will typically have $KATA_DEV_MODE set and CI builds will have $CI set. There's essentially no downside to running coverage all the time - it has little impact on the test runtime. In addition, if both $CI and $KATA_DEV_MODE are set, the script refuses to run things as root, considering it "unsafe". While having both set might be unwise in a general sense, there's not really any way running sudo can be any more unsafe than it is with either one set. So, simplify everything by just always running the test_coverage path. This leaves the test_local path unused, so we can remove it entirely. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:14:37 +10:00
David Gibson	34c4ac599c	runtime: Remove redundant subcommands from go-test.sh go-test.sh accepts subcommands, however invoking it in the usual way via the Makefile doesn't use them. In fact the only remaining subcommand is "help" and we already have another way of getting the usage information (-h or --help). We don't need a second way, so just drop subcommand handling. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:14:37 +10:00
David Gibson	0aff5aaa39	runtime: Simplify package listing in go-test.sh go-test.sh defaults to testing all the packages listed by go list, except for a number filtered out. It turns out that none of those filters are necessary any more: * We've long required a Go newer than 1.9 which means the vendor filter isn't needed * The agent filter doesn't do anything now that we've moved to the Kata 2.x unified repo * The tests filters don't hit anything on the list of modules in src/runtime (which is the only user of the script) But since we don't need to filter anything out any more, we don't even need to iterate through a list ourselves. We can simply pass "./..." directly to go test and it will iterate through all the sub-packages itself. Interestingly this more than doubles the speed of "make test" for me - I suspect because go test's internal parallelism works better over a larger pool of tests. This also lets us remove handling of non-existent coverage files from test_go_package(), since with default options we will no longer test packages without tests by default. If the user explicitly requests testing of a package with no tests, then failing makes sense. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:14:37 +10:00
David Gibson	557c4cfd00	runtime: Don't chmod coverage files in Go tests The go-test.sh script has an explicit chmod command, run as root, to set the mode of the temporary coverage files to 0644. AFAICT the point of this is specifically the 004 bit allowing world read access, so that we can then merge the temporary coverage file into the main coverage file. That's a convoluted way of doing things. Instead we can just run the tail command which reads the temporary file as the same user that generated it. In addition, go-test.sh became root to remove that temporary coverage file. This is not necessary, since deleting a regular file just requires write access to the directory, not the file itself. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:14:37 +10:00
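The observation in 557c4cfd00 that deleting a regular file needs write access to the directory, not to the file, can be checked with a short script. This is a minimal Python sketch assuming a POSIX system; the helper name is made up for the illustration and is not part of go-test.sh.

```python
import os
import stat
import tempfile


def remove_read_only_file() -> bool:
    """Create a read-only file in a writable directory, then delete it."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "coverage.txt.tmp")
        with open(path, "w") as handle:
            handle.write("mode: atomic\n")
        # Drop every write bit on the file itself.
        os.chmod(path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
        # Unlinking only changes the directory entry, so it succeeds as
        # long as the caller can write to the containing directory.
        os.remove(path)
        return not os.path.exists(path)


print(remove_read_only_file())
```

The same reasoning is why the script no longer has to become root just to clean up its temporary coverage file.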
David Gibson	04c8b52e04	runtime: Remove HTML coverage option from go-test.sh The html-coverage option to this script doesn't really alter behaviour; it just does the same thing as normal coverage, then converts the report to HTML. That conversion is a single command, plus a chmod to make the final output mode 0644. That overrides any umask the user has set, which doesn't seem like a policy decision this script should be making. Nothing in the kata-containers or tests repository uses this, so it doesn't really make sense to keep this logic inside this script. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:14:37 +10:00
David Gibson	7f76914422	runtime: Add coverage.txt.tmp to gitignore In addition to coverage.txt, the go-test.sh script creates coverage.txt.tmp files while running. These are temporary and certainly shouldn't be committed, so add them to the gitignore file. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:14:37 +10:00
David Gibson	13c2577004	runtime: Move go testing script locally The go unit tests for the runtime are invoked by the helper script ci/go-test.sh. Which calls the run_go_test() function in ci/lib.sh. Which calls into .ci/go-test.sh from the tests repository. But.. the runtime is the only user of this script, and generally stuff for unit tests (rather than functional or integration tests) lives in the main repository, not the tests repository. So, just move the actual script into src/runtime. A change to remove it from the tests repo will follow. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:14:37 +10:00
Wainer Moschetta	97425a7fe6	Merge pull request #4240 from stevenhorsman/dev-guide-broken-link doc: Update log parser link	2022-05-12 11:51:51 -03:00
stevenhorsman	4210646802	doc: Update log parser link - Update log-parser link to reflect new location - Also update the link to be relative Fixes: #4239 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2022-05-12 14:23:13 +01:00
snir911	51fa4ab671	Merge pull request #4165 from snir911/mv_parser Move the kata-log-parser from the tests repo	2022-05-11 10:33:36 +03:00
Bo Chen	79fb4fc5cb	Merge pull request #4223 from likebreath/0509/clh_v23.1 versions: Upgrade to Cloud Hypervisor v23.1	2022-05-10 10:40:22 -07:00
Snir Sheriber	271933fec0	log-parser: fix some of the documentation minor fixes of links and text Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-05-10 13:23:25 +03:00
Snir Sheriber	c7dacb1211	log-parser: move the kata-log-parser from the tests repo to the kata-containers repo under the src/tools/log-parser folder and vendor the modules Fixes: #4100 Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-05-10 13:23:25 +03:00
GabyCT	61a167139c	Merge pull request #4186 from liubin/fix/4185-skip-loop-by-user agent: Add a macro to skip a loop easier	2022-05-09 16:58:29 -05:00
Bo Chen	82ea018281	versions: Upgrade to Cloud Hypervisor v23.1 The following issues have been addressed from the latest bug fix release v23.1 of Cloud Hypervisor: 1) Add some missing seccomp rules; 2) Remove virtio-fs filesystem entries from config on removal; 3) Do not delete API socket on API server start; 4) Reject virtio-mem resize if the guest doesn't activate the device; 5) Fix OpenAPI naming of I/O throttling knobs; Fixes: #4222 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-05-09 14:15:12 -07:00
Fupan Li	8aad2c59c5	Merge pull request #4184 from liubin/fix/4182-runk-kill-all runk: use custom Kill command to support --all option	2022-05-09 17:56:10 +08:00
Zvonko Kaiser	2a1d394147	runtime: Adding the correct detection of mediated PCIe devices Fixes #4212 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2022-05-09 00:57:06 -07:00
Bin Liu	7bc4ab68c3	ci: Don't run Docs URL Alive Check workflow on forks This workflow is a scheduled job that runs at 23:00 every Sunday. It should only run on the main repo, not on the forked ones. Fixes: #4219 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-05-09 11:54:25 +08:00
James O. D. Hunt	79d93f1fe7	Merge pull request #4137 from Shensd/sandbox-tests-online_resources agent: add test coverage for functions find_process and online_resources	2022-05-06 09:20:57 +01:00
Chelsea Mafrica	e2f68c6093	Merge pull request #4187 from fidencio/test-hook-grpc-to-oci rustjail: Add tests for hook_grpc_to_oci	2022-05-04 09:25:45 -07:00
Fabiano Fidêncio	d16097a805	Merge pull request #4203 from fidencio/2.5.0-alpha1-branch-bump # Kata Containers 2.5.0-alpha1	2022-05-04 17:53:48 +02:00
Fabiano Fidêncio	9b863b0e01	release: Kata Containers 2.5.0-alpha1 - agent watchers: ensure uid/gid is preserved on copy/mkdir - clh: Rely on Cloud Hypervisor for generating the device ID - agent: add tests for create_logger_task function - runk: set BinaryName for runk for containerd - tools: Add a Rust-based standard OCI container runtime based on Kata agent - rustjail: add tests for parse_mount_table - Virtcontainers: Enable hot plugging vhost-user-blk device on ARM - docs: repropose direct-assigned volume - versions: change qemu tdx url and tag - doc: Update for NVIDIA GPUs - agent-ctl: Fix abstract socket connections - Implement network and disk rate limiter for Cloud Hypervisor - kata-deploy: Add support to RKE2 - docs: Update containerd link to installation guide - docs: remove pc machine type supports - Agent: Unit tests for random.rs - rustjail: Add tests for mount_grpc_to_oci - packaging: Fix broken path in `build-static-clh.sh` - Fix Go unit tests to clean up /tmp after themselves - rustjail: add tests for mount_from function - rustjail: Add tests for hooks_grpc_to_oci - agent: modify the type of swappiness to u64 - libs/safe-path: add crate to safely resolve fs paths - agent: move assert_result macro to test_utils file - rustjail: Add tests for root_grpc_to_oci - agent: add tests for mount_to_rootfs function - agent: add tests for update_container_namespaces - agent: add tests for is_signal_handled function - Upgrade to Cloud Hypervisor v23.0 - agent: best-effort removing mount point - test: Fix golangci-lint error for s390x - fsGroup support for direct-assigned volume - kata-monitor: add the README file - kata-monitor: update the hrefs in the debug/pprof index page - runtime: Base64 encode the direct volume mountInfo path - runtime: no need to write virtiofsd error to log - kata-monitor: add some links when generating pages for browsers - agent: Avoid agent panic when reading empty stats - docs: Update link to contributions guide - agent: add tests for 
mount_storage - agent: add test coverage for parse_mount_flags_and_options function - agent: add tests for do_write_stream function - runtime: delete debug option in virtiofsd - rustjail: add test coverage for process_grpc_to_oci function - agent: Allow the agent to be rebuilt with the change of Cargo features - protocols: add src/csi.rs to .gitignore - kata-runtime enable hugepage support - docs: Add a firecracker installation guide - runtime: Allow and require no initrd for SE - test: use `T.TempDir` to create temporary test directory - clh: Expose service offload configuration `33a8b705` clh: Rely on Cloud Hypervisor for generating the device ID `70eda2fa` agent: watchers: ensure uid/gid is preserved on copy/mkdir `7772f7dd` runk: set BinaryName for runk for containerd `7ffe5a16` docs: Direct-assigned volume design `081f6de8` versions: change qemu tdx url and tag `666aee54` docs: Add VSOCK localhost example for agent-ctl `86d348e0` docs: Use VM term in agent-ctl doc `4b9b62bb` agent-ctl: Fix abstract socket connections `b6467ddd` clh: Expose disk rate limiter config `7580bb5a` clh: Expose net rate limiter config `a88adaba` clh: Cloud Hypervisor has a built-in Rate Limiter `63c4da03` clh: Implement the Disk RateLimiter logic `511f7f82` config: Add DiskRateLimiter* to Cloud Hypervisor `5b18575d` hypervisor: Add disk bandwidth and operations rate limiters `1cf94692` clh: Implement the Network RateLimiter logic `00a5b1bd` utils: Define DefaultRateLimiterRefillTimeMilliSecs `be1bb7e3` utils: Move FC's function to revert bytes to utils `c9f6496d` config: Add NetRateLimiter* to Cloud Hypervisor `2d35e606` hypervisor: Add network bandwidth and operations rate limiters `b0e439cb` rustjail: add tests for parse_mount_table `ccb01839` kata-deploy: Add support to RKE2 `9d39362e` kata-deploy: Reestructure the installing section `18d27f79` kata-deploy: Add a missing `$` prefix in the README `6948b4b3` docs: Update containerd link to installation guide `b221a259` tools: Add 
runk `2c218a07` agent: Modify Kata agent for runk `dd4bd7f4` doc: Added initial doc update for NV GPUs `832c33d5` docs: remove pc machine type supports `b658dccc` tools: fix typo in clh directory name `afbd60da` packaging: Fix clh build from source fall-back `4b9e78b8` rustjail: Add tests for mount_grpc_to_oci `81f6b486` agent: add tests for create_logger_task function `96bc3ec2` rustjail: Add tests for hooks_grpc_to_oci `02395027` agent: modify the type of swappiness to u64 `1b931f42` runtime: Allock mockfs storage to be placed in any directory `ef6d54a7` runtime: Let MockFSInit create a mock fs driver at any path `5d8438e9` runtime: Move mockfs control global into mockfs.go `963d03ea` runtime: Export StoragePathSuffix `1719a8b4` runtime: Don't abuse MockStorageRootPath() for factory tests `bec59f9e` runtime: Make bind mount tests better clean up after themselves `f7ba21c8` runtime: Clean up mock hook logs in tests `90b2f5b7` runtime: Make SetupOCIConfigFile clean up after itself `2eeb5dc2` runtime: Don't use fixed /tmp/mountPoint path `0ad89ebd` safe-path: add more unit test cases `b63774ec` libs/safe-path: add crate to safely resolve fs paths `f385b21b` rustjail: add tests for mount_from function `0e7f1a5e` agent: move assert_result macro to test_utils file `2256bcb6` rustjail: Add tests for root_grpc_to_oci `7b2ff026` kata-monitor: add a README file `29e569aa` virtcontainers: clh: Re-generate the client code `6012c197` versions: Upgrade to Cloud Hypervisor v23.0 `aabcebbf` agent: best-effort removing mount point `d136c9c2` test: Fix golangci-lint error for s390x `86977ff7` kata-monitor: update the hrefs in the debug/pprof index page `78f30c33` agent: Avoid agent panic when reading empty stats `6e79042a` runtime: no need to write virtiofsd error to log `9b6f24b2` agent: add tests for mount_to_rootfs function `c3776b17` agent: add tests for is_signal_handled function `9c22d955` agent: add tests for update_container_namespaces `92c00c7e` agent: fsGroup support for 
direct-assigned volume `6e9e4e8c` docs: Update link to contributions guide `532d5397` runtime: fsGroup support for direct-assigned volume `6a47b82c` proto: fsGroup support for direct-assigned volume `9d5e7ee0` agent: add tests for mount_storage `f8cc5d1a` kata-monitor: add some links when generating pages for browsers `c31cd0e8` rustjail: add test coverage for process_grpc_to_oci function `1118a3d2` agent: add test coverage for parse_mount_flags_and_options function `9d5b03a1` runtime: delete debug option in virtiofsd `eff7c7e0` agent: Allow the agent to be rebuilt with the change of Cargo features `b975f2e8` Virtcontainers: Enable hot plugging vhost-user-blk device on ARM `962d05ec` protocols: add src/csi.rs to .gitignore `354cd3b9` runtime: Base64 encode the direct volume mountInfo path `485aeabb` agent: add tests for do_write_stream function `4405b188` docs: Add a firecracker installation guide `98750d79` clh: Expose service offload configuration `59c7165e` test: use `T.TempDir` to create temporary test directory `ff17c756` runtime: Allow and require no initrd for SE `1cad3a46` agent/random: Ensure data.len > 0 `33c953ac` agent: Add test_ressed_rng_not_root `39a35b69` agent: Add test to random::reseed_rng() `d8f39fb2` agent/random: Rename RNDRESEEDRNG to RNDRESEEDCRNG `a2f5c176` runtime/virtcontainers: Pass the hugepages resources to agent Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-05-04 16:16:53 +02:00
Fabiano Fidêncio	bd5da4a7d9	Merge pull request #4189 from yibozhuang/watchable-mount-permission agent watchers: ensure uid/gid is preserved on copy/mkdir	2022-05-04 12:29:24 +02:00
Fabiano Fidêncio	ec250c10e9	Merge pull request #4197 from fidencio/topic/workaround-race-condition-on-removing-and-adding-device-with-clh clh: Rely on Cloud Hypervisor for generating the device ID	2022-05-04 11:50:14 +02:00
Fabiano Fidêncio	33a8b70558	clh: Rely on Cloud Hypervisor for generating the device ID We're currently hitting a race condition on the Cloud Hypervisor's driver code when quickly removing and adding a block device. This happens because the device removal is an asynchronous operation, and we currently do not monitor events coming from Cloud Hypervisor to know when the device was actually removed. Together with this, the sandbox code doesn't know about that and when a new device is attached it'll quickly assign what may be the very same ID to the new device, leading to the Cloud Hypervisor's driver trying to hotplug a device with the very same ID of the device that was not yet removed. This is, in a nutshell, why the tests with Cloud Hypervisor and devmapper have been failing every now and then. The workaround taken to solve the issue is basically not passing down the device ID to Cloud Hypervisor and simply letting Cloud Hypervisor itself generate those, as Cloud Hypervisor does it in a manner that avoids such conflicts. With this addition we have then to keep a map of the device ID and the Cloud Hypervisor's generated ID, so we can properly remove the device. This workaround will probably stay for a while, at least till someone has enough cycles to implement a way to watch the device removal event and then properly act on that. Spoiler alert, this will be a complex change that may not even be worth it considering the race can be avoided with this commit. Fixes: #4176 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-05-04 09:04:03 +02:00
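The workaround described in 33a8b70558 boils down to letting the VMM pick the device ID and remembering which sandbox-side ID it corresponds to. The following Python sketch only illustrates that bookkeeping; `FakeHypervisor`, `DeviceTracker`, and the `_disk` ID format are hypothetical names, not the actual Cloud Hypervisor driver code (which is written in Go).

```python
import itertools


class FakeHypervisor:
    """Hypothetical stand-in for a VMM that generates its own device IDs."""

    def __init__(self):
        self._counter = itertools.count()
        self.attached = set()

    def add_disk(self) -> str:
        # The VMM never reuses an ID, so a quick remove/add cannot collide.
        vmm_id = f"_disk{next(self._counter)}"
        self.attached.add(vmm_id)
        return vmm_id

    def remove_device(self, vmm_id: str) -> None:
        # In a real VMM this may complete asynchronously.
        self.attached.discard(vmm_id)


class DeviceTracker:
    """Keeps the sandbox device ID -> VMM-generated ID mapping."""

    def __init__(self, hypervisor: FakeHypervisor):
        self.hypervisor = hypervisor
        self.id_map = {}

    def hotplug(self, sandbox_id: str) -> str:
        vmm_id = self.hypervisor.add_disk()  # no ID is passed down
        self.id_map[sandbox_id] = vmm_id
        return vmm_id

    def unplug(self, sandbox_id: str) -> None:
        # The map is what makes removal possible later on.
        vmm_id = self.id_map.pop(sandbox_id)
        self.hypervisor.remove_device(vmm_id)


tracker = DeviceTracker(FakeHypervisor())
first = tracker.hotplug("drive-0")
tracker.unplug("drive-0")
second = tracker.hotplug("drive-0")  # same sandbox ID, fresh VMM ID
print(first, second)
```

Reusing the sandbox-side ID immediately is harmless here because the ID seen by the VMM is always new.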
Jack Hance	475e3bf38f	agent: add test coverage for functions find_process and online_resources Add test coverage for the functions find_process and online_resources in src/sandbox.rs. Fixes #4085 Fixes #4136 Signed-off-by: Jack Hance <jack.hance@ndsu.edu>	2022-05-03 16:00:24 -05:00
Yibo Zhuang	70eda2fa6c	agent: watchers: ensure uid/gid is preserved on copy/mkdir Today in agent watchers, when we copy files/symlinks or create directories, the ownership of the source path is not preserved which can lead to permission issues. In copy, ensure that we do a chown of the source path uid/gid to the destination file/symlink after copy to ensure that ownership matches the source ownership. fs::copy() takes care of setting the permissions. For directory creation, ensure that we set the permissions of the created directory to the source directory permissions and also perform a chown of the source path uid/gid to ensure directory ownership and permissions match the source. Fixes: #4188 Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>	2022-05-03 09:57:31 -07:00
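The copy and mkdir behaviour described in 70eda2fa6c can be sketched as follows. This is a Python illustration of the idea only, assuming a POSIX system; the agent itself is written in Rust, the helper names are hypothetical, and symlink handling is left out.

```python
import os
import shutil
import stat
import tempfile


def copy_preserving_owner(src: str, dst: str) -> None:
    """Copy a regular file, then give dst the uid/gid of src."""
    shutil.copy(src, dst)  # copies the data and the permission bits
    info = os.stat(src)
    os.chown(dst, info.st_uid, info.st_gid)


def mkdir_like(src_dir: str, dst_dir: str) -> None:
    """Create dst_dir with the mode and ownership of src_dir."""
    info = os.stat(src_dir)
    os.mkdir(dst_dir)
    os.chmod(dst_dir, stat.S_IMODE(info.st_mode))
    os.chown(dst_dir, info.st_uid, info.st_gid)


with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "src.txt")
    with open(src, "w") as handle:
        handle.write("data\n")
    os.chmod(src, 0o640)
    dst = os.path.join(root, "dst.txt")
    copy_preserving_owner(src, dst)
    print(oct(stat.S_IMODE(os.stat(dst).st_mode)))
```

Changing ownership to a different user requires privileges, which the agent has inside the guest; run unprivileged, the sketch only re-applies the caller's own uid/gid.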
Garrett Mahin	4a1e13bd8f	rustjail: Add tests for hook_grpc_to_oci Add test coverage for hook_grpc_to_oci in rustjail/src/lib.rs Fixes: #4125 Signed-off-by: Garrett Mahin <garrett.mahin@gmail.com>	2022-05-02 23:59:33 +02:00
Bin Liu	383be2203a	agent: Add a macro to skip a loop easier Add a macro to skip a loop easier without using an if {} else {} condition check. Fixes: #4185 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-04-30 20:45:41 +08:00
Bin Liu	c633780ba7	Merge pull request #4119 from bradenrayhorn/test-create-logger-task agent: add tests for create_logger_task function	2022-04-30 19:48:07 +08:00
Bin Liu	97d7b1845b	runk: use custom Kill command to support --all option runk uses the liboci-cli crate to parse command line options, but liboci-cli does not support the --all option for the kill command, though this is the runtime spec behavior. crictl, however, will issue a kill --all command when stopping containers, so as a workaround we use a custom kill command instead of the one provided by liboci-cli. Fixes: #4182 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-04-30 19:34:18 +08:00
Fabiano Fidêncio	1dd6f85a17	Merge pull request #4178 from liubin/4177 runk: set BinaryName for runk for containerd	2022-04-29 21:17:37 +02:00
Champ-Goblem	1b7fd19acb	rootfs: Fix chronyd.service failing on boot In at least kata versions 2.3.3 and 2.4.0 it was noticed that the guest operating system's clock would drift out of sync slowly over time whilst the pod was running. This had previously been raised and fixed in the old repository via [1]. In essence kvm_ptp and chrony were paired together in order to keep the system clock up to date with the host. In the recent versions of kata mentioned above, the chronyd.service fails upon boot with status `266/NAMESPACE` which seems to be due to the fact that the `/var/lib/chrony` directory no longer exists. This change sets the `/var/lib/chrony` directory for the `ReadWritePaths` to be ignored when the directory does not exist, as per [2]. [1] https://github.com/kata-containers/runtime/issues/1279 [2] https://www.freedesktop.org/software/systemd/man/systemd.exec.html#ReadWritePaths= Fixes: #4167 Signed-off-by: Champ-Goblem <cameron_mcdermott@yahoo.co.uk>	2022-04-29 17:15:29 +01:00
Bin Liu	7772f7dd99	runk: set BinaryName for runk for containerd The default runtime for io.containerd.runc.v2 is runc, to use runk, the containerd configuration should set the default runtime to runk or add BinaryName options for the runtime. Fixes: #4177 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-04-29 22:26:32 +08:00
James O. D. Hunt	cc839772d3	Merge pull request #2785 from ManaSugi/standard-container-runtime tools: Add a Rust-based standard OCI container runtime based on Kata agent	2022-04-29 13:20:59 +01:00
James O. D. Hunt	2d5f11501c	Merge pull request #4083 from bradenrayhorn/test-parse-mount-table rustjail: add tests for parse_mount_table	2022-04-29 11:34:22 +01:00
Jianyong Wu	982c32358a	Merge pull request #4031 from Jaylyn-Ren/kata-spdk Virtcontainers: Enable hot plugging vhost-user-blk device on ARM	2022-04-29 12:16:38 +08:00
Feng Wang	da11c21b4a	Merge pull request #3248 from fengwang666/direct-blk-design docs: repropose direct-assigned volume	2022-04-28 16:55:50 -07:00
Feng Wang	7ffe5a16f2	docs: Direct-assigned volume design Detail design description on direct-assigned volume Fixes: #1468 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-04-28 14:47:36 -07:00
Julio Montes	ea857bb1b8	Merge pull request #4172 from devimc/2022-04-28/fixQEMU versions: change qemu tdx url and tag	2022-04-28 15:31:52 -05:00
Archana Shinde	9fdc88101f	Merge pull request #3907 from zvonkok/nvidia doc: Update for NVIDIA GPUs	2022-04-28 12:42:44 -07:00
Julio Montes	081f6de874	versions: change qemu tdx url and tag https://github.com/intel/qemu-dcp is the new repo that supports qemu with Intel TDX fixes #4171 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-04-28 13:46:11 -05:00
Chelsea Mafrica	3f069c7acb	Merge pull request #4166 from jodh-intel/agent-ctl-fix-abstract agent-ctl: Fix abstract socket connections	2022-04-28 10:17:28 -07:00
James O. D. Hunt	666aee54d2	docs: Add VSOCK localhost example for agent-ctl Update the `agent-ctl` docs to show how to use a VSOCK local address when running the agent and the tool in the same environment. This is an alternative to using a Unix socket. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-04-28 13:33:23 +01:00
James O. D. Hunt	86d348e065	docs: Use VM term in agent-ctl doc Use the standard "VM" acronym to mean Virtual Machine. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-04-28 13:33:19 +01:00
James O. D. Hunt	4b9b62bb3e	agent-ctl: Fix abstract socket connections Unbreak the `agent-ctl` tool connecting to the agent with a Unix domain socket. It appears that [1] changed the behaviour of connecting to the agent using a local Unix socket (which is not used by Kata under normal operation). The change can be seen by reverting to commit `72b8144b56` (the one before [1]) and running the agent manually as: ```bash $ sudo KATA_AGENT_SERVER_ADDR=unix:///tmp/foo.socket target/x86_64-unknown-linux-musl/release/kata-agent ``` Before [1], in another terminal we see this: ```bash $ sudo lsof -U 2>/dev/null \|grep foo\|awk '{print $9}' @/tmp/foo.socket@ ``` But now, we see the following: ```bash $ sudo lsof -U 2>/dev/null \|grep foo\|awk '{print $9}' @/tmp/foo.socket ``` Note the last byte which represents a nul (`\0`) value. The `agent-ctl` tool used to add that trailing nul but now it seems to not be needed, so this change removes it, restoring functionality. No external changes are necessary, so the `agent-ctl` tool can connect to the agent as shown below: ```bash $ cargo run -- -l debug connect --server-address "unix://@/tmp/foo.socket" --bundle-dir "$bundle_dir" -c Check -c GetGuestDetails ``` [1] - https://github.com/kata-containers/kata-containers/issues/3124 Fixes: #4164. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-04-28 13:33:09 +01:00
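Why a single trailing nul matters in 4b9b62bb3e: in the Linux abstract socket namespace every byte of the name is significant, so a name with and without a trailing `\0` are two different addresses. A minimal Python sketch, assuming Linux; the helper names and the socket name are illustrative and unrelated to the `agent-ctl` code.

```python
import os
import socket


def listen_abstract(name: bytes) -> socket.socket:
    """Listen on an abstract Unix socket (address starts with a nul byte)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(b"\0" + name)
    sock.listen(1)
    return sock


def can_connect(name: bytes) -> bool:
    """Report whether something is listening on the given abstract name."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(b"\0" + name)
        return True
    except ConnectionRefusedError:
        return False
    finally:
        client.close()


name = f"/tmp/foo-{os.getpid()}.socket".encode()
server = listen_abstract(name)
print(can_connect(name))          # exact name the server bound to
print(can_connect(name + b"\0"))  # same name plus a trailing nul
server.close()
```

A client that appends the nul while the server does not (or the other way round) is simply talking to an address nobody listens on.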
Fabiano Fidêncio	c4dd029566	Merge pull request #4135 from fidencio/topic/clh-net-rate-limitting Implement network and disk rate limiter for Cloud Hypervisor	2022-04-28 13:33:10 +02:00
Fabiano Fidêncio	9fb9c80fd3	Merge pull request #4161 from fidencio/topic/kata-deploy-plus-rke2 kata-deploy: Add support to RKE2	2022-04-28 11:35:11 +02:00
Fabiano Fidêncio	b6467ddd73	clh: Expose disk rate limiter config With everything implemented, let's now expose the disk rate limiter configuration options in the Cloud Hypervisor configuration file. Fixes: #4139 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:28:29 +02:00
Fabiano Fidêncio	7580bb5a78	clh: Expose net rate limiter config With everything implemented, let's now expose the net rate limiter configuration options in the Cloud Hypervisor configuration file. Fixes: #4017 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:28:13 +02:00
Fabiano Fidêncio	a88adabaae	clh: Cloud Hypervisor has a built-in Rate Limiter The notion of "built-in rate limiter" was added as part of `bd8658e362`, and that commit considered that only Firecracker had a built-in rate limiter, which I think was the case when that was introduced (mid 2020). Nowadays, however, Cloud Hypervisor takes advantage of the very same crate used by Firecracker to do I/O throttling. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:27:56 +02:00
Fabiano Fidêncio	63c4da03a9	clh: Implement the Disk RateLimiter logic Let's take advantage of the newly added DiskRateLimiter* options and apply those to the disk device configuration. The logic here is identical to the one already present in the Network part of Cloud Hypervisor's driver. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:27:53 +02:00
Fabiano Fidêncio	511f7f822d	config: Add DiskRateLimiter* to Cloud Hypervisor Let's add the newly added disk rate limiter configurations to the Cloud Hypervisor's hypervisor configuration. Right now those are not used anywhere, and there's absolutely no way the users can set those up. That's coming later in this very same series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:27:15 +02:00
Fabiano Fidêncio	5b18575dfe	hypervisor: Add disk bandwidth and operations rate limiters This is the disk counterpart of what was introduced for the network as part of the previous commits in this series. The newly added fields are: * DiskRateLimiterBwMaxRate, defined in bits per second, which is used to control the disk I/O bandwidth at the VM level. * DiskRateLimiterBwOneTimeBurst, also defined in bits per second, which is used to define an initial max rate, which doesn't replenish. * DiskRateLimiterOpsMaxRate, the operations per second equivalent of the DiskRateLimiterBwMaxRate. * DiskRateLimiterOpsOneTimeBurst, the operations per second equivalent of the DiskRateLimiterBwOneTimeBurst. For now those extra fields have only been added to the hypervisor's configuration and they'll be used in the coming patches of this very same series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:27:11 +02:00
Fabiano Fidêncio	1cf9469297	clh: Implement the Network RateLimiter logic Let's take advantage of the newly added NetRateLimiter* options and apply those to the network device configuration. The logic here is quite similar to the one already present in the Firecracker's driver, with the main difference being the single Inbound / Outbound MaxRate and the presence of both Bandwidth and Operations rate limiter. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:26:38 +02:00
Fabiano Fidêncio	00a5b1bda9	utils: Define DefaultRateLimiterRefillTimeMilliSecs Firecracker's driver doesn't expose the RefillTime option of the rate limiter to the user. Instead, it uses a constant value of 1000 milliseconds (1 second). As we're following Firecracker's driver implementation, let's create a new constant, use it as part of the Firecracker's driver, and later on re-use it as part of the Cloud Hypervisor's driver. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:22:42 +02:00
Fabiano Fidêncio	be1bb7e39f	utils: Move FC's function to revert bytes to utils Firecracker's revertBytes function, now called "RevertBytes", can be exposed as part of the virtcontainers' utils file, as this function will be reused by Cloud Hypervisor, when adding the rate limiter logic there. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:22:42 +02:00
Fabiano Fidêncio	c9f6496d6d	config: Add NetRateLimiter* to Cloud Hypervisor Let's add the newly added network rate limiter configurations to the Cloud Hypervisor's hypervisor configuration. Right now those are not used anywhere, and there's absolutely no way the users can set those up. That's coming later in this very same series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:22:42 +02:00
Fabiano Fidêncio	2d35e6066d	hypervisor: Add network bandwidth and operations rate limiters In a similar way to what's already exposed as RxRateLimiterMaxRate and TxRateLimiterMaxRate, let's add four new fields to the Hypervisor's configuration. The values added are related to bandwidth and operations rate limiters, which have to be added so we can expose I/O throttling configurations to users using Cloud Hypervisor as their preferred VMM. The reason we cannot simply re-use {Rx,Tx}RateLimiterMaxRate is because Cloud Hypervisor exposes a single MaxRate to be used for both inbound and outbound queues. The newly added fields are: * NetRateLimiterBwMaxRate, defined in bits per second, which is used to control the network I/O bandwidth at the VM level. * NetRateLimiterBwOneTimeBurst, also defined in bits per second, which is used to define an initial max rate, which doesn't replenish. * NetRateLimiterOpsMaxRate, the operations per second equivalent of the NetRateLimiterBwMaxRate. * NetRateLimiterOpsOneTimeBurst, the operations per second equivalent of the NetRateLimiterBwOneTimeBurst. For now those extra fields have only been added to the hypervisor's configuration and they'll be used in the coming patches of this very same series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:22:42 +02:00
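The MaxRate / OneTimeBurst pairs introduced in 2d35e6066d map naturally onto a token bucket: a budget that refills over a fixed interval, plus an initial credit that is spent once and never replenished. The sketch below is a generic Python token bucket under those assumptions, using the 1000 ms refill time mentioned for Firecracker's driver; it is not the rate-limiter crate or the Cloud Hypervisor driver code, and the class and field names are made up.

```python
from dataclasses import dataclass

DEFAULT_REFILL_TIME_MS = 1000  # one second, as in the Firecracker driver


@dataclass
class TokenBucket:
    size: int                 # budget per refill interval (max rate)
    one_time_burst: int = 0   # extra initial credit, never replenished
    refill_time_ms: int = DEFAULT_REFILL_TIME_MS

    def __post_init__(self):
        self.tokens = float(self.size)
        self.burst = float(self.one_time_burst)
        self.last_ms = 0

    def consume(self, amount: int, now_ms: int) -> bool:
        """Try to spend `amount` tokens at time `now_ms`."""
        elapsed = now_ms - self.last_ms
        self.last_ms = now_ms
        # Refill proportionally to elapsed time, capped at the bucket size.
        refill = self.size * elapsed / self.refill_time_ms
        self.tokens = min(float(self.size), self.tokens + refill)
        # Spend the one-time burst first, then the regular budget.
        from_burst = min(self.burst, amount)
        remaining = amount - from_burst
        if remaining > self.tokens:
            return False  # request would exceed the current budget
        self.burst -= from_burst
        self.tokens -= remaining
        return True


bucket = TokenBucket(size=100, one_time_burst=50)
print(bucket.consume(150, now_ms=0))    # burst plus a full bucket
print(bucket.consume(1, now_ms=0))      # nothing left yet
print(bucket.consume(50, now_ms=500))   # half an interval refilled
```

One bucket of this kind per dimension (bandwidth and operations) matches the four fields listed in the commit message.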
Braden Rayhorn	b0e439cb66	rustjail: add tests for parse_mount_table Add tests for parse_mount_table function in rustjail/src/mount.rs. Includes some minor refactoring improve the testability of the function and improve its error values. Fixes: #4082 Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>	2022-04-27 20:06:01 -05:00
Chelsea Mafrica	ab067cf074	Merge pull request #4163 from GabyCT/topic/fixdoccontainerd docs: Update containerd link to installation guide	2022-04-27 16:18:57 -07:00
Fabiano Fidêncio	ccb0183934	kata-deploy: Add support to RKE2 "RKE2 - Rancher's Next Generation Kubernetes Distribution" can easily be supported by kata-deploy with some simple adjustments to what we've been relying on for "k3s". The main differences between k3s and RKE2 are, basically: 1. The location where the containerd configuration is stored - k3s: /var/lib/rancher/k3s/agent/etc/containerd/ - rke2: /var/lib/rancher/rke2/agent/etc/containerd/ 2. The name of the systemd services used: - k3s: k3s.service or k3s-agent.service - rke2: rke2-server.service or rke2-agent.service Knowing this, let's add a new overlay for RKE2, adapt the kata-deploy and the kata-cleanup scripts, and that's it. Fixes: #4160 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-27 19:05:36 +02:00
Fabiano Fidêncio	9d39362e30	kata-deploy: Restructure the installing section Let's move the specific installation instructions, such as for k3s, upper in the document. This helps reading (and also skipping) according to what the user is looking for. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2022-04-27 19:05:36 +02:00
Fabiano Fidêncio	18d27f7949	kata-deploy: Add a missing `$` prefix in the README Commit short-log says it all. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2022-04-27 19:05:36 +02:00
Gabriela Cervantes	6948b4b360	docs: Update containerd link to installation guide This PR updates the containerd url link for the installation guide Fixes #4162 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-04-27 16:52:53 +00:00
Manabu Sugimoto	b221a2590f	tools: Add runk Add a Rust-based standard OCI container runtime based on Kata agent. You can build and install runk as follows: ```sh $ cd src/tools/runk $ make $ sudo make install $ runk --help ``` Fixes: #2784 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-04-28 00:48:57 +09:00
Manabu Sugimoto	2c218a07b9	agent: Modify Kata agent for runk Generate an oci-kata-agent which is a customized agent to be called from runk which is a Rust-based standard OCI container runtime based on Kata agent. Fixes: #2784 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-04-28 00:48:57 +09:00
Zvonko Kaiser	dd4bd7f471	doc: Added initial doc update for NV GPUs Fixed rpm vs deb references Update to the shell portion Fixes #3379 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2022-04-27 16:38:35 +02:00
James O. D. Hunt	d02db3a268	Merge pull request #4156 from Kvasscn/kata_dev_fix_docs_pc_machine docs: remove pc machine type supports	2022-04-27 11:55:58 +01:00
James O. D. Hunt	0a6e7d443e	Merge pull request #3910 from etrunko/agent_random Agent: Unit tests for random.rs	2022-04-27 09:41:02 +01:00
James O. D. Hunt	7b20707197	Merge pull request #4107 from garrettmahin/test-mount-grpc-to-oci rustjail: Add tests for mount_grpc_to_oci	2022-04-27 08:50:24 +01:00
Fabiano Fidêncio	411053e2bd	Merge pull request #4152 from gkurz/fix-clh-build packaging: Fix broken path in `build-static-clh.sh`	2022-04-27 08:59:43 +02:00
Jason Zhang	832c33d5b5	docs: remove pc machine type supports Currently the 'pc' machine type is no longer supported in kata configuration, so remove it in the design docs. Fixes: #4155 Signed-off-by: Jason Zhang <zhanghj.lc@inspur.com>	2022-04-27 11:28:03 +08:00
Greg Kurz	b658dccc5f	tools: fix typo in clh directory name This allows to get released binaries again. Fixes: #4151 Signed-off-by: Greg Kurz <groug@kaod.org>	2022-04-26 17:57:32 +02:00
Greg Kurz	afbd60da27	packaging: Fix clh build from source fall-back If we fail to download the clh binary, we fall back to building from source. Unfortunately, `pull_clh_released_binary()` leaves a `cloud_hypervisor` directory behind, which causes `build_clh_from_source()` not to clone the git repo: [ -d "${repo_dir}" ] || git clone "${cloud_hypervisor_repo}" When building from a kata-containers git repo, the subsequent calls to `git` in this function thus apply to the kata-containers repo and eventually fail, e.g.: + git checkout v23.0 error: pathspec 'v23.0' did not match any file(s) known to git It doesn't quite make sense actually to keep an existing directory the content of which is arbitrary when we want it to contain a specific version of clh. Just remove it instead. Fixes: #4151 Signed-off-by: Greg Kurz <groug@kaod.org>	2022-04-26 17:57:32 +02:00
Peng Tao	5b6e45ed6c	Merge pull request #4141 from dgibson/cleanup-tmp Fix Go unit tests to clean up /tmp after themselves	2022-04-26 15:43:34 +08:00
Garrett Mahin	4b9e78b837	rustjail: Add tests for mount_grpc_to_oci Add test coverage for mount_grpc_to_oci in rustjail/src/lib.rs Fixes: #4106 Signed-off-by: Garrett Mahin <garrett.mahin@gmail.com>	2022-04-25 08:37:17 -05:00
James O. D. Hunt	bc919cc54c	Merge pull request #4122 from bradenrayhorn/test-mount-from rustjail: add tests for mount_from function	2022-04-25 11:55:21 +01:00
James O. D. Hunt	cb8dd0f4fc	Merge pull request #4143 from garrettmahin/test-hooks-grpc-to-oci rustjail: Add tests for hooks_grpc_to_oci	2022-04-25 10:50:52 +01:00
Braden Rayhorn	81f6b48626	agent: add tests for create_logger_task function Add tests for create_logger_task function in src/main.rs. Fixes: #4113 Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>	2022-04-24 21:38:32 -05:00
Bin Liu	2629c9fc7b	Merge pull request #4114 from yangfeiyu20102011/main agent: modify the type of swappiness to u64	2022-04-24 13:35:18 +08:00
Garrett Mahin	96bc3ec2e9	rustjail: Add tests for hooks_grpc_to_oci Add test coverage for hooks_grpc_to_oci in rustjail/src/lib.rs Fixes: #4142 Signed-off-by: Garrett Mahin <garrett.mahin@gmail.com>	2022-04-22 19:20:04 -05:00
holyfei	0239502781	agent: modify the type of swappiness to u64 The type of MemorySwappiness in the runtime is uint64, while the type of swappiness in the agent is int64. If we set the max uint64 value in the runtime and pass it to the agent, the value becomes -1. Modify the type of swappiness to u64. Fixes: #4123 Signed-off-by: holyfei <yangfeiyu20092010@163.com>	2022-04-22 16:55:37 +08:00
David Gibson	1b931f4203	runtime: Allow mockfs storage to be placed in any directory Currently EnableMockTesting() takes no arguments and will always place the mock storage in the fixed location /tmp/vc/mockfs. This means that one test run can interfere with the next one if anything isn't cleaned up (and there are other bugs which means that happens). Even if those were fixed, this would still allow developers testing on the same machine to interfere with each other. So, allow the mockfs to be placed at an arbitrary place given as a parameter to EnableMockTesting(). In TestMain() we place it under our existing temporary directory, so we don't need any additional cleanup just for the mockfs. fixes #4140 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:47:59 +10:00
David Gibson	ef6d54a781	runtime: Let MockFSInit create a mock fs driver at any path Currently MockFSInit always creates the mockfs at the fixed path /tmp/vc/mockfs. This change allows it to be initialized at any path given as a parameter. This allows the tests in fs_test.go to be simplified, because by using a temporary directory from t.TempDir(), which is automatically cleaned up, we don't need to manually trigger initTestDir() (which is misnamed, it's actually a cleanup function). For now we still use the fixed path when auto-creating the mockfs in MockAutoInit(), but we'll change that later. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:23:36 +10:00
David Gibson	5d8438e939	runtime: Move mockfs control global into mockfs.go virtcontainers/persist/fs/mockfs.go defines a mock filesystem type for testing. A global variable in virtcontainers/persist/manager.go is used to force use of the mock fs rather than a normal one. This patch moves the global, and the EnableMockTesting() function which sets it into mockfs.go. This is slightly cleaner to begin with, and will allow some further enhancements. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:23:36 +10:00
David Gibson	963d03ea8a	runtime: Export StoragePathSuffix storagePathSuffix defines the file path suffix - "vc" - used for Kata's persistent storage information, as a private constant. We duplicate this information in fc.go which also needs it. Export it from fs.go instead, so it can be used in fc.go. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:23:36 +10:00
David Gibson	1719a8b491	runtime: Don't abuse MockStorageRootPath() for factory tests A number of unit tests under virtcontainers/factory use MockStorageRootPath() as a general purpose temporary directory. This doesn't make sense: the mockfs driver isn't even in use here since we only call EnableMockTesting for the base virtcontainers package, not the subpackages. Instead use t.TempDir() which is for exactly this purpose. As a bonus it also handles the cleanup, so we don't need MockStorageDestroy any more. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:23:36 +10:00
David Gibson	bec59f9e39	runtime: Make bind mount tests better clean up after themselves There are several tests in mount_test.go which perform a sample bind mount. These need a corresponding unmount to clean up afterwards or attempting to delete the temporary files will fail due to the existing mountpoint. Most of them had such an unmount, but TestBindMountInvalidPgtypes was missing one. In addition, the existing unmounts were done inconsistently - one was simply inline (so wouldn't be executed if the test fails too early) and one is a defer. Change them all to use the t.Cleanup mechanism. For the dummy mountpoint files, rather than cleaning them up after the test, the tests were removing them at the beginning of the test. That stops the test being messed up by a previous run, but messily. Since these are created in a private temporary directory anyway, if there's something already there, that indicates a problem we shouldn't ignore. In fact we don't need to explicitly remove these at all - they'll be removed along with the rest of the private temporary directory. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:20:35 +10:00
David Gibson	f7ba21c86f	runtime: Clean up mock hook logs in tests The tests in hook_test.go run a mock hook binary, which does some debug logging to /tmp/mock_hook.log. Currently we don't clean up those logs when the tests are done. Use a test cleanup function to do this. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:14:52 +10:00
David Gibson	90b2f5b776	runtime: Make SetupOCIConfigFile clean up after itself SetupOCIConfigFile creates a temporary directory with os.MkdirTemp(). This means the callers need to register a deferred function to remove it again. At least one of them was commented out, meaning that a /tmp/katatest- directory was left over after the unit tests ran. Change to using t.TempDir() which, as well as better matching other parts of the tests, means the testing framework will handle cleaning it up. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:14:52 +10:00
David Gibson	2eeb5dc223	runtime: Don't use fixed /tmp/mountPoint path Several tests in kata_agent_test.go create /tmp/mountPoint as a dummy directory to mount. This is not cleaned up after the test. Although it is in /tmp, that's still a little messy and can be confusing to a user. In addition, because it uses the same name every time, it allows for one run of the test to interfere with the next. Use the built in t.TempDir() to use an automatically named and deleted temporary directory instead. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:14:52 +10:00
Jiang Liu	83979ece18	Merge pull request #3462 from jiangliu/safe-path libs/safe-path: add crate to safely resolve fs paths	2022-04-21 11:17:49 +08:00
Liu Jiang	0ad89ebd7c	safe-path: add more unit test cases Add more unit test cases to improve code coverage. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>	2022-04-21 10:01:23 +08:00
Liu Jiang	b63774ec61	libs/safe-path: add crate to safely resolve fs paths There are always path (symlink) based attacks, so the `safe-path` crate tries to provide some mechanisms to harden path resolution related code. Fixes: #3451 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>	2022-04-21 10:01:21 +08:00
Braden Rayhorn	f385b21b05	rustjail: add tests for mount_from function Add tests for the mount_from function in rustjail mount.rs file. Fixes: #4121 Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>	2022-04-20 20:04:57 -05:00
Fabiano Fidêncio	baa67d8cc5	Merge pull request #4104 from bradenrayhorn/share-assert-result agent: move assert_result macro to test_utils file	2022-04-20 17:51:12 +02:00
Braden Rayhorn	0e7f1a5e3a	agent: move assert_result macro to test_utils file Move the assert_result macro to the shared test_utils file so that it is not duplicated in individual files. Fixes: #4093 Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>	2022-04-19 18:57:16 -05:00
Fabiano Fidêncio	604a795073	Merge pull request #4096 from garrettmahin/test-root-grpc-to-oci rustjail: Add tests for root_grpc_to_oci	2022-04-19 21:38:58 +02:00
Fabiano Fidêncio	f619c65b6a	Merge pull request #4074 from bradenrayhorn/test-mount-to-rootfs agent: add tests for mount_to_rootfs function	2022-04-19 21:36:11 +02:00
Fabiano Fidêncio	7ec42951f2	Merge pull request #4035 from bradenrayhorn/test-update-container-namespaces agent: add tests for update_container_namespaces	2022-04-19 21:36:02 +02:00
Fabiano Fidêncio	e6bc912439	Merge pull request #3940 from bradenrayhorn/test-is-signal-handled agent: add tests for is_signal_handled function	2022-04-19 21:35:48 +02:00
Archana Shinde	33e244f284	Merge pull request #4102 from likebreath/0414/clh_v23.0 Upgrade to Cloud Hypervisor v23.0	2022-04-19 06:01:04 -07:00
Fabiano Fidêncio	dbb0c67523	Merge pull request #4072 from fengwang666/dv-bug agent: best-effort removing mount point	2022-04-19 10:08:40 +02:00
Chelsea Mafrica	0af13b469d	Merge pull request #4086 from BbolroC/s390x-fix test: Fix golangci-lint error for s390x	2022-04-15 21:07:09 -07:00
Bin Liu	b19bfac7cd	Merge pull request #4042 from yibozhuang/direct-assign-fsgroup fsGroup support for direct-assigned volume	2022-04-16 10:23:15 +08:00
Bin Liu	4ec1967542	Merge pull request #4094 from fgiudici/kata-monitor_readme kata-monitor: add the README file	2022-04-16 08:27:22 +08:00
Bin Liu	362201605e	Merge pull request #4055 from fgiudici/kata-monitor_pprof kata-monitor: update the hrefs in the debug/pprof index page	2022-04-16 08:12:18 +08:00
Garrett Mahin	2256bcb6ab	rustjail: Add tests for root_grpc_to_oci Add test coverage for root_grpc_to_oci in rustjail/src/lib.rs Fixes: #4095 Signed-off-by: Garrett Mahin <garrett.mahin@gmail.com>	2022-04-15 11:09:18 -05:00
Francesco Giudici	7b2ff02647	kata-monitor: add a README file Fixes: #3704 Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-04-15 18:03:23 +02:00
Bo Chen	29e569aa92	virtcontainers: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v23.0. Note: The client code of cloud-hypervisor's (CLH) OpenAPI is automatically generated by openapi-generator [1-2]. [1] https://github.com/OpenAPITools/openapi-generator [2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-04-14 12:56:01 -07:00
Bo Chen	6012c19707	versions: Upgrade to Cloud Hypervisor v23.0 Highlights from the Cloud Hypervisor release v23.0: 1) vDPA Support; 2) Updated OS Support list (Jammy 22.04 added with EOLed versions removed); 3) AArch64 Memory Map Improvements; 4) AMX Support; 5) Bug Fixes; Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v23.0 Fixes: #4101 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-04-14 12:52:35 -07:00
Feng Wang	aabcebbf58	agent: best-effort removing mount point During container exit, the agent tries to remove all the mount point directories, which can fail if it's a readonly filesystem (e.g. device mapper). This commit ignores the removal failure and logs a warning message. Fixes: #4043 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-04-13 22:40:23 -07:00
Chelsea Mafrica	32f92e75cc	Merge pull request #4021 from fengwang666/direct-volume-bug runtime: Base64 encode the direct volume mountInfo path	2022-04-13 13:15:38 -07:00
Greg Kurz	4443bb68a4	Merge pull request #4064 from tiezhuoyu/4063/no-need-to-write-error-of-virtiofsd-to-kata-log runtime: no need to write virtiofsd error to log	2022-04-13 11:59:19 +02:00
Hyounggyu Choi	d136c9c240	test: Fix golangci-lint error for s390x This is to fix a test failure for the kata-containers-2.0-ubuntu-20.04-s390x-main-baseline jenkins job Fixes: #4088 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2022-04-13 09:20:51 +02:00
Fupan Li	66aa07649b	Merge pull request #4062 from liubin/fix/4061-add-links-for-kata-monitor kata-monitor: add some links when generating pages for browsers	2022-04-13 11:30:21 +08:00
Peng Tao	8d8c0388fa	Merge pull request #4078 from fidencio/wip/agent-avoid-panic-when-getting-empty-stats agent: Avoid agent panic when reading empty stats	2022-04-12 23:07:17 +08:00
Francesco Giudici	86977ff780	kata-monitor: update the hrefs in the debug/pprof index page kata-monitor allows getting data profiles from the kata shim instances running on the same node by acting as a proxy (e.g., http://$NODE_ADDRESS:8090/debug/pprof/?sandbox=$MYSANDBOXID). In order to proxy the requests and the responses to the right shim, kata-monitor requires the sandbox id to be passed via a query string in the url. The profiling index page proxied by kata-monitor contains the links to all the data profiles available. However, those links do not contain the sandbox id included in the request, so they are broken when accessed through kata-monitor. This happens because the profiling index page comes from the kata shim, which does not include the query string provided in the http request. Let's add the sandbox id on the fly to each href tag returned by the kata shim index page before providing the proxied page. Fixes: #4054 Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-04-12 15:53:59 +02:00
Fabiano Fidêncio	78f30c33c6	agent: Avoid agent panic when reading empty stats This was seen in an issue report, where we'd try to unwrap a None value, leading to a panic. Fixes: #4077 Related: #4043 Full backtrace: ``` "thread 'tokio-runtime-worker' panicked at 'called `Option::unwrap()` on a `None` value', rustjail/src/cgroups/fs/mod.rs:593:31" "stack backtrace:" " 0: 0x7f0390edcc3a - std::backtrace_rs::backtrace::libunwind::trace::hd5eff4de16dbdd15" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5" " 1: 0x7f0390edcc3a - std::backtrace_rs::backtrace::trace_unsynchronized::h04a775b4c6ab90d6" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5" " 2: 0x7f0390edcc3a - std::sys_common::backtrace::_print_fmt::h3253c3db9f17d826" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/sys_common/backtrace.rs:67:5" " 3: 0x7f0390edcc3a - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h02bfc712fc868664" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/sys_common/backtrace.rs:46:22" " 4: 0x7f0390a91fbc - core::fmt::write::hfd5090d1132106d8" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/fmt/mod.rs:1149:17" " 5: 0x7f0390edb804 - std::io::Write::write_fmt::h34acb699c6d6f5a9" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/io/mod.rs:1697:15" " 6: 0x7f0390edbee0 - std::sys_common::backtrace::_print::hfca761479e3d91ed" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/sys_common/backtrace.rs:49:5" " 7: 0x7f0390edbee0 - std::sys_common::backtrace::print::hf666af0b87d2b5ba" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/sys_common/backtrace.rs:36:9" " 8: 0x7f0390edbee0 - std::panicking::default_hook::{{closure}}::hb4617bd1d4a09097" " at 
/rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/panicking.rs:211:50" " 9: 0x7f0390edb2da - std::panicking::default_hook::h84f684d9eff1eede" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/panicking.rs:228:9" " 10: 0x7f0390edb2da - std::panicking::rust_panic_with_hook::h8e784f5c39f46346" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/panicking.rs:606:17" " 11: 0x7f0390f0c416 - std::panicking::begin_panic_handler::{{closure}}::hef496869aa926670" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/panicking.rs:500:13" " 12: 0x7f0390f0c3b6 - std::sys_common::backtrace::__rust_end_short_backtrace::h8e9b039b8ed3e70f" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/sys_common/backtrace.rs:139:18" " 13: 0x7f0390f0c372 - rust_begin_unwind" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/panicking.rs:498:5" " 14: 0x7f03909062c0 - core::panicking::panic_fmt::h568976b83a33ae59" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/panicking.rs:107:14" " 15: 0x7f039090641c - core::panicking::panic::he2e71cfa6548cc2c" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/panicking.rs:48:5" " 16: 0x7f0390eb443f - <rustjail::cgroups::fs::Manager as rustjail::cgroups::Manager>::get_stats::h85031fc1c59c53d9" " 17: 0x7f03909c0138 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hfa6e6cd7516f8d11" " 18: 0x7f0390d697e5 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hffbaa534cfa97d44" " 19: 0x7f039099c0b3 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hae3ab083a06d0b4b" " 20: 0x7f0390af9e1e - std::panic::catch_unwind::h1fdd25c8ebba32e1" " 21: 0x7f0390b7c4e6 - tokio::runtime::task::raw::poll::hd3ebbd0717dac808" " 22: 0x7f0390f49f3f - tokio::runtime::thread_pool::worker::Context::run_task::hfdd63cd1e0b17abf" " 23: 0x7f0390f3a599 
- tokio::runtime::task::raw::poll::h62954f6369b1d210" " 24: 0x7f0390f37863 - std::sys_common::backtrace::__rust_begin_short_backtrace::h1c58f232c078bfe9" " 25: 0x7f0390f4f3dd - core::ops::function::FnOnce::call_once{{vtable.shim}}::h2d329a84c0feed57" " 26: 0x7f0390f0e535 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h137e5243c6233a3b" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/alloc/src/boxed.rs:1694:9" " 27: 0x7f0390f0e535 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h7331c46863d912b7" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/alloc/src/boxed.rs:1694:9" " 28: 0x7f0390f0e535 - std::sys::unix::thread::Thread::new::thread_start::h1fb20b966cb927ab" " at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/sys/unix/thread.rs:106:17" ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-12 11:19:08 +02:00
Zhuoyu Tie	6e79042aa0	runtime: no need to write virtiofsd error to log The scanner reads nothing from the virtiofsd stderr pipe, because the '--syslog' param redirects stderr to syslog. So there is no need to write scanner.Text() to the kata log. Fixes: #4063 Signed-off-by: Zhuoyu Tie <tiezhuoyu@outlook.com>	2022-04-12 15:59:57 +08:00
Braden Rayhorn	9b6f24b2ee	agent: add tests for mount_to_rootfs function Add test coverage for mount_to_rootfs function in src/mount.rs. Includes minor refactoring to make function more easily testable. Fixes #4073 Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>	2022-04-11 21:42:38 -05:00
Braden Rayhorn	c3776b1792	agent: add tests for is_signal_handled function Add test coverage for is_signal_handled function in rpc.rs. Includes refactors to make the function testable and handle additional cases. Fixes #3939 Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>	2022-04-11 21:23:55 -05:00
Braden Rayhorn	9c22d9554e	agent: add tests for update_container_namespaces Add test coverage for update_container_namespaces function in src/rpc.rs. Includes minor refactor to make function easier to test. Fixes #4034 Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>	2022-04-11 18:27:30 -05:00
Fabiano Fidêncio	c108bb7a2a	Merge pull request #4071 from GabyCT/topic/updatelimidoc docs: Update link to contributions guide	2022-04-11 18:37:31 +02:00
Chelsea Mafrica	bf98c99f14	Merge pull request #4069 from bradenrayhorn/test-mount-storage agent: add tests for mount_storage	2022-04-11 09:14:05 -07:00
Yibo Zhuang	92c00c7e84	agent: fsGroup support for direct-assigned volume Adding two functions set_ownership and recursive_ownership_change to support changing group id ownership for a mounted volume. The set_ownership will be called in common_storage_handler after mount_storage performs the mount for the volume. set_ownership will be a noop if the FSGroup field in the Storage struct is not set which indicates no chown will be performed. If FSGroup field is specified, then it will perform the recursive walk of the mounted volume path to change ownership of all files and directories to the desired group id. It will also configure the SetGid bit so that files created in the directory inherit the parent directory's group. If the fsGroupChangePolicy is on root mismatch, then the group ownership change will be skipped if the root directory group id already matches the desired group id and if the SetGid bit is also set on the root directory. This is the same behavior as what Kubelet does today when performing the recursive walk to change ownership. Fixes #4018 Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>	2022-04-11 08:57:13 -07:00
Gabriela Cervantes	6e9e4e8ce5	docs: Update link to contributions guide This PR updates the url link to the contributions guide at the Limitations document. Fixes #4070 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-04-11 15:49:57 +00:00
Yibo Zhuang	532d53977e	runtime: fsGroup support for direct-assigned volume The fsGroup will be specified by the fsGroup key in the direct-assign mountinfo metadata field. This will be set when invoking the kata-runtime binary and providing the key, value pair in the metadata field. Similarly, the fsGroupChangePolicy will also be provided in the mountinfo metadata field. Adding extra fields FsGroup and FSGroupChangePolicy in the Mount construct for container mount which will be populated when creating block devices by parsing out the mountInfo.json. And in handleDeviceBlockVolume of the kata-agent client, it checks if the mount FSGroup is not nil, which indicates that fsGroup change is required in the guest, and will provide the FSGroup field in the protobuf to pass the value to the agent. Fixes #4018 Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>	2022-04-11 08:41:13 -07:00
Yibo Zhuang	6a47b82c81	proto: fsGroup support for direct-assigned volume This change adds two fields to the Storage pb: FSGroup, which is a group id that the runtime specifies to tell the agent to chown the mounted volume to the specified group id after mounting is complete in the guest; and FSGroupChangePolicy, which is a policy indicating whether to always perform the group id ownership change or only if the root directory group id does not match the desired group id. These two fields will allow CSI plugins to indicate to Kata that after the block device is mounted in the guest, group id ownership change should be performed on that volume. Fixes #4018 Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>	2022-04-11 08:41:13 -07:00
Braden Rayhorn	9d5e7ee0d4	agent: add tests for mount_storage Add test coverage for mount_storage function in src/mount.rs. Fixes: #4068 Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>	2022-04-10 21:42:20 -05:00
bin	f8cc5d1ad8	kata-monitor: add some links when generating pages for browsers Add some links to rendered webpages for a better user experience, so users can jump between pages just by clicking links in browsers. Fixes: #4061 Signed-off-by: bin <bin@hyper.sh>	2022-04-11 09:29:56 +08:00
Fabiano Fidêncio	698e45f403	Merge pull request #4057 from bradenrayhorn/test-parse-mount-flags-and-options agent: add test coverage for parse_mount_flags_and_options function	2022-04-08 14:42:18 +02:00
Fabiano Fidêncio	761e8313de	Merge pull request #3985 from bradenrayhorn/test-do-write-stream agent: add tests for do_write_stream function	2022-04-08 14:34:57 +02:00
Peng Tao	4f551e3428	Merge pull request #4048 from liubin/fix/3303-delete-virtiofsd-debug-option runtime: delete debug option in virtiofsd	2022-04-08 15:42:38 +08:00
Peng Tao	a83a16e32c	Merge pull request #4059 from garrettmahin/test-process-grpc-to-oci rustjail: add test coverage for process_grpc_to_oci function	2022-04-08 15:39:28 +08:00
Peng Tao	95e45fab38	Merge pull request #4053 from ManaSugi/fix-makefile-for-features agent: Allow the agent to be rebuilt with the change of Cargo features	2022-04-08 15:38:25 +08:00
garrettmahin	c31cd0e81a	rustjail: add test coverage for process_grpc_to_oci function Add test coverage for the process_grpc_to_oci function in src/rustjail/lib.rs Fixes #4058 Signed-off-by: Garrett Mahin <garrett.mahin@gmail.com>	2022-04-07 20:50:48 -05:00
Bin Liu	9c1c219a3f	Merge pull request #4007 from liubin/fix/3959-add-csi-rs-to-gitignore protocols: add src/csi.rs to .gitignore	2022-04-08 09:33:04 +08:00
Braden Rayhorn	1118a3d2da	agent: add test coverage for parse_mount_flags_and_options function Add test coverage for the parse_mount_flags_and_options function in src/mount.rs. Fixes #4056 Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>	2022-04-07 17:46:35 -05:00
bin	9d5b03a1b7	runtime: delete debug option in virtiofsd virtiofsd's debug is enabled whenever the hypervisor's debug is enabled, which generates too many noisy logs from virtiofsd. Decouple the virtiofsd log level from the hypervisor's. Users who want to see virtiofsd debug logs can set: virtio_fs_extra_args = ["-o", "log_level=debug"] Fixes: #3303 Signed-off-by: bin <bin@hyper.sh>	2022-04-07 19:55:22 +08:00
Manabu Sugimoto	eff7c7e0ff	agent: Allow the agent to be rebuilt with the change of Cargo features This allows the kata-agent to be rebuilt when Cargo "features" is changed. The Makefile for the agent does not need to specify the sources for prerequisites, since Cargo checks for source changes. Fixes: #4052 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-04-07 20:09:20 +09:00
Greg Kurz	d0d3787233	Merge pull request #3696 from shippomx/main kata-runtime enable hugepage support	2022-04-06 16:47:04 +02:00
Fabiano Fidêncio	465d3a5506	Merge pull request #4012 from nubificus/how-to-fc-guide docs: Add a firecracker installation guide	2022-04-06 12:59:55 +02:00
Jaylyn Ren	b975f2e8d2	Virtcontainers: Enable hot plugging vhost-user-blk device on ARM The vhost-user-blk device can be hotplugged on the PCI bridge successfully on X86, but this fails on Arm. However, hotplugging it on a Root Port as a PCIe device works well on Arm. Setting "pcie_root_port" in configuration.toml is required. Fixes: #4019 Signed-off-by: Jaylyn Ren <jaylyn.ren@arm.com>	2022-04-06 17:37:51 +08:00
bin	962d05ec86	protocols: add src/csi.rs to .gitignore After running make in src/agent, the git working area will be changed: Untracked files: (use "git add <file>..." to include in what will be committed) src/libs/protocols/src/csi.rs The generated file by `build.rs` should be ignored in git. Fixes: #3959 Signed-off-by: bin <bin@hyper.sh>	2022-04-06 09:55:38 +08:00
Fabiano Fidêncio	b39caf43f1	Merge pull request #3923 from Jakob-Naucke/no-initrd-se runtime: Allow and require no initrd for SE	2022-04-05 09:26:07 +02:00
Feng Wang	354cd3b9b6	runtime: Base64 encode the direct volume mountInfo path This is to avoid accidentally deleting multiple volumes. Fixes #4020 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-04-04 19:56:46 -07:00
Braden Rayhorn	485aeabb6b	agent: add tests for do_write_stream function Add test coverage for do_write_stream function of AgentService in src/rpc.rs. Includes minor refactoring to make function more easily testable. Fixes #3984 Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>	2022-04-04 08:21:01 -05:00
George Ntoutsos	4405b188e8	docs: Add a firecracker installation guide Add info on setting up kata with firecracker. Fixes: #3555 Signed-off-by: George Ntoutsos <gntouts@nubificus.co.uk> Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2022-04-04 14:59:41 +03:00
Archana Shinde	e62bc8e7f3	Merge pull request #3915 from Juneezee/test/t.TempDir test: use `T.TempDir` to create temporary test directory	2022-04-04 01:34:46 -07:00
Fabiano Fidêncio	8980d04e25	Merge pull request #4023 from fidencio/wip/expose-service-offload-option-to-clh clh: Expose service offload configuration	2022-04-01 14:10:33 +02:00
Fabiano Fidêncio	3f668b84f3	Merge pull request #4025 from bergwolf/2.5.0-alpha0-branch-bump # Kata Containers 2.5.0-alpha0	2022-04-01 14:00:19 +02:00
Fabiano Fidêncio	98750d792b	clh: Expose service offload configuration This configuration option is valid for all the hypervisor that are going to be used with the confidential containers effort, thus exposing the configuration option for Cloud Hypervisor as well. Fixes: #4022 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-01 11:15:55 +02:00
Greg Kurz	bbdfac4fd8	Merge pull request #4011 from gkurz/bump-fc-0-23-4 versions: Bump firecracker to v0.23.4	2022-04-01 11:01:15 +02:00
Bin Liu	416cc90b7a	Merge pull request #3972 from wfly1998/main agent: use ms as unit of cputime instead of ticks	2022-04-01 15:34:06 +08:00
Peng Tao	c9e24433d8	release: Kata Containers 2.5.0-alpha0 - agent: fix container stop error with signal SIGRTMIN+3 - doc: Improve kata-deploy README.md by changing sh blocks to bash blocks - docs: Remove kata-proxy reference - kata-monitor: fix duplicated output when printing usage - Stop getting OOM events from agent for "ttrpc closed" error - tools/packaging: Fix error path in `kata-deploy-binaries.sh -s` - kata-deploy: fix version bump from -rc to stable - release: Include all the rust vendored code into the vendored tarball - docs: Remove VPP documentation - runtime: Remove the explicit VirtioMem set and fix the comment - tools/packaging/kata-deploy: Copy install_yq.sh before starting parallel builds - docs: Remove kata-proxy references in documentation - agent: Signal the whole process group - osbuilder/qat: don't pull kata sources if exist - docs: fix markdown issues in how-to-run-docker-with-kata.md - osbuilder/qat: use centos as base OS - docs: Update vcpu handling document - Agent: fix unneeded late initialization lint - static-build,clh: Add the ability to build from a PR - Don't use a globally installed mock hook for hook tests - ci: Weekly check whether the docs url is alive - Multistrap Ubuntu & enable cross-building guest - device: using const strings for block-driver option instead of hard coding - doc: update Intel SGX use cases document - tools: update QEMU to 6.2 - action: Update link for format patch documentation - runtime: properly handle ESRCH error when signaling container - docs: Update k8s documentation - rustjail: optimization, merged several writelns into one - doc: fix kata-deploy README typo - versions: Upgrade to Cloud Hypervisor v22.1 - Add debug and self-test control options to Kata Manager - scripts: Change here document delimiters - agent: add tests for get_memory_info function - CI: Update GHA secret name - tools: release: Do not consider release candidates as stable releases - kernel: fix cve-2022-0847 - docs: Update contact link 
in runtime README - Improve error checking of hugepage allocation - CI: Create GHA to add PR sizing label - release: Revert kata-deploy changes after 2.4.0-rc0 release `2b91dcfe` docs: Remove kata-proxy reference `0d765bd0` agent: fix container stop error with signal SIGRTMIN+3 `a63bbf97` kata-monitor: fix duplicated output when printing usage `9e4ca0c4` doc: Improve kata-deploy README.md by changing sh blocks to bash blocks `a779e19b` tools/packaging: Fix error path in 'kata-deploy-binaries.sh -s' `0baebd2b` tools/packaging: Fix usage of kata-deploy-binaries.sh `3606923a` workflows,release: Ship all the rust vendored code `2eb07455` tools: Add a generate_vendor.sh script `5e1c30d4` runtime: add logs around sandbox monitor `fb8be961` runtime: stop getting OOM events when ttrpc: closed error `93d03cc0` kata-deploy: fix version bump from -rc to stable `a9314023` docs: Remove kata-proxy references in documentation `66f05c5b` runtime: Remove the explicit VirtioMem set and fix the comment `0928eb9f` agent: Kill the all the container processes of the same cgroup `c2796327` osbuilder/qat: don't pull kata sources if exist `154c8b03` tools/packaging/kata-deploy: Copy install_yq.sh in a dedicated script `1ed7da8f` packaging: Eliminate TTY_OPT and NO_TTY variables in kata-deploy `bad859d2` tools/packaging/kata-deploy/local-build: Add build to gitignore `19f372b5` runtime: Add more debug logs for container io stream copy `459f4bfe` osbuilder/qat: use centos as base OS `9a5b4770` docs: Update vcpu handling document `ecf71d6d` docs: Remove VPP documentation `c77e34de` runtime: Move mock hook source `86723b51` virtcontainers: Remove unused install/uninstall targets `0e83c95f` virtcontainers: Run mock hook from build tree rather than system bin dir `77434864` docs: fix markdown issues in how-to-run-docker-with-kata.md `32131cb8` Agent: fix unneeded late initialization lint `e65db838` virtcontainers: Remove VC_BIN_DIR `c20ad283` virtcontainers: Remove unused Makefile defines 
`c776bdf4` virtcontainers: Remove unused parameter from go-test.sh `ebec6903` static-build,clh: Add the ability to build from a PR `24b29310` doc: update Intel SGX use cases document `18d4d7fb` tools: update QEMU to 6.2 `62351637` action: Update link for format patch documentation `aa5ae6b1` runtime: Properly handle ESRCH error when signaling container `efa19c41` device: use const strings for block-driver option instead of hard coding `dacf6e39` doc: fix filename typo `92ce5e2d` rustjail: optimization, merged several writelns into one `7a18e32f` versions: Upgrade to Cloud Hypervisor v22.1 `5c434270` docs: Update k8s documentation `5d6d39be` scripts: Change here document delimiters `be12baf3` manager: Change here documents to use standard delimiter `9576a7da` manager: Add options to change self test behaviour `d4d65bed` manager: Add option to enable component debug `019da91d` manager: Whitespace fix `d234cb76` manager: Create containerd link `c088a3f3` agent: add tests for get_memory_info function `4b1e2f52` CI: Update GHA secret name `ffdf961a` docs: Update contact link in runtime README `5ec7592d` kernel: fix cve-2022-0847 `6a850899` CI: Create GHA to add PR sizing label `2b41d275` release: Revert kata-deploy changes after 2.4.0-rc0 release `4adf93ef` tools: release: Do not consider release candidates as stable releases `72f7e9e3` osbuilder: Multistrap Ubuntu `df511bf1` packaging: Enable cross-building agent `0a313eda` osbuilder: Fix use of LIBC in rootfs.sh `2c86b956` osbuilder: Simplify Rust installation `0072cc2b` osbuilder: Remove musl installations `5c3e5536` osbuilder: apk add --no-cache `42e35505` agent: Verify that we allocated as many hugepages as we need `608e003a` agent: Don't attempt to create directories for hugepage configuration `168fadf1` ci: Weekly check whether the docs url is alive Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-04-01 06:23:21 +00:00
Bin Liu	5d0adb2164	Merge pull request #3995 from wxx213/main agent: fix container stop error with signal SIGRTMIN+3	2022-04-01 11:29:14 +08:00
David Esparza	a06e51dae0	Merge pull request #3944 from dborquez/improve-readme-format doc: Improve kata-deploy README.md by changing sh blocks to bash blocks	2022-03-31 14:48:53 -06:00
GabyCT	f026e78716	Merge pull request #4014 from GabyCT/topic/acrndoc docs: Remove kata-proxy reference	2022-03-31 12:01:13 -06:00
Gabriela Cervantes	2b91dcfeef	docs: Remove kata-proxy reference This PR removes the kata-proxy reference from this document as it is no longer a component in kata 2.0 Fixes #4013 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-03-31 16:30:03 +00:00
Greg Kurz	0d5f80b803	versions: Bump firecracker to v0.23.4 This release changes Docker images repository from DockerHub to Amazon ECR. This resolves the `You have reached your pull rate limit` error when building the firecracker tarball. Fixes #4001 Signed-off-by: Greg Kurz <groug@kaod.org>	2022-03-31 13:25:19 +02:00
Wang Xingxing	0d765bd082	agent: fix container stop error with signal SIGRTMIN+3 The nix::sys::signal::Signal package api cannot deal with SIGRTMIN+3, directly use libc function to send the signal. Fixes: #3990 Signed-off-by: Wang Xingxing <stellarwxx@163.com>	2022-03-31 10:49:45 +08:00
Eng Zer Jun	59c7165ee1	test: use `T.TempDir` to create temporary test directory The directory created by `T.TempDir` is automatically removed when the test and all its subtests complete. This commit also updates the unit test advice to use `T.TempDir` to create temporary directory in tests. Fixes: #3924 Reference: https://pkg.go.dev/testing#T.TempDir Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>	2022-03-31 09:31:36 +08:00
snir911	18dc578134	Merge pull request #3999 from fgiudici/kata-monitor_fix_help kata-monitor: fix duplicated output when printing usage	2022-03-30 18:56:59 +03:00
Francesco Giudici	a63bbf9793	kata-monitor: fix duplicated output when printing usage (default: "/run/containerd/containerd.sock") is duplicated when printing kata-monitor usage: [root@kubernetes ~]# kata-monitor --help Usage of kata-monitor: -listen-address string The address to listen on for HTTP requests. (default ":8090") -log-level string Log level of logrus(trace/debug/info/warn/error/fatal/panic). (default "info") -runtime-endpoint string Endpoint of CRI container runtime service. (default: "/run/containerd/containerd.sock") (default "/run/containerd/containerd.sock") the golang flag package takes care of adding the defaults when printing usage. Remove the explicit print of the value so that it would not be printed on screen twice. Fixes: #3998 Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-03-30 11:58:53 +02:00
David Esparza	9e4ca0c4f8	doc: Improve kata-deploy README.md by changing sh blocks to bash blocks The idea is to pass this README file to kata-doc-to-script.sh script and then execute the result. Added comments with a file name on top of each YAML snippet. This helps in assigning a file name when we cat the YAML to a file. Fixes: #3943 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2022-03-30 05:30:41 -04:00
Peng Tao	6837ab7213	Merge pull request #3989 from liubin/fix/3815-redue-oom-logs Stop getting OOM events from agent for "ttrpc closed" error	2022-03-30 17:02:05 +08:00
snir911	f1a88371c8	Merge pull request #3991 from gkurz/fix-kata-deploy-binaries-sh tools/packaging: Fix error path in `kata-deploy-binaries.sh -s`	2022-03-30 11:51:43 +03:00
Hui Zhu	e1a39bde8b	Merge pull request #3987 from bergwolf/kata-deploy kata-deploy: fix version bump from -rc to stable	2022-03-30 16:13:27 +08:00
Fabiano Fidêncio	e1875d1879	Merge pull request #3974 from fidencio/wip/release-include-all-rust-vendored-code-to-the-vendored-tarball release: Include all the rust vendored code into the vendored tarball	2022-03-29 23:25:17 +02:00
Greg Kurz	a779e19bee	tools/packaging: Fix error path in 'kata-deploy-binaries.sh -s' `make kata-tarball` relies on `kata-deploy-binaries.sh -s` which silently ignores errors, and you may end up with an incomplete tarball without noticing it because `make`'s exit status is 0. `kata-deploy-binaries.sh` does set the `errexit` option and all the code in the script seems to assume that since it doesn't do error checking. Unfortunately, bash automatically disables `errexit` when calling a function from a conditional pipeline, like done in the `-s` case: if [ "${silent}" == true ]; then if ! handle_build "${t}" &>"$log_file"; then ^^^^^^ this disables `errexit` and `handle_build` ends with a `tar tvf` that always succeeds. Adding error checking all over the place isn't really an option as it would seriously obfuscate the code. Drop the conditional pipeline instead and print the final error message from a `trap` handler on the special ERR signal. This requires the `errtrace` option as `trap`s aren't propagated to functions by default. Since all outputs of `handle_build` are redirected to the build log file, some file descriptor duplication magic is needed for the handler to be able to write to the original stdout and stderr. Fixes #3757 Signed-off-by: Greg Kurz <groug@kaod.org>	2022-03-29 19:00:46 +02:00
Greg Kurz	0baebd2b37	tools/packaging: Fix usage of kata-deploy-binaries.sh Add missing documentation for -s . Signed-off-by: Greg Kurz <groug@kaod.org>	2022-03-29 19:00:46 +02:00
GabyCT	2dc092fe60	Merge pull request #3947 from GabyCT/topic/removevpp docs: Remove VPP documentation	2022-03-29 10:45:21 -06:00
Fabiano Fidêncio	3606923ac8	workflows,release: Ship all the rust vendored code Instead of only vendoring the code needed by the agent, let's ensure we vendor all the needed rust code, and let's do it using the newly introduced generate_vendor.sh script. Fixes: #3973 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-29 12:37:00 +02:00
Fabiano Fidêncio	2eb07455d0	tools: Add a generate_vendor.sh script This script is responsible for generating a tarball with all the rust vendored code that is needed for fully building kata-containers on a disconnected environment. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-29 12:36:36 +02:00
bin	5e1c30d484	runtime: add logs around sandbox monitor For debugging purposes, add some logs. Fixes: #3815 Signed-off-by: bin <bin@hyper.sh>	2022-03-29 16:59:12 +08:00
bin	fb8be96194	runtime: stop getting OOM events when ttrpc: closed error getOOMEvents is a long-waiting call, it will retry when failed. For cases of agent shutdown, the retry should stop. When it has not yet been detected that the agent has died, we can also check whether the error is "ttrpc: closed". Fixes: #3815 Signed-off-by: bin <bin@hyper.sh>	2022-03-29 16:39:01 +08:00
Peng Tao	93d03cc064	kata-deploy: fix version bump from -rc to stable In such case, we should bump from "latest" tag rather than from current_version. Fixes: #3986 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-03-29 03:41:12 +00:00
Bin Liu	9495316145	Merge pull request #3962 from yaoyinnan/fix/3750-VirtioMem runtime: Remove the explicit VirtioMem set and fix the comment	2022-03-29 10:20:05 +08:00
David Gibson	025fa60268	Merge pull request #3969 from gkurz/kata-deploy-copy-yq-installer tools/packaging/kata-deploy: Copy install_yq.sh before starting parallel builds	2022-03-29 12:56:09 +11:00
Julio Montes	c9178b0750	Merge pull request #3981 from GabyCT/topic/removekata-proxy docs: Remove kata-proxy references in documentation	2022-03-28 14:52:41 -06:00
Gabriela Cervantes	a931402375	docs: Remove kata-proxy references in documentation This PR removes the kata-proxy references in VSocks documentation, as this is not a component in kata 2.0 and all the examples that were used belonged to kata 1.x. Fixes #3980 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-03-28 16:36:22 +00:00
yaoyinnan	66f05c5bcb	runtime: Remove the explicit VirtioMem set and fix the comment Modify the 2Mib in the comment to 4Mib. VirtioMem is set by configuration file or annotation. And setupVirtioMem is called only when VirtioMem is true. Fixes: #3750 Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>	2022-03-28 21:21:38 +08:00
Yu Li	800e4a9cfb	agent: use ms as unit of cputime instead of ticks For the library `procfs`, the unit of values in `CpuTime` is ticks, and we do not know how many ticks per second from metrics because the `tps` in `CpuTime` is private. But there are some implements in `CpuTime` for getting these values, e.g., `user_ms()` for `user`, and `nice_ms()` for `nice`. With these values, accurate time can be obtained. Fixes: #3979 Acked-by: zhaojizhuang <571130360@qq.com> Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>	2022-03-28 19:30:09 +08:00
Peng Tao	e723dd5bba	Merge pull request #3955 from fengwang666/container-leak agent: Signal the whole process group	2022-03-28 17:11:34 +08:00
Feng Wang	0928eb9f4e	agent: Kill the all the container processes of the same cgroup Otherwise the container process might leak and cause an unclean exit Fixes: #3913 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-27 10:06:58 -07:00
GabyCT	a07956a369	Merge pull request #3966 from devimc/2022-03-22/fixOsbuilderQAT osbuilder/qat: don't pull kata sources if exist	2022-03-25 15:12:03 -06:00
Jakob Naucke	ff17c756d2	runtime: Allow and require no initrd for SE Previously, it was not permitted to have neither an initrd nor an image. However, this is the exact config to use for Secure Execution, where the initrd is part of the image to be specified as `-kernel`. Require the configuration of no initrd for Secure Execution. Also - remove redundant code for image/initrd checking -- no need to check in `newQemuHypervisorConfig` (calling) when it is also checked in `getInitrdAndImage` (called) - use `QemuCCWVirtio` constant when possible Fixes: #3922 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-03-25 18:36:12 +01:00
Julio Montes	c27963276b	osbuilder/qat: don't pull kata sources if exist don't pull kata sources if they already exist under GOPATH fixes #3965 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-03-25 09:09:52 -06:00
Greg Kurz	154c8b03d3	tools/packaging/kata-deploy: Copy install_yq.sh in a dedicated script 'make kata-tarball' sometimes fails early with: cp: cannot create regular file '[...]/tools/packaging/kata-deploy/local-build/dockerbuild/install_yq.sh': File exists This happens because all assets are built in parallel using the same `kata-deploy-binaries-in-docker.sh` script, and thus all try to copy the `install_yq.sh` script to the same location with the `cp` command. This is a well known race condition that cannot be avoided without serialization of `cp` invocations. Move the copying of `install_yq.sh` to a separate script and ensure it is called before parallel builds. Make the presence of the copy a prerequisite for each sub-build so that they still can be triggered individually. Update the GH release workflow to also call this script before calling `kata-deploy-binaries-in-docker.sh`. Fixes #3756 Signed-off-by: Greg Kurz <groug@kaod.org>	2022-03-25 15:59:24 +01:00
David Gibson	1ed7da8fc7	packaging: Eliminate TTY_OPT and NO_TTY variables in kata-deploy NO_TTY configured whether to add the -t option to docker run. It makes no sense for the caller to configure this, since whether you need it depends on the commands you're running. Since the point here is to run non-interactive build scripts, we don't need -t, or -i either. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Greg Kurz <groug@kaod.org>	2022-03-25 15:52:02 +01:00
David Gibson	bad859d2f8	tools/packaging/kata-deploy/local-build: Add build to gitignore This directory consists entirely of files built during a make kata-tarball, so it should not be committed to the tree. A symbolic link to this directory might be created during 'make tarball', ignore it as well. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [greg: - rearranged the subject to make the subsystem checker happy - also ignore the symbolic link created by `kata-deploy-binaries-in-docker.sh`] Signed-off-by: Greg Kurz <groug@kaod.org>	2022-03-25 15:52:02 +01:00
James O. D. Hunt	486322a0f1	Merge pull request #3930 from liubin/fix/3929-doc-for-dind docs: fix markdown issues in how-to-run-docker-with-kata.md	2022-03-25 10:49:19 +00:00
Feng Wang	19f372b5f5	runtime: Add more debug logs for container io stream copy This can help debugging container lifecycle issues Fixes: #3913 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-24 21:35:16 -07:00
GabyCT	4776e346a0	Merge pull request #3952 from devimc/2022-03-23/fixQATCI osbuilder/qat: use centos as base OS	2022-03-24 10:10:52 -06:00
Julio Montes	459f4bfedb	osbuilder/qat: use centos as base OS move away from ubuntu, since now it's easier to build using CentOS as base OS fixes #3936 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-03-24 08:18:29 -06:00
Peng Tao	853dd98b7b	Merge pull request #3951 from GabyCT/topic/vcpusdoc docs: Update vcpu handling document	2022-03-24 16:02:59 +08:00
Peng Tao	098374b179	Merge pull request #3934 from dcmiddle/fix-agent-check Agent: fix unneeded late initialization lint	2022-03-24 16:02:11 +08:00
GabyCT	d9cd8cde2b	Merge pull request #3909 from fidencio/wip/clh-allow-testing-a-specific-pr static-build,clh: Add the ability to build from a PR	2022-03-23 15:24:34 -06:00
Gabriela Cervantes	9a5b477062	docs: Update vcpu handling document This PR updates the vcpu handling document by removing docker information which is no longer being used in kata 2.x and leaving only k8s information. Fixes #3950 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-03-23 17:58:49 +00:00
Fabiano Fidêncio	7a8b96b857	Merge pull request #3942 from dgibson/kata1420 Don't use a globally installed mock hook for hook tests	2022-03-23 17:57:16 +01:00
Gabriela Cervantes	ecf71d6dd6	docs: Remove VPP documentation This PR is removing VPP documentation as it is no longer valid with kata 2.x, all the instructions were used for kata 1.x Fixes #3946 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-03-23 15:50:37 +00:00
David Gibson	c77e34de33	runtime: Move mock hook source src/runtime/virtcontainers/hook/mock contains a simple example hook in Go. The only thing this is used for is for some tests in src/runtime/pkg/katautils/hook_test.go. It doesn't really have anything to do with the rest of the virtcontainers package. So, move it next to the test code that uses it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-03-23 19:37:35 +11:00
David Gibson	86723b51ae	virtcontainers: Remove unused install/uninstall targets We've now removed the need to install the mock hook binary for unit tests. However, it turns out that managing that was the only thing that the install and uninstall targets in the virtcontainers Makefile handled. So, remove them. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-03-23 19:37:18 +11:00
David Gibson	0e83c95fac	virtcontainers: Run mock hook from build tree rather than system bin dir Running unit tests should generally have minimal dependencies on things outside the build tree. It definitely shouldn't modify system wide things outside the build tree. Currently the runtime "make test" target does so, though. Several of the tests in src/runtime/pkg/katautils/hook_test.go require a sample hook binary. They expect this hook in /usr/bin/virtcontainers/bin/test/hook, so the makefile, as root, installs the test binary to that location. Go tests automatically run within the package's directory though, so there's no need to use a system wide path. We can use a relative path to the binary build within the tree just as easily. fixes #3941 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-03-23 19:34:50 +11:00
bin	7743486413	docs: fix markdown issues in how-to-run-docker-with-kata.md Some links in how-to-run-docker-with-kata.md are not correct, and there are some typos. Fixes: #3929 Signed-off-by: bin <bin@hyper.sh>	2022-03-23 08:15:02 +08:00
Dan Middleton	32131cb8ba	Agent: fix unneeded late initialization lint Clippy v1.58 added needless_late_init Fixes #3933 Signed-off-by: Dan Middleton <dan.middleton@intel.com>	2022-03-22 10:17:24 -05:00
David Gibson	e65db838ff	virtcontainers: Remove VC_BIN_DIR The VC_BIN_DIR variable in the virtcontainers Makefile is almost unused. It's used to generate TEST_BIN_DIR, and it's created in the install target. However, we also create TEST_BIN_DIR, which is a subdirectory of VC_BIN_DIR with mkdir -p, so it will necessarily create VC_BIN_DIR along the way. So we can drop the unnecessary mkdir and expand the definition of VC_BIN_DIR in the definition of TEST_BIN_DIR. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-03-22 16:53:59 +11:00
David Gibson	c20ad2836c	virtcontainers: Remove unused Makefile defines The INSTALL_EXEC and UNINSTALL_EXEC definitions from the virtcontainers Makefile (unlike those from the runtime Makefile in the parent directory) are entirely unused. Remove them. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-03-22 16:40:57 +11:00
David Gibson	c776bdf4a8	virtcontainers: Remove unused parameter from go-test.sh The check-go-test target passes the path to the mock hook test binary to go-test.sh when it invokes it. But go-test.sh just calls run_go_test from ci/lib.sh, which invokes a script from the tests repo without any parameters. That is, this parameter is ignored anyway, so remove it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-03-22 16:39:22 +11:00
Fabiano Fidêncio	aa6886f1ed	Merge pull request #2482 from Bevisy/main-815 ci: Weekly check whether the docs url is alive	2022-03-21 17:15:40 +01:00
James O. D. Hunt	3edf25b6c9	Merge pull request #3682 from Jakob-Naucke/cross Multistrap Ubuntu & enable cross-building guest	2022-03-21 11:11:47 +00:00
James O. D. Hunt	f8fb0d3bb6	Merge pull request #3322 from Kvasscn/kata_dev_block_driver_option device: using const strings for block-driver option instead of hard coding	2022-03-21 10:56:25 +00:00
Fabiano Fidêncio	ebec6903b8	static-build,clh: Add the ability to build from a PR Right now it doesn't do much for us, as we're always building from a specific version. However, this opens the possibility for us to add a CI, similar to the one we have for CRI-O, for testing against each cloud-hypervisor PR, on the cloud-hypervisor branch. Fixes: #3908 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-20 11:24:40 +01:00
GabyCT	f194c8da1b	Merge pull request #3912 from devimc/2022-03-17/updateSGXDoc doc: update Intel SGX use cases document	2022-03-18 14:08:53 -06:00
Eduardo Lima (Etrunko)	1cad3a4696	agent/random: Ensure data.len > 0 Also adds a test to cover this scenario Signed-off-by: Eduardo Lima (Etrunko) <etrunko@redhat.com>	2022-03-18 15:13:51 -03:00
Eduardo Lima (Etrunko)	33c953ace4	agent: Add test_ressed_rng_not_root Same as previous test, but does not skip if it is not running as root. Signed-off-by: Eduardo Lima (Etrunko) <etrunko@redhat.com>	2022-03-18 15:13:51 -03:00
Julio Montes	24b29310b2	doc: update Intel SGX use cases document The installation section is no longer needed because the latest default kata kernel supports Intel SGX. Include QEMU in the list of supported hypervisors. fixes #3911 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-03-18 09:19:09 -06:00
Wainer dos Santos Moschetta	39a35b693a	agent: Add test to random::reseed_rng() Introduced a unit test for the random::reseed_rng() function. Fixes #291 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2022-03-18 10:23:22 -03:00
Eduardo Lima (Etrunko)	d8f39fb269	agent/random: Rename RNDRESEEDRNG to RNDRESEEDCRNG Make this definition match the one in kernel: `5bfc75d92e/include/uapi/linux/random.h (L38-L39)` Signed-off-by: Eduardo Lima (Etrunko) <etrunko@redhat.com>	2022-03-18 10:23:22 -03:00
Julio Montes	bc3f63bf0a	Merge pull request #3903 from devimc/2022-03-15/bumpQEMU6.2 tools: update QEMU to 6.2	2022-03-17 10:28:23 -06:00
Julio Montes	18d4d7fb1d	tools: update QEMU to 6.2 bring Intel SGX support Changes that may impact Kata Containers Arm: The 'virt' machine now supports an emulated ITS The 'virt' machine now supports more than 123 CPUs in TCG emulation mode The pl031 real-time clock device now supports sending RTC_CHANGE QMP events PowerPC: Improved POWER10 support for the 'powernv' machine Initial support for POWER10 DD2.0 CPU added Added support for FORM2 PAPR NUMA descriptions in the "pseries" machine type s390x: Improved storage key emulation (e.g. fixed address handling, lazy storage key enablement for TCG, ...) New gen16 CPU features are now enabled automatically in the latest machine type KVM: Support for SGX in the virtual machine, using the /dev/sgx_vepc device on the host and the "memory-backend-epc" backend in QEMU. New "hv-apicv" CPU property (aliased to "hv-avic") sets the HV_DEPRECATING_AEOI_RECOMMENDED bit in CPUID[0x40000004].EAX. virtio-mem: QEMU now fully supports guest memory dumps with virtio-mem. QEMU now cleanly supports precopy migration, postcopy migration and background snapshots with virtio-mem. fixes #3902 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-03-16 10:35:39 -06:00
Fabiano Fidêncio	55e1304fef	Merge pull request #3901 from GabyCT/topic/fixcommitm action: Update link for format patch documentation	2022-03-15 20:13:15 +01:00
Gabriela Cervantes	62351637da	action: Update link for format patch documentation This PR updates the link for the format patch documentation for the commit message check. Fixes #3900 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-03-15 16:11:43 +00:00
Miao Xia	a2f5c1768e	runtime/virtcontainers: Pass the hugepages resources to agent The hugepages resources claimed by containers should be limited by cgroup in the guest OS. Fixes: #3695 Signed-off-by: Miao Xia <xia.miao1@zte.com.cn>	2022-03-15 18:46:08 +08:00
Feng Wang	84aebac327	Merge pull request #3875 from fengwang666/fix-shim-leak runtime: properly handle ESRCH error when signaling container	2022-03-14 12:47:35 -07:00
Feng Wang	aa5ae6b17c	runtime: Properly handle ESRCH error when signaling container Currently kata shim v2 doesn't translate the ESRCH error, causing the container to fail to stop and the shim to leak. Fixes: #3874 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-14 11:03:05 -07:00
GabyCT	bbcdfaa494	Merge pull request #3868 from cmaf/update-k8s-docs-1 docs: Update k8s documentation	2022-03-14 09:32:58 -06:00
James O. D. Hunt	afa090ad7b	Merge pull request #3867 from Shensd/main rustjail: optimization, merged several writelns into one	2022-03-14 10:05:48 +00:00
Peng Tao	2edb33ee4a	Merge pull request #3880 from garrettmahin/fix-readme-typo doc: fix kata-deploy README typo	2022-03-14 16:20:01 +08:00
zhanghj	efa19c41eb	device: use const strings for block-driver option instead of hard coding Currently, the block driver option is specified by hard coding; it is better to use const string variables instead of hard coded strings. Another modification is to remove duplicate consts for virtio driver in manager.go. Fixes: #3321 Signed-off-by: Jason Zhang <zhanghj.lc@inspur.com>	2022-03-14 09:20:43 +08:00
Garrett Mahin	dacf6e3955	doc: fix filename typo Corrects a filename typo in cleanup cluster part of kata-deploy README.md Fixes: #3869 Signed-off-by: Garrett Mahin <garrett.mahin@gmail.com>	2022-03-13 17:39:08 -05:00
Fabiano Fidêncio	358081c4ae	Merge pull request #3873 from likebreath/0311/clh_v22.1 versions: Upgrade to Cloud Hypervisor v22.1	2022-03-12 10:27:53 +01:00
Jack Hance	92ce5e2dc4	rustjail: optimization, merged several writelns into one Optimized several writelns by merging them into one in src/utils.rs Fixes: #3772 Signed-off-by: Jack Hance <jack.hance@ndsu.edu>	2022-03-11 13:18:58 -06:00
Bo Chen	7a18e32fa7	versions: Upgrade to Cloud Hypervisor v22.1 This is a bug fix release. The following issues have been addressed: 1) VFIO ioctl reordering to fix MSI on AMD platforms; 2) Fix virtio-net control queue. Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v22.1 Fixes: #3872 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-03-11 08:27:08 -08:00
James O. D. Hunt	095bc2d50a	Merge pull request #3858 from jodh-intel/kata-manager-add-more-options Add debug and self-test control options to Kata Manager	2022-03-11 13:42:00 +00:00
Chelsea Mafrica	5c434270d1	docs: Update k8s documentation Update documentation with missing step to untaint node to enable scheduling and update the example to run a pod using the kata runtime class instead of untrusted workloads, which applies to versions of CRI-O prior to v1.12. Fixes #3863 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2022-03-10 21:11:14 -08:00
Fabiano Fidêncio	036a76e79c	Merge pull request #3865 from jodh-intel/scripts-fix-here-docs scripts: Change here document delimiters	2022-03-10 20:09:38 +01:00
James O. D. Hunt	5d6d39be48	scripts: Change here document delimiters Fix the outstanding scripts using non standard shell here document delimiters. This should have been caught by https://github.com/kata-containers/tests/pull/3937, but there is a bug in the checker which is fixed on https://github.com/kata-containers/tests/pull/4569. Fixes: #3864. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-10 09:23:37 +00:00
James O. D. Hunt	be12baf3cf	manager: Change here documents to use standard delimiter All scripts should use `EOF` as the shell here document delimiter as this is checked by the static checker. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-10 09:19:29 +00:00
James O. D. Hunt	9576a7da5d	manager: Add options to change self test behaviour Added new `kata-manager` options to control the self-test behaviour. By default, after installation the manager will run a test to ensure a Kata Containers container can be created. New options allow: - The self test to be disabled. - Only the self test to be run (no installation). These features allow changes to be made to the installed system before the self test is run. Fixes: #3851. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-10 09:19:29 +00:00
James O. D. Hunt	d4d65bed38	manager: Add option to enable component debug Added a `-d` option to `kata-manager` to enable Kata Containers and containerd debug. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-10 09:19:29 +00:00
James O. D. Hunt	019da91d79	manager: Whitespace fix Remove additional blank line in the `kata-manager`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-10 09:19:29 +00:00
James O. D. Hunt	d234cb76b5	manager: Create containerd link Make the `kata-manager` create a `containerd` link to ensure the downloaded containerd systemd service file can find the daemon when using the GitHub packaged version of containerd. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-10 09:19:29 +00:00
Fabiano Fidêncio	5a7fd943c1	Merge pull request #3838 from bradenrayhorn/get-memory-info-tests agent: add tests for get_memory_info function	2022-03-09 23:21:20 +01:00
Braden Rayhorn	c088a3f3ad	agent: add tests for get_memory_info function Add test coverage for get_memory_info function in src/rpc.rs. Includes some minor refactoring of the function. Fixes #3837 Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>	2022-03-09 11:34:35 -06:00
Fabiano Fidêncio	443c04ec6c	Merge pull request #3857 from jodh-intel/ci-update-gha-token-name CI: Update GHA secret name	2022-03-09 11:53:00 +01:00
Eric Ernst	e042593208	Merge pull request #3848 from fidencio/wip/release-dont-consider-rc-as-stable tools: release: Do not consider release candidates as stable releases	2022-03-08 15:09:04 -08:00
Julio Montes	200494cde4	Merge pull request #3853 from devimc/2022-03-08/fix-cve-2022-0847 kernel: fix cve-2022-0847	2022-03-08 13:26:54 -06:00
GabyCT	5620e23c0f	Merge pull request #3855 from GabyCT/topic/updoc docs: Update contact link in runtime README	2022-03-08 11:44:54 -06:00
James O. D. Hunt	4b1e2f527e	CI: Update GHA secret name Change the secret used by the GitHub Action that adds the PR size label to one with the correct set of privileges. Fixes: #3856. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-08 17:06:16 +00:00
Gabriela Cervantes	ffdf961ae9	docs: Update contact link in runtime README This PR updates the contact link in the runtime README document. Fixes #3854 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-03-08 16:27:34 +00:00
Julio Montes	293e61dc6e	Merge pull request #3766 from dgibson/hugepages Improve error checking of hugepage allocation	2022-03-08 10:21:57 -06:00
Julio Montes	5ec7592dfa	kernel: fix cve-2022-0847 bump guest kernel version to fix cve-2022-0847 "Dirty Pipe" fixes #3852 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-03-08 09:49:15 -06:00
James O. D. Hunt	6c52168dd8	Merge pull request #3842 from jodh-intel/ci-gha-add-pr-size-label CI: Create GHA to add PR sizing label	2022-03-08 15:14:10 +00:00
James O. D. Hunt	6a850899c9	CI: Create GHA to add PR sizing label Created a new GitHub Action workflow file that adds a sizing label to each PR. Fixes: #3841. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-08 14:11:17 +00:00
Peng Tao	99f794ca4d	Merge pull request #3846 from egernst/revert-kata-deploy-changes-after-2.4.0-rc0-release release: Revert kata-deploy changes after 2.4.0-rc0 release	2022-03-08 13:52:44 +08:00
Eric Ernst	2b41d275a6	release: Revert kata-deploy changes after 2.4.0-rc0 release As 2.4.0-rc0 has been released, let's switch the kata-deploy / kata-cleanup tags back to "latest", and re-add the kata-deploy-stable and the kata-cleanup-stable files. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-03-07 14:14:56 -08:00
Eric Ernst	8d545f7438	Merge pull request #3845 from egernst/2.4.0-rc0-branch-bump # Kata Containers 2.4.0-rc0	2022-03-07 13:58:47 -08:00
Eric Ernst	a4dcaf3cf4	release: Kata Containers 2.4.0-rc0 - Enhancement: fix comments/logs and delete not used function - storage: make k8s emptyDir volume creation location configurable - Implement direct-assigned volume - Bump containerd to 1.6.1 - experimentally enable vcpu hotplug and virtio-mem on arm64 in kernel part - versions: Upgrade to Cloud Hypervisor v22.0 - katatestutils: remove distro constraints - Minor fixes for the `disable_block_device_use` comments - clh: stop virtofsd if clh fails to boot up the vm - clh: tdx: Don't use sharedFS with Confidential Guests - runtime: Build golang components with extra security options - snap: Use git clone depth 1 for QEMU and dependencies - snap: Don't build cloud-hypevisor on ppc64le - build: always reset ARCH after getting it - virtcontainers: remove temp dir created for vsock in test code - docs: Add unit testing presentation - virtcontainers: Use available s390x hugepages - Update QEMU >= 6.1.0 in configure-hypervisor.sh - Fix monitor listen address - snap: clh: Re-use kata-deploy script here - osbuilder: Add CentOS Stream rootfs - runtime: Gofmt fixes - Update `confidential_guest` comments - cleanup runtime pkgs for Darwin build, add basic Darwin build/unit test - docs: Update Readme document - runtime: use Cmd.StdoutPipe instead of self-created pipe - docs: Developer-Guide build a custom Kata agent with musl - kata-agent: Fix mismatching error of cgroup and mountinfo. 
- runtime, config: make selinux configurable - Fix unbound variable / typo on error mesage - clh: Add TDX support - virtcontainers: Do not add a virtio-rng-ccw device - kata-monitor: fix collecting metrics for sandboxes not started through CRI - runtime: fix package declaration for ppc64le - Make the hypervisor framework not Linux specific - kata-deploy: Simplify Dockerfile and support s390x - Support nerdctl OCI hooks - shim: log events for CRI-O - docs: Update contributing link - kata-deploy: Use (kata with) qemu as the default shim-v2 binary - kata-monitor: simplify sandbox cache management and attach kubernetes POD metadata to metrics - nydus: add lazyload support for kata with clh - kernel: remove SYS_SUPPORTS_HUGETLBFS from powerpc fragments - packaging: Use `patch` for applying patches - virtcontainers: Remove duplicated assert messages in utils test code - versions: add nydus-snapshotter - docs: Update limitations document - packaging: support qemu-tdx - Kata manager fix install - versions: Linux 5.15.x - trace-forwarder/agent-ctl: run cargo fmt/clippy in make check - docs: Improve top-level README - runtime: use github.com/mdlayher/vsock@v1.1.0 - tools: Build cloud-hypervisor with "--features tdx" - virtiofsd: Use "-o announce_submounts" - feature: hugepages support - tools: clh: Allow to set when to build from sources and the build flags passed down to cargo - docs: Remove docker run and shared memory from limitations - versions: Udpate Cloud Hypervisor to 55479a64d237 - kernel: add missing config fragment for TDx - runtime: The index variable is initialized multiple times in for - scripts: fix a typo while to check build_type - versions: bump CRI-O to its 1.23 release - feature(nydusd): add nydusd support to introduce lazyload ability - docs: Fix relative links in Markdown - kernel: support TDx - device: Actually update PCIDEVICE_ environment variables for the guest - docs: Update link to EFK stack docs - runtime: support QEMU SGX - snap: update qemu 
version to 6.1.0 for arm - Release process related fixes - openshift-ci: switch to CentOS Stream - virtcontainers: Split the rootless package into OS specific parts - runtime: suppport split firmware - kata-deploy: for testing, make sure we use the PR branch - docs: Remove Zun documentation with kata containers - agent: Fix execute_hook() args error - workflows: stop checking revert commit `84dff440` release: Adapt kata-deploy for 2.4.0-rc0 `b257e0e5` rustjail: delete function signal in BaseContainer `d647b28b` agent: delete meaningless FIXME comment `1b34494b` runtime: fix invalid comments for pkg/resourcecontrol `afc567a9` storage: make k8s emptyDir creation configurable `e76519af` runtime: small refactor to improve readability `7e5f11a5` vendor: Update containerd to 1.6.1 `42771fa7` runtime: don't set socket and thread for arm/virt `8828ef41` kernel: add arm experimental kernel build support `8a9007fe` config: remove 2 config as they are removed in 5.15 `1b6f7401` kernel: add arm experimental patches to support vcpu hotplug and virtio-mem `f905161b` runtime: mount direct-assigned block device fs only once `27fb4902` agent: add get volume stats handler in agent `ea51ef1c` runtime: forward the stat and resize requests from shimv2 to kata agent `c39281ad` runtime: update container creation to work with direct assigned volumes `4e00c237` agent: add grpc interface for stat and resize operations `e9b5a255` runtime: add stat and resize APIs to containerd-shim-v2 `6e0090ab` runtime: persist direct volume mount info `fa326b4e` runtime: augment kata-runtime CLI to support direct-assigned volume `b8844fb8` versions: Upgrade to Cloud Hypervisor v22.0 `af804734` clh: stop virtofsd if clh fails to boot up the vm `97951a2d` clh: Don't use SharedFS with Confidential Guests `c30b3a9f` clh: Adding a volume is not supported without SharedFS `f889f1f9` clh: introduce supportsSharedFS() `54d27ed7` clh: introduce loadVirtiofsDaemon() `ae2221ea` clh: introduce stopVirtiofsDaemon() 
`e8bc26f9` clh: introduce setupVirtiofsDaemon() `413b3b47` clh: introduce createVirtiofsDaemon() `55cd0c89` runtime: Build golang components with extra security options `76e4f6a2` Revert "hypervisors: Confidential Guests do not support Device hotplug" `fa8b9392` config: qemu: Fix disable_block_device_use comments `9615c8bc` config: fc: Don't expose disable_block_device_use `c1fb4bb7` snap: Don't build cloud-hypevisor on ppc64le `58913694` snap: Use git clone depth 1 for QEMU and dependencies `b27c7f40` docs: Add unit testing presentation `e64c54a2` monitor: Listen to localhost only by default `e6350d3d` monitor: Fix build options `a67b93bb` snap: clh: Re-use kata-deploy script here `f31125fe` version: Bump cloud-hypervisor to b0324f85571c441f `54d0a672` subsystem: build `edf20766` docs: Update Readme document `eda8ea15` runtime: Gofmt fixes `4afb278f` ci: add github action to exercise darwin build, unit tests `e355a718` container: file is not linux specific `b31876ee` device-manager: move linux-only test to a linux-only file `6a5c6344` resourcecontrol: SystemdCgroup check is not necessarily linux specific `cc58cf69` resourcecontrol: convert stats dev_t to unit64types `5be188cc` utils: Add darwin stub `ad044919` virtcontainers: Convert stats dev_t to uint64 `56751089` katautils: Use a syscall wrapper for the hook JSON state `7d64ae7a` runtime: Add a syscall wrapper package `abc681ca` katautils: Add Darwin stub for the netNS API `de574662` config: Expand confidential_guest comments `641d475f` config: clh: Use "Intel TDX" instead of just "TDX" `0bafa2de` config: clh: Mention supported TEEs `81ed269e` runtime: use Cmd.StdoutPipe instead of self-created pipe `8edca8bb` kata-agent: Fix mismatching error of cgroup and mountinfo. 
`a9ba7c13` clh: Fix typo on HotplugRemoveDevice `827ab82a` tools: clh: Fix unbound variable `082d538c` runtime: make selinux configurable `1103f5a4` virtcontainers: Use FilesystemSharer for sharing the containers files `533c1c0e` virtcontainers: Keep all filesystem sharing prep code to sandbox.go `61590bbd` virtcontainers: Add a Linux implementation for the FilesystemSharer `03fc1cbd` virtcontainers: Add a filesystem sharing interface `72434333` clh: Add TDX support `a13b4d5a` clh: Add firmware to the config file `a8827e0c` hypervisors: Confidential Guests do not support NVDIMM `f50ff9f7` hypervisors: Confidential Guests do not support Memory hotplug `df8ffecd` hypervisors: Confidential Guests do not support Device hotplug `28c4c044` hypervisors: Confidential Guests do not support VCPUs hotplug `29ee870d` clh: Add confidential_guest to the config file `9621c596` clh: refactor image / initrd configuration set `dcdc412e` clh: use common kernel params from the hypervisor code `4c164afb` versions: Update Cloud Hypervisor to 5343e09e7b8db `b2a65f90` virtcontainers: Use available s390x hugepages `cb4230e6` runtime: fix package declaration for ppc64le `fec26f8e` kata-monitor: trivial: rename symbols & labels `9fd4e551` runtime: Move the resourcecontrol package one layer up `823faee8` virtcontainers: Rename the cgroups package `0d1a7da6` virtcontainers: Rename and clean the cgroup interface `ad10e201` virtcontainers: cgroups: Move non Linux routine to utils.go `d49d0b6f` virtcontainers: cgroups: Define a cgroup interface `3ac52e81` kata-monitor: fix updating sandbox cache at startup `160bb621` kata-monitor: bump version to 0.3.0 `1a3381b0` docs: Developer-Guide build a custom Kata agent with musl `f6fc1621` shim: log events for CRI-O `1d68a08f` docs: Update contributing link `9123fc09` kata-deploy: Simplify Dockerfile and support s390x `11220f05` kata-deploy: Use (kata with) qemu as the default shim-v2 binary `3175aad5` virtiofs-nydus: add lazyload support for kata with 
clh `94b831eb` virtcontainers: remove temp dir created for vsock in test code `8cc1b186` kernel: remove SYS_SUPPORTS_HUGETLBFS from powerpc fragments `5c9d2b41` packaging: Use `patch` for applying patches `5b3fb6f8` kernel: Build SGX as part of the vanilla kernel `2c35d8cb` workflows: Stop building the experimental kernel `32e7845d` snap: Build vanilla kernel for all arches `27de212f` runtime: Always add network endpoints from the pod netns `1cee0a94` virtcontainers: Remove duplicated assert messages in utils test code `6c1d149a` docs: Update limitations document `7c4ee6ec` packaging/qemu: create no_patches file for qemu-tdx `d47c488b` versions: add qemu tdx section `77c29bfd` container: Remove VFIO lazy attach handling `7241d618` versions: add nydus-snapshotter `26b3f001` virtcontainers: Split hypervisor into Linux and OS agnostic bits `fa0e9dc6` virtcontainers: Make all Linux VMMs only build on Linux `c91035d0` virtcontainers: Move non QEMU specific constants to hypervisor.go `10ae0591` virtcontainers: Move guest protection definitions to hypervisor.go `b28d0274` virtcontainers: Make max vCPU config less QEMU specific `a5f6df6a` govmm: Define the number of supported vCPUs per architecture `a6b40151` tools: clh: Remove unused variables `5816c132` tools: Build cloud-hypervisor with "--features tdx" `e6060cb7` versions: Linux 5.15.x `9818cf71` docs: Improve top-level and runtime README `36c3fc12` agent: support hugepages for containers `81a8baa5` runtime: add hugepages support `7df677c0` runtime: Update calculateSandboxMemory to include Hugepages Limit `948a2b09` tools: clh: Ensure the download binary is executable `72bf5496` agent: handle hook process result `80e8dbf1` agent: valid envs for hooks `4f96e3ea` katautils: Pass the nerdctl netns annotation to the OCI hooks `a871a33b` katautils: Run the createRuntime hooks `d9dfce14` katautils: Run the preStart hook in the host namespace `6be6d0a3` katautils: Pass the OCI annotations back to the called OCI hooks 
`493ebc8c` utils: Update kata manager docs `34b2e67d` utils: Added more kata manager cli options `714c9f56` utils: Improve containerd configuration `c464f326` utils: kata-manager: Force containerd sym link creation `4755d004` utils: Fix unused parameter `601be4e6` utils: Fix containerd installation `ae21fcc7` utils: Fix Kata tar archive check `f4d1e45c` utils: Add kata-manager CLI options for kata and containerd `395cff48` docs: Remove docker run and shared memory from limitations `e07545a2` tools: clh: Allow passing down a build flag `55cdef22` tools: clh: Add the possibility to always build from sources `3f87835a` utils: Switch kata manager to use getopts `4bd945b6` virtiofsd: Use "-o announce_submounts" `37df1678` build: always reset ARCH after getting it `3a641b56` katatestutils: remove distro constraints `90fd625d` versions: Udpate Cloud Hypervisor to 55479a64d237 `573a37b3` osbuilder: Add CentOS Stream rootfs `f10642c8` osbuilder: Source .cargo/env before checking Rust `955d359f` kernel: add missing config fragment for TDx `734b618c` agent-ctl: run cargo fmt/clippy in make check `12c37faf` trace-forwarder: add make check for Rust `c1ce67d9` runtime: use github.com/mdlayher/vsock@v1.1.0 `42a878e6` runtime: The index variable is initialized multiple times in for `1797b3eb` packaging/kernel: build TDX guest kernel `98752529` versions: add url and tag for tdx kernel `bc8464e0` packaging/kernel: add option -s option `2d9f89ae` feature(nydusd): add nydusd support to introduse lazyload ability `b19b6938` docs: Fix relative links in Markdown `9590874d` device: Update PCIDEVICE_ environment variables for the guest `7b7f426a` device: Keep host to VM PCI mapping persistently `0b2bd641` device: Rework update_spec_pci() to update_env_pci() `982f14fa` runtime: support QEMU SGX `40aa43f4` docs: Update link to EFK stack docs `54e1faec` scripts: fix a typo while to check build_type `07b9d93f` virtcontainer: Simplify the sandbox network creation flow `2c7087ff` virtcontainers: 
Make all endpoints Linux only `49d2cde1` virtcontainers: Split network tests into generic and OS specific parts `0269077e` virtcontainers: Remove the netlink package dependency from network.go `7fca5792` virtcontainers: Unify Network endpoints management interface `c67109a2` virtcontainers: Remove the Network PostAdd method `e0b26443` virtcontainers: Define a Network interface `5e119e90` virtcontainers: Rename the Network structure fields and methods `b858d0de` virtcontainers: Make all Network fields private `49eee79f` virtcontainers: Remove the NetworkNamespace structure `844eb619` virtcontainers: Have CreateVM use a Network reference `d7b67a7d` virtcontainers: Network API cleanups and simplifications `2edea883` virtcontainers: Make the Network structure manage endpoints `8f48e283` virtcontainers: Expand the Network structure `5ef522f7` runtime: check kvm module `sev` correctly `419d8134` snap: update qemu version to 6.1.0 for arm `00722187` docs: update Release-Process.md `496bc10d` tools: check for yq before using it `88a70d32` Revert "workflows: Ensure a label change re-triggers the actions" `a9bebb31` openshift-ci: switch to CentOS Stream `89047901` kata-deploy-push: only run if PR modifying tools path `7ffe9e51` virtcontainers: Do not add a virtio-rng-ccw device `1f29478b` runtime: suppport split firmware `24796d2f` kata-deploy: for testing, make sure we use the PR branch `1cc1c8d0` docs: Remove images from Zun documentation `5861e52f` docs: Remove Zun documentation with kata containers `903a6a45` versions: Bump critools to its 1.23 release `63eb1158` versions: bump CRI-O to its 1.23 release `5083ae65` workflows: stop checking revert commit `14e7f52a` virtcontainers: Split the rootless package into OS specific parts `ab447285` kata-monitor: add kubernetes pod metadata labels to metrics `834e199e` kata-monitor: drop unused functions `7516a8c5` kata-monitor: rework the sandbox cache sync with the container manager `e78d80ea` kata-monitor: silently ignore CHMOD 
events on the sandboxes fs `e9eb34ce` kata-monitor: improve debug logging `4fc4c76b` agent: Fix execute_hook() args error Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-03-07 11:15:25 -08:00
Eric Ernst	84dff44057	release: Adapt kata-deploy for 2.4.0-rc0 kata-deploy files must be adapted to a new release. The cases where it happens are when the release goes from -> to: * main -> stable: * kata-deploy-stable / kata-cleanup-stable: are removed * stable -> stable: * kata-deploy / kata-cleanup: bump the release to the new one. There are no changes when doing an alpha release, as the files on the "main" branch always point to the "latest" and "stable" tags. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-03-07 11:15:25 -08:00
Fabiano Fidêncio	4adf93ef2c	tools: release: Do not consider release candidates as stable releases During the release of 2.4.0-rc0 @egernst noticed an inconsistency in the way we handle release tags, as release candidates are being taken as "stable" releases, while both the kata-deploy tests and the release action consider this as "latest". Ideally we should have our own tag for "release candidate", but that's something that could and should be discussed more extensively outside of the scope of this quick fix. For now, let's align the code generating the PR for bumping the release with what we already do as part of the release action and kata-deploy test, and tag "-rc" as latest, regardless of which branch it's coming from. Fixes: #3847 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-07 20:09:18 +01:00
Jakob Naucke	72f7e9e300	osbuilder: Multistrap Ubuntu Use `multistrap` for building Ubuntu rootfs. Adds support for building for foreign architectures using the `ARCH` environment variable. In the process, the Ubuntu rootfs workflow is vastly simplified. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-03-07 11:58:46 +01:00
Jakob Naucke	df511bf179	packaging: Enable cross-building agent Requires setting ARCH and CC. - Add CC linker option for building agent. - Set host for building libseccomp. Fixes: #3681 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-03-07 11:58:46 +01:00
Jakob Naucke	0a313eda1c	osbuilder: Fix use of LIBC in rootfs.sh - Add a doc comment - Pass to build container, e.g. to build x86_64 with glibc (would always use musl) Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-03-07 11:58:46 +01:00
Jakob Naucke	2c86b956fa	osbuilder: Simplify Rust installation no double export, direct target Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-03-07 11:58:46 +01:00
Jakob Naucke	0072cc2b66	osbuilder: Remove musl installations Remove a lot of cruft of musl installations -- we needed those for the Go agent, but Rustup just takes care of everything. aarch64 on Debian-based & Alpine is an exception -- create a symlink `aarch64-linux-musl-gcc` to `musl-tools`'s `musl-gcc` or `gcc` on Alpine. This is unified -- arch-specific Dockerfiles are removed. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-03-07 11:58:46 +01:00
Jakob Naucke	5c3e553624	osbuilder: apk add --no-cache Hadolint DL3019. If you're wondering why this is in this PR, that's because I touch the file later, and we're only triggering the lints for changed files. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-03-07 11:58:46 +01:00
Bin Liu	deb8ce97a8	Merge pull request #3836 from liubin/fix/minor-fix Enhancement: fix comments/logs and delete not used function	2022-03-07 17:26:30 +08:00
bin	b257e0e5ab	rustjail: delete function signal in BaseContainer Function signal in BaseContainer is not used anymore. Fixes: #3835 Signed-off-by: bin <bin@hyper.sh>	2022-03-05 10:33:15 +08:00
bin	d647b28bb8	agent: delete meaningless FIXME comment The test now passes, so the FIXME comment should be deleted. Fixes: #3835 Signed-off-by: bin <bin@hyper.sh>	2022-03-05 10:33:15 +08:00
bin	1b34494b2f	runtime: fix invalid comments for pkg/resourcecontrol Some comments are copied and not adjusted to the pkg/resourcecontrol package. Fixes: #3835 Signed-off-by: bin <bin@hyper.sh>	2022-03-05 10:32:31 +08:00
Eric Ernst	522eb8f3c3	Merge pull request #2056 from evanfoster/guest-empty-dir storage: make k8s emptyDir volume creation location configurable	2022-03-04 16:53:31 -08:00
Evan Foster	afc567a9ae	storage: make k8s emptyDir creation configurable This change introduces the `disable_guest_empty_dir` config option, which allows the user to change whether a Kubernetes emptyDir volume is created on the guest (the default, for performance reasons), or the host (necessary if you want to pass data from the host to a guest via an emptyDir). Fixes #2053 Signed-off-by: Evan Foster <efoster@adobe.com>	2022-03-04 12:02:42 -08:00
Eric Ernst	1e301482e7	Merge pull request #3406 from fengwang666/direct-blk-assignment Implement direct-assigned volume	2022-03-04 11:58:37 -08:00
Feng Wang	e76519af83	runtime: small refactor to improve readability Remove some confusing/duplicate code so it's more readable Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-04 10:00:52 -08:00
Fabiano Fidêncio	09d7f89ea8	Merge pull request #3822 from fidencio/wip/bump-containerd-to-1.6.1 Bump containerd to 1.6.1	2022-03-04 17:53:12 +01:00
Fabiano Fidêncio	7e5f11a52b	vendor: Update containerd to 1.6.1 Let's bring in the latest release of Containerd, 1.6.1, released on March 2nd, 2022. With this, we take the opportunity to remove containerd/api reference as we shouldn't need a separate module only for the API. Here's the list of changes needed in the code due to the bump: * stop using `grpc.WithInsecure()` as it's been deprecated - use `grpc.WithTransportCredentials(insecure.NewCredentials())` instead Fixes: #3820 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-04 10:28:40 +01:00
Fabiano Fidêncio	2af91b23e1	Merge pull request #3281 from jongwu/vcpu_hotplug_arm64 experimentally enable vcpu hotplug and virtio-mem on arm64 in kernel part	2022-03-04 09:14:31 +01:00
Fabiano Fidêncio	d4545ca099	Merge pull request #3826 from likebreath/0303/clh_v22.0 versions: Upgrade to Cloud Hypervisor v22.0	2022-03-04 09:08:59 +01:00
Jianyong Wu	42771fa726	runtime: don't set socket and thread for arm/virt As this is just initial vcpu hotplug support, thread and socket are not supported yet. So, don't set socket and thread when hot-adding a cpu for arm/virt. Fixes: #3280 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-03-04 11:22:18 +08:00
Jianyong Wu	8828ef4176	kernel: add arm experimental kernel build support Add a new entry of arm-kernel-experimental and let the kernel build script support to build it. Fixes: #3280 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-03-04 11:22:18 +08:00
Jianyong Wu	8a9007fe45	config: remove 2 config as they are removed in 5.15 I'm sure that it is correct to remove CONFIG_ARM64_UAO and CONFIG_MANDATORY_FILE_LOCKING. Both are gone in 5.15. Maintaining specific config files for a kernel version is a little ugly. If someone needs them, shout at me. Fixes: #3280 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-03-04 11:22:18 +08:00
Jianyong Wu	1b6f7401e0	kernel: add arm experimental patches to support vcpu hotplug and virtio-mem As the support for vcpu hotplug is on the way, I pick the patches up here as experimental to let users try cpu hotplug and virtio-mem on arm64. Fixes: #3280 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-03-04 11:22:18 +08:00
Feng Wang	f905161bbb	runtime: mount direct-assigned block device fs only once Mount the direct-assigned block device fs only once and keep a refcount in the guest. Also use the ro flag inside the options field to determine whether the block device and filesystem should be mounted as ro Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-03 18:57:02 -08:00
shuochen0311	27fb490228	agent: add get volume stats handler in agent retrieve the stats of direct-assigned volumes from the guest Fixes: #3454 Signed-off-by: shuochen0311 <shuo.chen@databricks.com>	2022-03-03 18:57:02 -08:00
Feng Wang	ea51ef1c40	runtime: forward the stat and resize requests from shimv2 to kata agent Translate the volume path from host-known path to guest-known path and forward the request to kata agent. Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-03 18:57:02 -08:00
Feng Wang	c39281ad65	runtime: update container creation to work with direct assigned volumes During the container creation, it will parse the mount info file of the direct assigned volumes and update the in memory mount object. Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-03 18:57:02 -08:00
Feng Wang	4e00c2377c	agent: add grpc interface for stat and resize operations Add GetVolumeStats and ResizeVolume APIs for the runtime to query stat and resize fs in the guest. Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-03 18:57:02 -08:00
Feng Wang	e9b5a25502	runtime: add stat and resize APIs to containerd-shim-v2 To query fs stats and resize fs, the requests need to be passed to kata agent through containerd-shim-v2. So we're adding REST APIs on the shim management endpoint. Also refactor the shim management client into its own go file. Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-03 18:56:53 -08:00
Feng Wang	6e0090abb5	runtime: persist direct volume mount info In the direct assigned volume scenario, Kata Containers persists the information required for managing the volume inside the guest on host filesystem. Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-03 15:32:12 -08:00
Feng Wang	fa326b4e0f	runtime: augment kata-runtime CLI to support direct-assigned volume Add commands to add, remove, resize and get stats of a direct-assigned volume. These commands are expected to be consumed by CSI. Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-03 15:32:03 -08:00
Bo Chen	b8844fb8a9	versions: Upgrade to Cloud Hypervisor v22.0 Highlights from the Cloud Hypervisor release v22.0: 1) GDB Debug Stub Support; 2) `virtio-iommu` Backed Segments (to facilitate hotplug devices that require being behind an IOMMU, e.g. QAT); 3) Before Boot Configuration Changes; 4) `virtio-balloon` Free Page Reporting; 5) Support for Direct Kernel Booting with TDX; 6) PMU Support for AArch64; 7) Documentation Under CC-BY-4.0 License; 8) Deprecation of "Classic" virtiofsd (rust-based virtiofsd now is recommended); 9) Bug fixes on `virtio-balloon`, `virtio-net` with multiple TAP fd support, REST APIs, seccomp filters, migration with `vhost-user`, etc; Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v22.0 Fixes: #3825 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-03-03 15:15:54 -08:00
Fabiano Fidêncio	a2422cf2a1	Merge pull request #3389 from zhsj/rm-distro-test katatestutils: remove distro constraints	2022-03-03 23:26:58 +01:00
Fabiano Fidêncio	12af632952	Merge pull request #3814 from fidencio/wip/disable-block-device-use-minor-fixes Minor fixes for the `disable_block_device_use` comments	2022-03-03 23:26:05 +01:00
Julio Montes	6628977fcd	Merge pull request #3823 from fidencio/wip/clh-stop-virtiofsd-if-clh-fails-to-boot-up-the-vm clh: stop virtofsd if clh fails to boot up the vm	2022-03-03 14:53:52 -06:00
Fabiano Fidêncio	af80473496	clh: stop virtofsd if clh fails to boot up the vm If, for some reason, we're able to launch cloud hypervisor but not able to boot the VM up, the virtiofsd process would be left behind. Let's ensure, via defer, that we stop virtiofsd in case of errors. Fixes: #3819 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 19:10:37 +01:00
Fabiano Fidêncio	c54bc8e657	Merge pull request #3811 from fidencio/wip/clh-tdx-round-2 clh: tdx: Don't use sharedFS with Confidential Guests	2022-03-03 19:03:28 +01:00
Chelsea Mafrica	343138623c	Merge pull request #3818 from jodh-intel/golang-build-more-securely runtime: Build golang components with extra security options	2022-03-03 09:50:51 -08:00
James O. D. Hunt	799c2f4f2a	Merge pull request #3800 from jodh-intel/git-clone-depth-1-where-possible snap: Use git clone depth 1 for QEMU and dependencies	2022-03-03 16:27:07 +00:00
Fabiano Fidêncio	97951a2d12	clh: Don't use SharedFS with Confidential Guests kata-containers/pulls#3771 added TDX support for Cloud Hypervisor, but two big things got overlooked while doing that. 1. virtio-fs, as of now, cannot be part of the trust boundary, so the Confidential Guest will not be using it. 2. virtio-block hotplug should be enabled in order to use virtio-block for the rootfs (used with the devmapper plugin). When trying to use cloud-hypervisor with TDX using virtio-fs, we're facing the following error on the guest kernel: ``` virtiofs virtio2: device must provide VIRTIO_F_ACCESS_PLATFORM ``` After checking and double-checking with virtiofs and cloud-hypervisor developers, it happens as confidential containers might put some limitations on the device, so it can't access all of the guests' memory and that's where this restriction seems to be coming from. Vivek mentioned that virtiofsd does not support VIRTIO_F_ACCESS_PLATFORM (aka VIRTIO_F_IOMMU_PLATFORM) yet, and that for encrypted guests virtiofs may not be the best solution at the moment. @sboeuf put this in a very nice way: "if the virtio-fs driver doesn't support VIRTIO_F_ACCESS_PLATFORM, then the pages corresponding to the virtqueues and the buffers won't be marked as SHARED, meaning the VMM won't have access to it". Interestingly enough, it works with QEMU, and it may be due to some change done on the patched QEMU that @devimc is packaging, but we won't take the path to figure out what was the change and patch cloud-hypervisor in the same way, because of 1. Fixes: #3810 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 12:49:40 +01:00
Fabiano Fidêncio	c30b3a9ff1	clh: Adding a volume is not supported without SharedFS As mounting volumes into the guest requires SharedFS setup, let's ensure we error out if trying to do so in a situation where SharedFS is not supported. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 12:49:30 +01:00
Fabiano Fidêncio	f889f1f957	clh: introduce supportsSharedFS() supportsSharedFS() is a new method to be used to ensure that no SharedFS specifics are called when, for a reason or another, Cloud Hypervisor is in a mode where SharedFSs are not supported. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 12:49:28 +01:00
Fabiano Fidêncio	54d27ed721	clh: introduce loadVirtiofsDaemon() Similarly to the `createVirtiofsDaemon` and `stopVirtiofsDaemon` methods, let's introduce and use loadVirtiofsDaemon, as it'll also be handy later in this series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 12:48:38 +01:00
Fabiano Fidêncio	ae2221ea68	clh: introduce stopVirtiofsDaemon() Similarly to the `createVirtiofsDaemon` method, let's introduce and use its counterpart, as it'll also be handy later in this series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 12:48:26 +01:00
Fabiano Fidêncio	e8bc26f90d	clh: introduce setupVirtiofsDaemon() Similarly to what's been done with the `createVirtiofsDaemon`, let's create a `setupVirtiofsDaemon` one. It will also become handy later in this series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 12:48:14 +01:00
Fabiano Fidêncio	413b3b477a	clh: introduce createVirtiofsDaemon() Let's introduce and use a new `createVirtiofsDaemon` method. Its name says it all, and it'll be handy later in this series when, spoiler alert, SharedFS cannot be used (in such cases as in Confidential Guests). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 12:48:02 +01:00
James O. D. Hunt	55cd0c89d8	runtime: Build golang components with extra security options Enable stack protector and fortify source for golang builds. Fixes: #3817. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-03 10:41:26 +00:00
Fabiano Fidêncio	76e4f6a2a3	Revert "hypervisors: Confidential Guests do not support Device hotplug" This reverts commit `df8ffecde0`, as device hotplug is supported and, more than that, is very much needed when using virtio-blk instead of virtio-fs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 09:59:55 +01:00
David Gibson	42e35505b0	agent: Verify that we allocated as many hugepages as we need allocate_hugepages() writes to the kernel sysfs file to allocate hugepages in the Kata VM. However, even if the write succeeds, it's not certain that the kernel will actually be able to allocate as many hugepages as we requested. This patch reads back the file after writing it to check if we were able to allocate all the required hugepages. fixes #3816 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-03-03 15:59:45 +11:00
David Gibson	608e003abc	agent: Don't attempt to create directories for hugepage configuration allocate_hugepages() constructs the path for the sysfs directory containing hugepage configuration, then attempts to create this directory if it does not exist. This doesn't make sense: sysfs is a view into kernel configuration, if the kernel has support for the hugepage size, the directory will already be there, if it doesn't, trying to create it won't help. For the same reason, attempting to create the "nr_hugepages" file itself is pointless, so there's no reason to call OpenOptions::create(true). Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-03-03 11:24:11 +11:00
Julio Montes	934788eb53	Merge pull request #3812 from fidencio/wip/disable-clh-build-on-ppc64le snap: Don't build cloud-hypevisor on ppc64le	2022-03-02 15:40:01 -06:00
Fabiano Fidêncio	fa8b93927c	config: qemu: Fix disable_block_device_use comments virtio-fs, instead of virtio-9p, is the default shared file system type in case virtio-blk is not used. Fixes: #3813 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-02 20:43:36 +01:00
Fabiano Fidêncio	9615c8bc9c	config: fc: Don't expose disable_block_device_use Relying on virtio-block is the only way to use Firecracker with Kata Containers, as shared FS (virtio-{fs,fs-nydus,9p}) is not supported by Firecracker. As configuration doesn't make sense to be exposed, we hardcode the `false` value in the Firecracker configuration structure. Fixes: #3813 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-02 20:43:28 +01:00
Fabiano Fidêncio	c1fb4bb726	snap: Don't build cloud-hypevisor on ppc64le snapcraft build is failing due to: ``` utils.mk:130: "WARNING: powerpc64le-unknown-linux-musl target is unavailable" ``` It seems to happen as powerpc64-unknown-linux-musl is a target that although there's support for it, it's not exactly built or automatically tested, at least according to: https://doc.rust-lang.org/rustc/platform-support.html Fixes: #3803 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-02 19:04:30 +01:00
James O. D. Hunt	58913694d3	snap: Use git clone depth 1 for QEMU and dependencies Use `git clone --depth 1 ...` for QEMU and its dependencies to speed up checkouts. Fixes: #3799. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-02 08:31:06 +00:00
Bin Liu	2ae8bd696a	Merge pull request #3367 from wfly1998/main build: always reset ARCH after getting it	2022-03-02 14:42:45 +08:00
Bin Liu	75877f8793	Merge pull request #3187 from Kvasscn/kata_dev_remove_temp_vsock_dir virtcontainers: remove temp dir created for vsock in test code	2022-03-02 11:05:47 +08:00
Chelsea Mafrica	c49e261819	Merge pull request #3782 from jodh-intel/docs-add-ut-presentation docs: Add unit testing presentation	2022-03-01 11:03:54 -08:00
James O. D. Hunt	b27c7f4068	docs: Add unit testing presentation Add the Kata Containers unit testing presentation I gave to the Kata outreach students as this may be of some use to others. Fixes: #3781 Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-01 15:52:03 +00:00
Francesco Giudici	7f638dd049	Merge pull request #3764 from Jakob-Naucke/hugepages-test-s390x virtcontainers: Use available s390x hugepages	2022-03-01 14:33:59 +01:00
Fabiano Fidêncio	01c57da84b	Merge pull request #3552 from goodluckbot/update-hypervisor-version Update QEMU >= 6.1.0 in configure-hypervisor.sh	2022-03-01 14:19:16 +01:00
Fabiano Fidêncio	4ab35b0899	Merge pull request #3796 from jodh-intel/fix-monitor-listen-address Fix monitor listen address	2022-03-01 13:51:01 +01:00
Fabiano Fidêncio	8d4412d89f	Merge pull request #3728 from fidencio/wip/snapcraft-update-clh-installation snap: clh: Re-use kata-deploy script here	2022-03-01 13:07:13 +01:00
Fabiano Fidêncio	6c2cc1fbd1	Merge pull request #3341 from Jakob-Naucke/centos-stream osbuilder: Add CentOS Stream rootfs	2022-03-01 12:20:22 +01:00
Fabiano Fidêncio	97c17085b0	Merge pull request #3770 from Jakob-Naucke/gofmt-vmm-s390x runtime: Gofmt fixes	2022-03-01 11:34:15 +01:00
James O. D. Hunt	e64c54a2ad	monitor: Listen to localhost only by default Change `kata-monitor` to listen to port `8090` on the local interface only by default. > Note: > > This is a breaking change as previously it listened on all interfaces. Fixes: #3795. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-01 10:00:43 +00:00
James O. D. Hunt	e6350d3d45	monitor: Fix build options Removed redundant and duplicated build options to build `kata-monitor` the same way as the other components: - `CGO_ENABLED=0` is not necessary. - `-buildmode=exe` is not necessary since `BUILDFLAGS` already sets the build mode. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-01 10:00:43 +00:00
Fabiano Fidêncio	a67b93bb03	snap: clh: Re-use kata-deploy script here The current snap build for clh is broken as it's not aware of how to build the binary from sources. Instead of fixing it here, let's take advantage of the kata-deploy script, which is capable of building from sources, and re-use it here. Fixes: #3693 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-01 09:03:51 +01:00
Fabiano Fidêncio	f31125fe92	version: Bump cloud-hypervisor to b0324f85571c441f This bump brings a fix on the build script, for ARM, so we can use the very same build script everywhere. The commit of our interest is b0324f85571c441f840e9bdeb25410514a00bb74: ``` scripts: Fix musl build on aarch64 Adding the missing TARGET_CC environment variable to get the build to complete correctly. Fixes #3776 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-01 09:03:51 +01:00
GabyCT	ccb063b848	Merge pull request #3788 from fidencio/wip/update-clh-confidential-guest-comments Update `confidential_guest` comments	2022-02-28 15:11:01 -06:00
GabyCT	bc1733bb0e	Merge pull request #3774 from egernst/delinux-runtime cleanup runtime pkgs for Darwin build, add basic Darwin build/unit test	2022-02-28 15:08:09 -06:00
GabyCT	506ad6f6e7	Merge pull request #3792 from GabyCT/topic/updateread docs: Update Readme document	2022-02-28 14:16:43 -06:00
goodluckbot	54d0a672c5	subsystem: build With the ACPI PCI hotplug changes introduced in 2.3, QEMU >= 6.1 is required. Remove unnecessary qemu version check in build script. Fixes #3547 Signed-off-by: goodluckbot <tangbo_gl@hotmail.com>	2022-03-01 01:18:35 +08:00
Fabiano Fidêncio	21a8ba93c5	Merge pull request #3784 from liubin/fix/3783-use-exec-pipe runtime: use Cmd.StdoutPipe instead of self-created pipe	2022-02-28 18:04:58 +01:00
Gabriela Cervantes	edf20766d1	docs: Update Readme document This PR updates the README document by using the proper link for the contributing guide as well as fixing a misspelling. Fixes #3791 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-02-28 16:52:26 +00:00
Jakob Naucke	eda8ea154a	runtime: Gofmt fixes - Mostly blank lines after `+build` -- see https://pkg.go.dev/go/build@go1.14.15 -- this is, to date, enforced by `gofmt`. - 1.17-style go:build directives are also added. - Spaces in govmm/vmm_s390x.go Fixes: #3769 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-02-28 17:24:47 +01:00
Eric Ernst	4afb278fe2	ci: add github action to exercise darwin build, unit tests There are a few outstanding changes required to build the runtime on Darwin. Let's add a GitHub action to exercise build and unit tests of the packages which we do expect to work. Eventually this should be dropped and we can run any Darwin specific tests, or just add MacOS to the matrix for our static check OSes. Fixes: #3778 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-02-28 08:01:53 -08:00
Eric Ernst	e355a71860	container: file is not linux specific This should not be linux specific -- drop restriction. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-02-28 08:01:53 -08:00
Eric Ernst	b31876eefb	device-manager: move linux-only test to a linux-only file We can't Mkdev on Darwin - let's make sure the vfio test is in a linux-only file. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-02-28 08:01:53 -08:00
Eric Ernst	6a5c634490	resourcecontrol: SystemdCgroup check is not necessarily linux specific This utility function is also used to check the spec that will run in the guest - no need for this to be linux specific. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-02-28 08:01:53 -08:00
Eric Ernst	cc58cf6993	resourcecontrol: convert stats dev_t to unit64types Their types may differ on various host OSes, but unix.Major|Minor always takes a uint64 Depends-on: github.com/kata-containers/tests#4516 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-02-28 08:01:53 -08:00
Eric Ernst	5be188cc29	utils: Add darwin stub Add a stub for utils_darwin to facilitate building this package on Darwin. We can probably drop this empty stub if we have better abstraction for the various parts of virtcontainers that call it today... Fixes: #3777 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-02-28 08:01:53 -08:00
Samuel Ortiz	ad0449195d	virtcontainers: Convert stats dev_t to uint64 We need to convert them to uint64 as their types may differ on various host OSes, but unix.Major|Minor takes a uint64 regardless. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-28 08:01:53 -08:00
Samuel Ortiz	56751089c0	katautils: Use a syscall wrapper for the hook JSON state There is no real equivalent of a thread ID on Darwin. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-28 08:01:53 -08:00
Samuel Ortiz	7d64ae7a41	runtime: Add a syscall wrapper package It allows to support syscall variations between host OSes. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-28 08:01:53 -08:00
Samuel Ortiz	abc681ca5f	katautils: Add Darwin stub for the netNS API And move the current implementation into a Linux only file. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-28 08:01:53 -08:00
Fabiano Fidêncio	9e3353a7e4	Merge pull request #3732 from YchauWang/wyc-docs-developer docs: Developer-Guide build a custom Kata agent with musl	2022-02-28 12:14:39 +01:00
Fabiano Fidêncio	de57466212	config: Expand confidential_guest comments Let's clarify that an error will be reported in case confidential_guest is enabled, but the hardware where Kata Containers is running doesn't provide the required feature set. Fixes: #3787 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-28 11:57:42 +01:00
Fabiano Fidêncio	641d475fa6	config: clh: Use "Intel TDX" instead of just "TDX" Let's use "Intel TDX" rather than just "TDX", as it can ease the understanding of the terminology. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-28 10:27:21 +01:00
Fabiano Fidêncio	0bafa2def9	config: clh: Mention supported TEEs Let's mention the supported TEEs to be used with confidential guests. Right now, Cloud Hypervisor supports only Intel TDX, used together with TD Shim. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-28 10:24:33 +01:00
bin	81ed269ed2	runtime: use Cmd.StdoutPipe instead of self-created pipe Nydusd uses a bufio.Scanner to check whether the nydusd process has exited, but the stderr/stdout passed to Cmd is a self-created pipe, and this pipe will not be closed if the process fails to start. Using the standard Cmd.StdoutPipe closes stdout, so the kata shim will detect the exit of the nydusd process and then call cmd.Wait to reap the process' resources. Fixes: #3783 Signed-off-by: bin <bin@hyper.sh>	2022-02-28 16:52:49 +08:00
Bin Liu	441fdbaf9f	Merge pull request #3753 from sailorvii/main kata-agent: Fix mismatching error of cgroup and mountinfo.	2022-02-28 16:07:26 +08:00
sailorvii	8edca8bbd1	kata-agent: Fix mismatching error of cgroup and mountinfo. The content about systemd in "/proc/self/cgroup" is as: 1:name=systemd:/kubepods/pod1815643d-3789-4e4e-aaf4-00de024912e1/0e15a65bd5f7b30a0b818d90706212354d8b3f0998a1495473c3be9a24706ccf and in "/proc/self/mountinfo" is as: 30 29 0:26 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:6 - cgroup cgroup rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd The keys extracted from the two files are the same as "name=systemd". So no need to rename the key to "systemd". Fixes: #3385 Signed-off-by: sailorvii <challengingway@hotmail.com>	2022-02-28 10:03:09 +08:00
Eric Ernst	3997c962c2	Merge pull request #3767 from tanweernoor/02242022-kata-containers-issue-3631 runtime, config: make selinux configurable	2022-02-26 08:44:29 -08:00
Eric Ernst	08976b591b	Merge pull request #3776 from fidencio/wip/fix-unbound-variable-tools-clh Fix unbound variable / typo on error mesage	2022-02-25 15:49:08 -08:00
Fabiano Fidêncio	a9ba7c132b	clh: Fix typo on HotplugRemoveDevice A copy and paste mistake was made and the error on HotplugRemoveDevice() should be about removal and not about addition. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 22:35:32 +01:00
Fabiano Fidêncio	827ab82a82	tools: clh: Fix unbound variable `4c164afbac` renamed extra_build_args to features, but did it only in one place, leading to: ``` 21:15:28 /home/jenkins/workspace/kata-containers-2.0-ubuntu-ARM-PR/go/src/github.com/kata-containers/kata-containers/tools/packaging/static-build/cloud-hypervisor/build-static-clh.sh: line 55: features: unbound variable 21:15:29 make[1]: *** [tools/packaging/kata-deploy/local-build/Makefile:30: cloud-hypervisor-tarball-build] Error 1 ``` Fixes: #3775 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 22:35:25 +01:00
Tanweer Noor	082d538cb4	runtime: make selinux configurable removes --tags selinux handling in the makefile (part of it introduced here: `d78ffd6`) and makes selinux configurable via configuration.toml Fixes: #3631 Signed-off-by: Tanweer Noor <tnoor@apple.com>	2022-02-25 10:33:46 -08:00
Fabiano Fidêncio	ea1876f057	Merge pull request #3771 from fidencio/wip/clh-tdx clh: Add TDX support	2022-02-25 18:45:31 +01:00
Samuel Ortiz	1103f5a4d4	virtcontainers: Use FilesystemSharer for sharing the containers files Switching to the generic FilesystemSharer brings 2 major improvements: 1. Remove container and sandbox specific code from kata_agent.go 2. Allow for non Linux implementations to provide ways to share container files and root filesystems with the Kata Linux guest. Fixes #3622 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-25 17:22:27 +01:00
Samuel Ortiz	533c1c0e86	virtcontainers: Keep all filesystem sharing prep code to sandbox.go With the Linux implementation of the FilesystemSharer interface, we can now remove all host filesystem sharing code from kata_agent and keep it where it belongs: sandbox.go. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-25 17:22:27 +01:00
Samuel Ortiz	61590bbddc	virtcontainers: Add a Linux implementation for the FilesystemSharer This gathers the current kata agent and container filesystem sharing code into a FilesystemSharer implementation. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-25 17:22:27 +01:00
Samuel Ortiz	03fc1cbd7e	virtcontainers: Add a filesystem sharing interface Filesystem sharing here means the ability to share some parts of the host filesystem with the guest. It's mostly about sharing files and container bundle root filesystems. In order to allow for different file and rootfs sharing implementations, we define a FilesystemSharer interface. This interface provides a preparation step, where concrete implementations will be able to e.g. prepare the host filesystem. Then it provides 2 methods, one for sharing any file (regular file or a directory) and another one for sharing a container root filesystem. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-25 17:22:27 +01:00
Fabiano Fidêncio	72434333aa	clh: Add TDX support Let's enable TDX support for Cloud Hypervisor, using td-shim as its desired firmware. Fixes: #3632 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	a13b4d5ad8	clh: Add firmware to the config file "firmware" option was already present for a while, but it's never been exposed to the configuration file before. Let's do it now as it can be used, in combination with the newly added confidential_guest option, to boot a guest VM using the so called `td-shim`[0] with Cloud Hypervisor. [0]: https://github.com/confidential-containers/td-shim Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	a8827e0c78	hypervisors: Confidential Guests do not support NVDIMM NVDIMM is also not supported with Confidential Guests and Virtio Block devices should be used instead. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	f50ff9f798	hypervisors: Confidential Guests do not support Memory hotplug Similarly to VCPUs and Device hotplug, Confidential Guests also do not support Memory hotplug. Let's make it clear in the documentation and guard the code on both QEMU and Cloud Hypervisor side to ensure we don't advertise Memory hotplug as being supported when running Confidential Guests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	df8ffecde0	hypervisors: Confidential Guests do not support Device hotplug Similarly to VCPUs hotplug, Confidential Guests also do not support Device hotplug. Let's make it clear in the documentation and guard the code on both QEMU and Cloud Hypervisor side to ensure we don't advertise Device hotplug as being supported when running Confidential Guests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	28c4c044e6	hypervisors: Confidential Guests do not support VCPUs hotplug As confidential guests do not support VCPUs hotplug, let's set the "DefaultMaxVCPUs" value to "NumVCPUs". The reason to do this is to ensure that guests will be started with the correct amount of VCPUs, without giving the guest all the possible VCPUs the host could provide. One clear side effect of this limitation is that workloads that would require more VCPUs on their yaml definition will not run on this scenario. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	29ee870d20	clh: Add confidential_guest to the config file ConfidentialGuest is an option already present and exposed for QEMU, which is used for using Kata Containers together with different sorts of Guest Protections, such as TDX and SEV for x86_64, PEF for ppc64le, and SE for s390x. Right now we error out in case confidential_guest is enabled, as we will be implementing the needed blocks for this as part of this series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	9621c59691	clh: refactor image / initrd configuration set This is a small code refactor removing dead code, based on the checks already done in the generic hypervisor abstraction. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	dcdc412e25	clh: use common kernel params from the hypervisor code The hypervisor code already defines 3 common kernel root params for the following cases: * NVDIMM * NVDIMM without DAX support * Virtio Block As parameters used for cloud-hypervisor have an overlap with the ones provided by the NVDIMM case, let's take advantage of that. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	4c164afbac	versions: Update Cloud Hypervisor to 5343e09e7b8db Let's bump the Cloud Hypervisor version to 5343e09e7b8db, as that brings a few fixes we're interested in, such as: * hypervisor, vmm: Handle TDX hypercalls with INVALID_OPERAND - https://github.com/cloud-hypervisor/cloud-hypervisor/pull/3723 - This is needed for the TDX support on the cloud hypervisor driver, which is part of this very same series. * openapi: Update the PciBdf types - https://github.com/cloud-hypervisor/cloud-hypervisor/pull/3748 - This is needed due to a change in a DeviceNode field, which would cause a marshalling / demarshalling error when running with a version of cloud-hypervisor that includes the TDX fixes mentioned above. * scripts: dev_cli: Don't quote $features_build * scripts: dev_cli: Add --features option - https://github.com/cloud-hypervisor/cloud-hypervisor/pull/3773 - This is needed due to changes in the scripts used to build Cloud Hypervisor, which are used as part of Kata Containers CIs and github actions. Due to this change, we're also adapting the build scripts as part of this very same commit. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:16 +01:00
Jakob Naucke	bbfe7d6591	Merge pull request #3599 from Jakob-Naucke/no-virtio-rng-ccw virtcontainers: Do not add a virtio-rng-ccw device	2022-02-25 15:27:02 +01:00
Francesco Giudici	3da6006de4	Merge pull request #3751 from fgiudici/kata-monitor_issue3705 kata-monitor: fix collecting metrics for sandboxes not started through CRI	2022-02-25 14:53:12 +01:00
Jakob Naucke	b2a65f9031	virtcontainers: Use available s390x hugepages in TestHandleHugepages. On s390x, hugepage sizes must be set at boot, so test with any that are present (default is 1M). Depends-on: github.com/kata-containers/kata-containers#3770 Fixes: #3763 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-02-25 13:11:00 +01:00
Chelsea Mafrica	6a11dbfa8a	Merge pull request #3762 from Amulyam24/fix-build runtime: fix package declaration for ppc64le	2022-02-24 12:45:31 -08:00
Amulyam24	cb4230e60e	runtime: fix package declaration for ppc64le Incorrect package name causes build to fail. Fix it in vm_ppc64le.go Fixes: #3761 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2022-02-24 15:31:48 +05:30
Eric Ernst	c6cc038364	Merge pull request #3615 from sameo/topic/hypervisor Make the hypervisor framework not Linux specific	2022-02-23 16:02:00 -08:00
GabyCT	7da7e0a8f5	Merge pull request #3724 from Jakob-Naucke/kata-deploy-s390x kata-deploy: Simplify Dockerfile and support s390x	2022-02-23 11:38:01 -06:00
Francesco Giudici	fec26f8e51	kata-monitor: trivial: rename symbols & labels We introduced collection of sandboxes metadata from the CRI that will be attached to the sandbox metrics: this will allow to immediately match sandboxes metrics with CRI workloads. Rename the symbols from Kube to CRI as the metadata will be there every time pods are created through CRI, also if kubernetes is not installed (e.g., 'crictl runp'). Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-02-23 18:34:32 +01:00
Samuel Ortiz	9fd4e5514f	runtime: Move the resourcecontrol package one layer up And try to reduce the number of virtcontainers packages, step by step. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-23 15:48:40 +01:00
Samuel Ortiz	823faee83a	virtcontainers: Rename the cgroups package To resourcecontrol, and make it consistent with the fact that cgroups are a Linux implementation of the ResourceController interface. Fixes: #3601 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-23 15:48:40 +01:00
Samuel Ortiz	0d1a7da682	virtcontainers: Rename and clean the cgroup interface We call it a ResourceController, and we make it not so Linux specific. Now the Linux implementation is the cgroups one. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-23 15:48:40 +01:00
Samuel Ortiz	ad10e201e1	virtcontainers: cgroups: Move non Linux routine to utils.go Have an OS agnostic file for sharing routines. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-23 15:48:40 +01:00
Samuel Ortiz	d49d0b6f39	virtcontainers: cgroups: Define a cgroup interface And move the current, Linux-specific implementation into cgroups_linux.go Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-23 15:48:40 +01:00
Francesco Giudici	3ac52e8193	kata-monitor: fix updating sandbox cache at startup We now rely on fs events only to update the sandbox cache. This is not true anyway for sandboxes already present at kata-monitor startup: we just retrieve the list and add them in the cache only when we get their CRI metadata. If CRI metadata is not available we will never add them to the sandbox cache. Fix this by immediately adding the sandboxes we find at startup time to the sandbox cache. Fixes: #3705 Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-02-23 11:21:06 +01:00
Francesco Giudici	160bb62138	kata-monitor: bump version to 0.3.0 Since kata-monitor now: - relies on fs events only to update the sandbox cache - adds CRI meta-data as labels (CRI pod name, namespace and uid) it deserves a version bump. Note that while we could let kata-monitor match the runtime version, kata-monitor will usually work flawlessly with different kata shim releases: so it makes sense to keep kata-monitor version separated. Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-02-23 11:17:02 +01:00
wangyongchao.bj	1a3381b096	docs: Developer-Guide build a custom Kata agent with musl The Developer-Guide.md build a custom kata agent with `x86_64-unknown-linux-musl`. The `musl` should be changed by the system arch. The system arch is aarch64, ppc64le and s390x, the musl should be changed. When the arch is ppc64le or s390x, the musl should be replaced by the gnu. Fixes: #3731 Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>	2022-02-23 15:29:53 +08:00
Fabiano Fidêncio	6a9e5f90f7	Merge pull request #3670 from sameo/topic/nerdctl Support nerdctl OCI hooks	2022-02-22 23:03:33 +01:00
Fabiano Fidêncio	4729fd0fc2	Merge pull request #3736 from liubin/fix/3733-log-events-for-crio shim: log events for CRI-O	2022-02-22 09:19:37 +01:00
bin	f6fc1621f7	shim: log events for CRI-O CRI-O starts the shim process without setting TTRPC_ADDRESS, so the event-forwarding goroutine will get errors. For the CRI-O runtime, we can log the events to a log file. Fixes: #3733 Signed-off-by: bin <bin@hyper.sh>	2022-02-22 11:02:50 +08:00
Julio Montes	753d639bb3	Merge pull request #3741 from GabyCT/topic/updatecontributing docs: Update contributing link	2022-02-21 14:03:48 -06:00
Gabriela Cervantes	1d68a08f4b	docs: Update contributing link This PR updates the contributing documentation link to the one that is using kata 2.0 Fixes #3740 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-02-21 17:01:09 +00:00
Fabiano Fidêncio	e604f83c40	Merge pull request #3735 from fidencio/wip/kata-deploy-use-kata-with-qemu-as-the-default-shim-v2-binary kata-deploy: Use (kata with) qemu as the default shim-v2 binary	2022-02-21 14:52:55 +01:00
Fabiano Fidêncio	1e9f3c856d	Merge pull request #3553 from fgiudici/kata-monitor_cachefix kata-monitor: simplify sandbox cache management and attach kubernetes POD metadata to metrics	2022-02-21 13:17:22 +01:00
Peng Tao	031da99914	Merge pull request #3687 from luodw/nydus-clh nydus: add lazyload support for kata with clh	2022-02-21 19:31:45 +08:00
Jakob Naucke	9123fc098d	kata-deploy: Simplify Dockerfile and support s390x The kata-deploy Dockerfile is based on CentOS 7, which has no s390x support. Add an `IMAGE` argument to specify the registry, which still defaults to CentOS, but e.g. ClefOS can be selected instead. Other x86_64 assumptions are also removed. Other general simplifications are made. This does not address the more general issue of #3723 -- what we're doing here does not seem to be working with systemd >= something between 235-237. Fixes: #3722 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-02-21 11:06:54 +01:00
James O. D. Hunt	67c3195c9c	Merge pull request #3721 from Amulyam24/kernel-fix kernel: remove SYS_SUPPORTS_HUGETLBFS from powerpc fragments	2022-02-21 09:10:21 +00:00
Fabiano Fidêncio	11220f052f	kata-deploy: Use (kata with) qemu as the default shim-v2 binary When using kata-deploy, no `containerd-shim-kata-v2` binary is deployed, but we do deploy a `kata` runtime class, which seems very much inconsistent. As the default configuration for kata-containers points to QEMU, let's also use kata with QEMU as the default shim-v2 binary. Fixes: #3228, #3734 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-21 10:03:47 +01:00
luodaowen.backend	3175aad5ba	virtiofs-nydus: add lazyload support for kata with clh As kata with qemu has supported lazyload, so this pr aims to bring lazyload ability to kata with clh. Fixes #3654 Signed-off-by: luodaowen.backend <luodaowen.backend@bytedance.com>	2022-02-19 21:55:31 +08:00
zhanghj	94b831ebf8	virtcontainers: remove temp dir created for vsock in test code remove temp dir generated by mock.GenerateKataMockHybridVSock(). Fixes: #3186 Signed-off-by: zhanghj <zhanghj.lc@inspur.com>	2022-02-19 16:59:15 +08:00
James O. D. Hunt	a671b455a2	Merge pull request #3691 from Jakob-Naucke/fix-apply-patches packaging: Use `patch` for applying patches	2022-02-18 15:51:05 +00:00
Archana Shinde	7db9bef72c	Merge pull request #3718 from Kvasscn/kata_dev_fix_utils_assert_msg virtcontainers: Remove duplicated assert messages in utils test code	2022-02-18 06:07:16 -08:00
Amulyam24	8cc1b18636	kernel: remove SYS_SUPPORTS_HUGETLBFS from powerpc fragments The name of SYS_SUPPORTS_HUGETLBFS has been changed to ARCH_SUPPORTS_HUGETLBFS which is being selected on default by another kernel config. More info- `855f9a8e87` Change applicable from v5.13. Fixes: #3720 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2022-02-18 18:06:50 +05:30
Jakob Naucke	5c9d2b413f	packaging: Use `patch` for applying patches `tools/packaging/scripts/apply_patches.sh` uses `git apply $patch`, but this will not apply to subdirectories. If one wanted to apply with `git apply`, they'd have to run it with `--directory=...` _relative to the Git tree's root_ (absolute will not work!). I suggest we just use `patch`, which will do what we expected `git apply` would do. `patch` is also added to build containers that require it. Fixes: #3690 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-02-18 11:32:17 +01:00
Tim Zhang	12e83a99ed	Merge pull request #3699 from liubin/fix/3698-add-nydus-snapshotter-to-versions versions: add nydus-snapshotter	2022-02-18 17:42:58 +08:00
Fabiano Fidêncio	5b3fb6f83d	kernel: Build SGX as part of the vanilla kernel Let's take advantage of the fact that we've bumped our kernel version to the 5.15 LTS and enable SGX by default, as it's present there. Fixes: #3692 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-18 10:41:08 +01:00
Fabiano Fidêncio	2c35d8cb8e	workflows: Stop building the experimental kernel Let's stop building the experimental kernel as, currently, we have all the needed contents as part of the vanilla kernel. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-18 10:41:08 +01:00
Fabiano Fidêncio	32e7845d31	snap: Build vanilla kernel for all arches There's no need to build an experimental kernel for x86_64 as all the bits which were part of the experimental one (SGX only, really) are now part of the vanilla one. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-18 10:41:08 +01:00
Samuel Ortiz	27de212fe1	runtime: Always add network endpoints from the pod netns As the container runtime, we're never inspecting, adding or configuring host networking endpoints. Make sure we always do that by wrapping addSingleEndpoint calls into the pod network namespace. Fixes #3661 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-18 10:37:07 +01:00
James O. D. Hunt	f324305004	Merge pull request #3710 from GabyCT/topic/ulimidoc docs: Update limitations document	2022-02-18 09:20:09 +00:00
zhanghj	1cee0a9452	virtcontainers: Remove duplicated assert messages in utils test code Remove duplicated strings in assert.Errorf() and assert.NoErrorf(). Fixes: #3714 Signed-off-by: zhanghj <zhanghj.lc@inspur.com>	2022-02-18 16:45:05 +08:00
Gabriela Cervantes	6c1d149a5d	docs: Update limitations document This PR updates the limitations document by removing the docker references belonged to kata 1.x and add as a limitation the docker and podman support for kata 2.0 Fixes #3709 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-02-17 21:15:56 +00:00
Julio Montes	0b31b7ccc2	Merge pull request #3707 from devimc/2022-02-16/qemu-tdx packaging: support qemu-tdx	2022-02-17 12:20:05 -06:00
Julio Montes	7c4ee6ec48	packaging/qemu: create no_patches file for qemu-tdx create no_patches.txt file for qemu-tdx, this way we can build it using packaging scripts Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-02-17 09:17:57 -06:00
Julio Montes	d47c488b58	versions: add qemu tdx section define qemu tdx version and repo url fixes #3706 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-02-17 09:03:17 -06:00
Julio Montes	8d3ace4a7d	Merge pull request #3675 from jodh-intel/kata-manager-fix-install Kata manager fix install	2022-02-17 08:00:23 -06:00
Samuel Ortiz	77c29bfd3b	container: Remove VFIO lazy attach handling With the recently added VFIO fixes and support, we should not need that anymore. Fixes #3108 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-17 08:39:44 +01:00
bin	7241d618f1	versions: add nydus-snapshotter Add nydus-snapshotter to versions.yaml to install nydus-snapshotter from its own releases. Fixes: #3698 Signed-off-by: bin <bin@hyper.sh>	2022-02-17 14:09:20 +08:00
Peng Tao	9e618f1fb2	Merge pull request #3684 from fidencio/kernel-lts-5.15.x versions: Linux 5.15.x	2022-02-17 10:25:28 +08:00
Fupan Li	8694af6d92	Merge pull request #3657 from liubin/fix/3656-add-make-check-for-tools trace-forwarder/agent-ctl: run cargo fmt/clippy in make check	2022-02-17 10:05:16 +08:00
GabyCT	ced5e910d5	Merge pull request #3558 from jodh-intel/docs-rework-readme docs: Improve top-level README	2022-02-16 16:28:14 -06:00
Fabiano Fidêncio	6f9685fbf5	Merge pull request #3624 from mdlayher/mdl-vsock runtime: use github.com/mdlayher/vsock@v1.1.0	2022-02-16 23:11:47 +01:00
Fabiano Fidêncio	1f28e87e00	Merge pull request #3689 from fidencio/wip/clh-build-and-ship-a-tdx-capable-binary tools: Build cloud-hypervisor with "--features tdx"	2022-02-16 21:52:55 +01:00
Samuel Ortiz	26b3f0017c	virtcontainers: Split hypervisor into Linux and OS agnostic bits Keep all the OS agnostic bits in the hypervisor.go and hypervisor_ARCH.go files. Fixes #3614 Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-16 19:15:31 +01:00
Samuel Ortiz	fa0e9dc6b1	virtcontainers: Make all Linux VMMs only build on Linux Some of them (e.g. QEMU) can run on other OSes (e.g. Darwin) but the current virtcontainers implementation is Linux specific. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-16 19:07:34 +01:00
Samuel Ortiz	c91035d0e1	virtcontainers: Move non QEMU specific constants to hypervisor.go Hotplugging errors and 9pfs size are not particularly QEMU specific. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-16 19:07:34 +01:00
Samuel Ortiz	10ae05914c	virtcontainers: Move guest protection definitions to hypervisor.go They're not QEMU specific, other VMMs may implement support for it. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-16 19:07:31 +01:00
Samuel Ortiz	b28d0274ff	virtcontainers: Make max vCPU config less QEMU specific Even though it's still actually defined as the QEMU upper bound, it's now abstracted away through govmm. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-16 19:06:32 +01:00
Samuel Ortiz	a5f6df6a49	govmm: Define the number of supported vCPUs per architecture Based on what QEMU supports on those architectures. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-16 19:06:32 +01:00
Fabiano Fidêncio	be2e90469a	Merge pull request #3669 from fidencio/wip/virtiofsd-use-announce-submounts virtiofsd: Use "-o announce_submounts"	2022-02-16 16:43:18 +01:00
Fabiano Fidêncio	a6b4015130	tools: clh: Remove unused variables Right now we're getting the info for the Cloud Hypervisor repo and version, but we don't do anything with them, as those are not passed down to the build script. Moreover, the build script itself gets the info from exactly the same place when those are not passed, making those redundant. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-16 14:54:51 +01:00
Peng Tao	b4a1150638	Merge pull request #3344 from liubin/f/3342-hugepages-support feature: hugepages support	2022-02-16 21:52:26 +08:00
Fabiano Fidêncio	5816c132ec	tools: Build cloud-hypervisor with "--features tdx" Right now TDx support on Cloud Hypervisor is gated behind a "--features tdx" flag. However, having TDx support enabled should not and does not impact on the general usability of cloud-hypervisor. As sooner than later we'll need kata-deploy binaries to be tested on a CI that's TDx capable, for the confidential containers effort, let's take the bullet and already enable it by default. By the way, touching kata-deploy-binaries.sh as it ensures the change will be used in the following workflows: * kata-deploy-push * kata-deploy-test * release Fixes: #3688 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-16 14:51:15 +01:00
Carlos Venegas	e6060cb7c0	versions: Linux 5.15.x Upgrade to new Linux kernel LTS version. Fixes: #3576 Signed-off-by: Carlos Venegas <jose.carlos.venegas.munoz@intel.com>	2022-02-16 11:12:44 +01:00
James O. D. Hunt	9818cf7196	docs: Improve top-level and runtime README Various improvements to the top-level README file: - Moved the following sections from the runtime's README to the top-level README: - License - Platform support / Hardware requirements - Added the following sections to the top-level README: - Configuration - Hypervisors - Improved formatting of the Documentation section in the top-level README. - Removed some unused named links from the top-level README. Also improvements to the runtime README: - Removed confusing mention of the old 1.x runtime name. - Clarify the binary name for the 2.x runtime and the utility program. > Note: > > We cannot currently link to the AMD website as that site's > configuration causes the CI static checks to fail. See > https://github.com/kata-containers/tests/issues/4401 Fixes: #3557. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-02-16 09:52:48 +00:00
Fabiano Fidêncio	d0c8eb7e14	Merge pull request #3673 from fidencio/wip/allow-passing-a-build-flag-to-cloud-hypervisor tools: clh: Allow to set when to build from sources and the build flags passed down to cargo	2022-02-16 09:45:54 +01:00
bin	36c3fc12ce	agent: support hugepages for containers Mount hugepage directories and configure the requested number of hugepages dynamically by writing to sysfs files Port from: `78b307b5bd` Fixes: #3342 Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com> Signed-off-by: bin <bin@hyper.sh>	2022-02-16 15:14:53 +08:00
bin	81a8baa5e5	runtime: add hugepages support Add hugepages support, port from: `b486387cba` Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com> Signed-off-by: bin <bin@hyper.sh>	2022-02-16 15:14:53 +08:00
bin	7df677c01e	runtime: Update calculateSandboxMemory to include Hugepages Limit Support hugepages and port from: `96dbb2e8f0` Fixes: #3342 Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com> Signed-off-by: bin <bin@hyper.sh>	2022-02-16 15:14:37 +08:00
GabyCT	1dcb413e68	Merge pull request #3677 from GabyCT/topic/removedockerrun docs: Remove docker run and shared memory from limitations	2022-02-15 15:04:36 -06:00
Fabiano Fidêncio	948a2b099c	tools: clh: Ensure the download binary is executable We're downloading the released cloud-hypervisor binary from GitHub, but we should also ensure we set the binary as executable. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-15 20:23:46 +01:00
bin	72bf5496fd	agent: handle hook process result Current hook process is handled by just calling unwrap() on it, sometime it will cause panic. By handling all Result type and check the error can avoid panic. Fixes: #3649 Signed-off-by: bin <bin@hyper.sh>	2022-02-15 19:01:54 +01:00
bin	80e8dbf1f5	agent: valid envs for hooks Envs contain null-byte will cause running hooks to panic, this commit will filter envs and only pass valid envs to hooks. Fixes: #3667 Signed-off-by: bin <bin@hyper.sh>	2022-02-15 19:01:54 +01:00
Samuel Ortiz	4f96e3eae3	katautils: Pass the nerdctl netns annotation to the OCI hooks We need to let nerdctl know which namespace to use when calling the selected CNI plugin. See https://github.com/containerd/nerdctl/issues/787 Fixes: #1935 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-15 18:11:23 +01:00
Samuel Ortiz	a871a33b65	katautils: Run the createRuntime hooks The preStart hooks are being deprecated over the createRuntime ones. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-15 17:31:56 +01:00
Samuel Ortiz	d9dfce1453	katautils: Run the preStart hook in the host namespace The OCI spec is very specific about it: "The prestart hooks MUST be executed in the runtime namespace." Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-15 17:31:56 +01:00
Samuel Ortiz	6be6d0a3b3	katautils: Pass the OCI annotations back to the called OCI hooks That allows us to amend those annotations with information that could be used when running those hooks. For example nerdctl will use those annotations to resolve the networking namespace path in where to run the CNI plugin, i.e. the created pod networking namespace. Fixes #3629 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-15 17:31:56 +01:00
James O. D. Hunt	493ebc8ca5	utils: Update kata manager docs Update the `kata-manager.sh` README to recommend users view the available options before running the script. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-02-15 16:05:54 +00:00
James O. D. Hunt	34b2e67d48	utils: Added more kata manager cli options Added CLI options to the `kata-manager.sh` script to: - Force installation - Disable cleanup (retain downloaded files) - Only install Kata (don't consider containerd). > Note: > > This change introduces a subtle behaviour difference: > > - Previously, the script would error if containerd was already installed. > > - Now, the script will detect the existing installation and skip > trying to install containerd. > > This new behaviour makes more sense for most users but if you wish > to use the old behaviour, you (now) need to run the script specifying > the `-f` (force) option. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-02-15 16:05:54 +00:00
James O. D. Hunt	714c9f56fd	utils: Improve containerd configuration `kata-manager.sh` improvements for containerd: - Fixed containerd default branch (which is now `main`). - Only install service file if it doesn't already exist. - Enable the containerd service to ensure it can be started. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-02-15 16:05:54 +00:00
James O. D. Hunt	c464f32676	utils: kata-manager: Force containerd sym link creation For consistency with the rest of the script force the creation of a symbolic link for containerd in `kata-manager.sh`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-02-15 16:05:54 +00:00
James O. D. Hunt	4755d004a7	utils: Fix unused parameter Actually make use of the `requested_version` parameter in `kata-manager.sh` and added a comment. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-02-15 16:05:54 +00:00
James O. D. Hunt	601be4e63b	utils: Fix containerd installation Fix bug introduced inadvertently on #3330 which fixes the Kata installation, but unfortunately breaks installing containerd. The new approach is to check that the download URL matches a project-specific regular expression. Also improves the architecture test to handle the containerd architecture name (`amd64` rather than `x86_64`). Fixes: #3674. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-02-15 16:05:54 +00:00
James O. D. Hunt	ae21fcc799	utils: Fix Kata tar archive check The static tar archive published on GitHub (now) contains `./` which is being flagged as an "unknown path" and resulting in the `kata-manager.sh` script failing. Partially fixes: #3674. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-02-15 16:05:54 +00:00
James O. D. Hunt	f4d1e45c33	utils: Add kata-manager CLI options for kata and containerd Add options to `kata-manager.sh` to allow the version of Kata and containerd to be specified. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-02-15 16:05:52 +00:00
Gabriela Cervantes	395cff480d	docs: Remove docker run and shared memory from limitations This PR removes the docker run and shared memory segment from the limitations document as for kata 2.0 we do not support docker and this is no longer valid. Fixes #3676 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-02-15 15:29:12 +00:00
Fabiano Fidêncio	e07545a23c	tools: clh: Allow passing down a build flag Let's allow passing down a build flag to cargo, when building Cloud Hypervisor. By doing this we allow calling this script with: ``` extra_build_flags="--features tdx" ./build-static-clh.sh ``` Fixes: #3671 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-15 14:14:30 +01:00
Fabiano Fidêncio	55cdef2295	tools: clh: Add the possibility to always build from sources The current code will always pull the release binaries in case the version requested by Kata Containers matches with a released version. This, however, has a limitation of preventing users / CIs to build cloud-hypervisor from source for a reason or another, such as passing a specific build flag to cloud-hypervisor. This is a pre-req to solving https://github.com/kata-containers/kata-containers/issues/3671. While here, small changes were needed in order to improve readability and debugability of why we're building something from the sources rather than simply downloading and using a pre-built binary. Fixes: #3672 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-15 14:13:51 +01:00
James O. D. Hunt	3f87835a0e	utils: Switch kata manager to use getopts Use `getopts(1)` for command line argument parsing in `kata-manager.sh`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-02-15 08:55:54 +00:00
Fabiano Fidêncio	4bd945b67b	virtiofsd: Use "-o announce_submounts" German Maglione, one of the current virtio-fs developers, has brought to our attention that using "announce-submounts" could help us to prevent inode number collisions. This feature was introduced a year ago or so by Hanna Reitz as part of the 08dce386e77eb9ab044cb118e5391dc9ae11c5a8, and as we already mandate QEMU >= 6.1.0, let's take advantage of that. Fixes: #3507 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-15 08:52:03 +01:00
Yu Li	37df1678ae	build: always reset ARCH after getting it When building with `ARCH=x86_64`, the previous `Makefile` will use it without checking and cause: Makefile:319: *** "ERROR: No hypervisors known for architecture x86_64 (looked for: acrn firecracker qemu cloud-hypervisor)". Stop. This commit fixes the above issue by checking `ARCH` no matter where it is assigned. Fixes: #3444 Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>	2022-02-15 14:26:34 +08:00
Fabiano Fidêncio	a3b3274121	Merge pull request #3664 from fidencio/clh-update-to-55479a64d237 versions: Update Cloud Hypervisor to 55479a64d237	2022-02-15 00:52:42 +01:00
Shengjing Zhu	3a641b56f6	katatestutils: remove distro constraints The distro constraint parses os release files, which may not contain distro version(VERSION_ID field), for example rolling release distributions like Debian testing, archlinux. These distro constraints are not used anyway, so removing them instead of fixing the complex version detection. Fixes: #1864 Signed-off-by: Shengjing Zhu <zhsj@debian.org>	2022-02-15 02:11:52 +08:00
Fabiano Fidêncio	90fd625d0c	versions: Update Cloud Hypervisor to 55479a64d237 Let's update cloud-hypervisor to a version that exposes the TDx support via the OpenAPI's auto-generated code. Fixes: #3663 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-14 17:32:30 +01:00
Eric Ernst	1873fd2641	Merge pull request #3660 from devimc/2022-02-11/packaging/supportKernelTDx kernel: add missing config fragment for TDx	2022-02-14 08:18:59 -08:00
Jakob Naucke	573a37b33b	osbuilder: Add CentOS Stream rootfs to cover a Red Hat (adjacent) rootfs with great cross-platform compatibility and a workable release cadence. The previous CentOS & Fedora workflows are simplified. Also remove unnecessary `/usr/share` files as on Ubuntu and mark Alpine as unsupported on ppc64le (due to musl, for a while already). Fixes: #3340 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-02-14 15:06:07 +01:00
Jakob Naucke	f10642c82b	osbuilder: Source .cargo/env before checking Rust We install Rust in the build containers, but we also install Rust in `rootfs.sh` if it is missing. It makes sense to install Rust in the build containers so it does not have to be installed every time, but for that check to work on non-login shells, we should source `.cargo/env` before running it. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-02-14 15:06:07 +01:00
Julio Montes	955d359f9e	kernel: add missing config fragment for TDx Add kernel config fragment that enables TDx fixes #3659 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-02-14 07:40:12 -06:00
James O. D. Hunt	8f80dffead	Merge pull request #3648 from yaoyinnan/index-in-for runtime: The index variable is initialized multiple times in for	2022-02-14 12:36:46 +00:00
James O. D. Hunt	3d3af84cde	Merge pull request #3636 from Kvasscn/kata_dev_fix_check_build_type scripts: fix a typo while to check build_type	2022-02-14 12:33:59 +00:00
bin	734b618c16	agent-ctl: run cargo fmt/clippy in make check Run cargo fmt/clippy in make check and clear clippy warnings. Fixes: #3656 Signed-off-by: bin <bin@hyper.sh>	2022-02-14 20:12:57 +08:00
bin	12c37fafc5	trace-forwarder: add make check for Rust Add make check to run cargo fmt/clippy for Rust projects. Fixes: #3656 Signed-off-by: bin <bin@hyper.sh>	2022-02-14 20:12:48 +08:00
Fabiano Fidêncio	7ae8901a66	Merge pull request #3483 from fidencio/wip/bump-crio-to-its-1.23-release versions: bump CRI-O to its 1.23 release	2022-02-14 10:06:51 +01:00
Bin Liu	cf53ec2c71	Merge pull request #2977 from luodw/support_nydus feature(nydusd): add nydusd support to introduce lazyload ability	2022-02-14 13:08:50 +08:00
Eric Ernst	172fac5cc8	Merge pull request #3613 from hxtmdev/markdown-relative docs: Fix relative links in Markdown	2022-02-13 21:01:41 -08:00
Fabiano Fidêncio	56c51fba4b	Merge pull request #3651 from devimc/2022-02-11/packaging/supportKernelTDx kernel: support TDx	2022-02-13 13:13:38 +01:00
Matt Layher	c1ce67d905	runtime: use github.com/mdlayher/vsock@v1.1.0 Fixes #3625 Signed-off-by: Matt Layher <mdlayher@gmail.com>	2022-02-12 19:57:15 -05:00
yaoyinnan	42a878e6c1	runtime: The index variable is initialized multiple times in for Change the variables `mountTypeFieldIdx := 8`, `mntDestIdx := 4` and `netNsMountType := "nsfs"` to const. And unify the variable naming style, modify `mntDestIdx` to `mountDestIdx`. Fixes: #3646 Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>	2022-02-12 11:10:10 +08:00
Julio Montes	1797b3eb04	packaging/kernel: build TDX guest kernel Add support for building TDX kernel from github.com/intel/tdx To build a guest kernel that supports Intel TDx run: ``` ./build-kernel.sh -s -x tdx -d setup ./build-kernel.sh -s -x tdx -d install ``` fixes #3650 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-02-11 16:00:32 -06:00
Julio Montes	9875252917	versions: add url and tag for tdx kernel Add url and tag for tdx kernel Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-02-11 15:44:18 -06:00
Julio Montes	bc8464e04f	packaging/kernel: add option -s option Add -s option to skip .config checks Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-02-11 15:44:03 -06:00
Julio Montes	dfbde2e06c	Merge pull request #3643 from dgibson/vfio-env-fix device: Actually update PCIDEVICE_ environment variables for the guest	2022-02-11 10:47:33 -06:00
luodaowen.backend	2d9f89aec7	feature(nydusd): add nydusd support to introduce lazyload ability Pulling the image is the most time-consuming step in the container lifecycle. This PR introduces nydus to Kata Containers; it can lazily pull the image when the container starts, which speeds up Kata container creation and start. Fixes #2724 Signed-off-by: luodaowen.backend <luodaowen.backend@bytedance.com>	2022-02-11 21:41:17 +08:00
Daniel Höxtermann	b19b6938a8	docs: Fix relative links in Markdown Relative links within this repository allow for easier navigation to the corresponding file / directory in the current commit / for the selected version. Link text was slightly changed / fixed in - docs/Unit-Test-Advice.md - docs/how-to/how-to-run-docker-with-kata.md Fixes #3045 Signed-off-by: Daniel Höxtermann <daniel@hxtm.dev>	2022-02-11 13:49:42 +01:00
David Gibson	9590874d9c	device: Update PCIDEVICE_ environment variables for the guest In commit 78dff468bf1 we introduced logic to rewrite PCIDEVICE_ environment variables for the container so that they contain correct addresses for the Kata VM rather than for the host. Unfortunately, we never actually invoked the function to do this. It turns out we need to do this not only at container creation time, but also for environment variables supplied to processes exec-ed into the container after creation (e.g. with crictl exec). Add calls to make both those updates. fixes #3634 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-02-11 13:46:36 +11:00
David Gibson	7b7f426a3f	device: Keep host to VM PCI mapping persistently add_devices() generates a mapping of host to guest PCI addresses which is used to update some environment variables for the workload. Currently it just does this locally, but it turns out we're going to need the same map again in order to correct environment variables for processes exec-ed into the existing container. Move the map to the sandbox structure so we can keep it around for those later uses. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-02-11 13:46:17 +11:00
David Gibson	0b2bd64124	device: Rework update_spec_pci() to update_env_pci() This function updates PCIDEVICE_ environment variables (such as those supplied by the Kubernetes SR-IOV plugin) in the OCI spec to be correct for the Kata VM, rather than for the host. We neglected to actually call this function, however, and it turns out that when we do, we need to do things slightly differently. We actually need to adjust environment variables both in the OCI spec when creating a container and also in the variables supplied for exec-ing a new process within an existing container. Adjust the function so that it can be used for both these cases. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-02-11 13:46:05 +11:00
Eric Ernst	88b3e9e848	Merge pull request #3617 from hxtmdev/fluentd-link docs: Update link to EFK stack docs	2022-02-10 12:50:17 -08:00
Julio Montes	046aae7e52	Merge pull request #3619 from devimc/2021-02-03/supportQEMUSGX runtime: support QEMU SGX	2022-02-10 11:36:49 -06:00
Julio Montes	982f14fa66	runtime: support QEMU SGX Enable SGX in QEMU when `sgx.intel.com/epc` annotation is defined fixes #3436 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-02-10 09:45:48 -06:00
Daniel Höxtermann	40aa43f429	docs: Update link to EFK stack docs Fixes #3616 Signed-off-by: Daniel Höxtermann <daniel@hxtm.dev>	2022-02-09 15:32:21 -08:00
Fabiano Fidêncio	0f856da402	Merge pull request #3628 from jongwu/snap_qemu_version snap: update qemu version to 6.1.0 for arm	2022-02-09 20:12:28 +01:00
zhanghj	54e1faec4c	scripts: fix a typo while to check build_type check $build_type is not an empty string instead of equal to "true". Fixes: #3635 Signed-off-by: zhanghj <zhanghj.lc@inspur.com>	2022-02-09 17:13:04 +08:00
Eric Ernst	901a9d7cad	Merge pull request #3612 from snir911/release_fixes Release process related fixes	2022-02-08 16:36:14 -08:00
Samuel Ortiz	07b9d93f5f	virtcontainer: Simplify the sandbox network creation flow We don't need to call NewNetwork() twice, and we can have the VM factory case return immediately. That makes the code more readable. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	2c7087ff42	virtcontainers: Make all endpoints Linux only All of the networking endpoints are Linux specific. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	49d2cde1e2	virtcontainers: Split network tests into generic and OS specific parts Some unit tests are generic while others, mostly because they depend on netlink, are Linux specific. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	0269077ebf	virtcontainers: Remove the netlink package dependency from network.go Move the netlink dependent code into network_linux.go. Other OSes will have to provide the same functions. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	7fca5792f7	virtcontainers: Unify Network endpoints management interface And only have AddEndpoints/RemoveEndpoints for all cases (single endpoint vs all of them, hotplug or not). Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	c67109a251	virtcontainers: Remove the Network PostAdd method It's used once by the sandbox code and can be implemented directly there. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	e0b264430d	virtcontainers: Define a Network interface And move the Linux implementation into a GOOS specific file. Fixes #3005 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	5e119e90e8	virtcontainers: Rename the Network structure fields and methods We are converting the Network structure into an interface, so that different host OSes can have different networking implementations for Kata. One step into that direction is to rename all the Network structure fields and methods to something that is less Linux networking namespace specific. This will make the Network interface naming consistent. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	b858d0dedf	virtcontainers: Make all Network fields private Prepare for making it a real interface. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	49eee79f5f	virtcontainers: Remove the NetworkNamespace structure It is now replaced with a single Network structure Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	844eb61992	virtcontainers: Have CreateVM use a Network reference We are replacing the NetworkingNamespace structure with the Network one, so we should have the hypervisor interface switching to it as well. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	d7b67a7d1a	virtcontainers: Network API cleanups and simplifications Remove unused parameters. Reduce the number of parameters by deriving some of them (e.g. a networking config) from their outer structure (e.g. a Sandbox reference). Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	2edea88369	virtcontainers: Make the Network structure manage endpoints Endpoint creation, attachment and hotplug are bound to the networking namespace described through the Network structure. Making them Network methods is natural and simplifies the code. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	8f48e28325	virtcontainers: Expand the Network structure For simplicity's sake, there should only be one networking structure per sandbox, as opposed to two (Network and NetworkingNamespace) currently. This commit starts expanding the Network structure in order to eventually make it the single representation of a virtcontainers sandbox networking. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Fabiano Fidêncio	193f7a4626	Merge pull request #3606 from wainersm/openshift-ci_stream8 openshift-ci: switch to CentOS Stream	2022-02-08 21:26:15 +01:00
Pierre Kohler	5ef522f7c3	runtime: check kvm module `sev` correctly Runtime now accepts both `1` and `Y` as valid values for kvm_amd module parameter kvm_amd.sev. Fixes #3273 Signed-off-by: Pierre Kohler <pierre.kohler@cysec.systems>	2022-02-07 23:48:47 +01:00
Jianyong Wu	419d813427	snap: update qemu version to 6.1.0 for arm Update qemu version of snap for arm to 6.1.0 thus the arch specific qemu version for arm needs clean up. Fixes: #3627 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-02-07 14:48:23 +08:00
Snir Sheriber	007221875e	docs: update Release-Process.md with a reminder to test kata-deploy Fixes: #3611 Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-02-06 09:15:57 +02:00
Snir Sheriber	496bc10de2	tools: check for yq before using it as get_from_kata_deps may be called from scripts that do not install_yq Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-02-06 09:14:31 +02:00
Fabiano Fidêncio	88a70d32ba	Revert "workflows: Ensure a label change re-triggers the actions" This reverts commit `7a879164bd`, as it's been proved that re-triggering the checks at every single change is more painful than having to close / re-open a PR in case we ever use the `force-skip-ci` label again. Fixes: #2804 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-04 00:01:21 +01:00
Eric Ernst	e8eb5e8295	Merge pull request #3609 from egernst/rootless-linux virtcontainers: Split the rootless package into OS specific parts	2022-02-03 12:19:31 -08:00
GabyCT	3603105669	Merge pull request #3584 from devimc/2022-01-31/splitTDVF runtime: support split firmware	2022-02-03 10:24:20 -06:00
Wainer dos Santos Moschetta	a9bebb3169	openshift-ci: switch to CentOS Stream The build root container is switched from CentOS 8 to Stream 8 as the former reached EOL. Fixes #3605 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2022-02-02 19:50:01 -03:00
Eric Ernst	c78ffe4cc8	Merge pull request #3587 from egernst/kata-test-deploy-action kata-deploy: for testing, make sure we use the PR branch	2022-02-02 12:09:11 -08:00
Eric Ernst	89047901b3	kata-deploy-push: only run if PR modifying tools path Since we are using this to exercise any changes to osbuilder or packaging scripts, let's make sure that we only run the test in that case. Similarly, don't run for every single push. Just run this workflow for pull requests. Fixes: #3594 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-02-02 10:16:18 -08:00
GabyCT	43f68252ff	Merge pull request #3582 from GabyCT/topic/removezun docs: Remove Zun documentation with kata containers	2022-02-02 10:54:56 -06:00
Jakob Naucke	7ffe9e5198	virtcontainers: Do not add a virtio-rng-ccw device On s390x, skip adding a virtio-rng device. The on-chip CPACF provides entropy instead. For Confidential Containers, when using Secure Execution, entropy attacks on virtio-rng are mitigated. Fixes: #3598 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-02-02 17:06:20 +01:00
Fabiano Fidêncio	6d6748afd7	Merge pull request #3351 from Bevisy/main-2610-fix-args agent: Fix execute_hook() args error	2022-02-02 09:45:25 +01:00
Fabiano Fidêncio	1e20baf646	Merge pull request #3565 from Tim-Zhang/commit-message-check-filter-out-revert-commit workflows: stop checking revert commit	2022-02-02 09:38:47 +01:00
Julio Montes	1f29478b09	runtime: support split firmware Firmware can be split into FIRMWARE_VARS.fd (UEFI variables as configuration) and FIRMWARE_CODE.fd (UEFI program image). UEFI variables can be customized for each user while the UEFI code is kept the same. fixes #3583 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-02-01 13:40:19 -06:00
Eric Ernst	24796d2f25	kata-deploy: for testing, make sure we use the PR branch Since we are already checking that only an admin is triggering the job, let's go ahead and make sure we are testing against the PR itself. This will ensure that we are exercising changes to kata-deploy tooling, which is important for this test. While at it, cleanup and simplify some of the tarball creation. Fixes: #3586 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-02-01 10:53:30 -08:00
Gabriela Cervantes	1cc1c8d058	docs: Remove images from Zun documentation This PR removes the images that belonged to the Zun documentation in the use cases directory. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-02-01 18:13:22 +00:00
Gabriela Cervantes	5861e52f8d	docs: Remove Zun documentation with kata containers This PR removes the Zun documentation use case with kata containers, mainly because it is no longer valid: it uses docker with clear containers 2.0 as a reference, which is no longer supported, and it also uses docker to test kata with openstack zun, and docker is also not supported. Fixes #3581 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-02-01 16:29:06 +00:00
Greg Kurz	a31cde1224	Merge pull request #3578 from snir911/2.4.0-alpha2-branch-bump # Kata Containers 2.4.0-alpha2	2022-02-01 16:36:05 +01:00
Fabiano Fidêncio	903a6a455d	versions: Bump critools to its 1.23 release critools v1.23.0 was released a few days ago. As we're already bumping kubernetes and CRI-O, let's also update critools. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-01 10:50:15 +01:00
Fabiano Fidêncio	63eb115890	versions: bump CRI-O to its 1.23 release As done for kubernetes, CRI-O should also be bumped to its 1.23 release so those are in sync. Fixes: #3481 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-01 10:50:15 +01:00
Snir Sheriber	26e08b273c	release: Kata Containers 2.4.0-alpha2 - virtcontainers: Enable initrd for Cloud Hypervisor - versions: update Rust to 1.58.1 - Sandbox sizing feature - kata-deploy: Fix the tag replacement logic - docs: Update networking details in the architecture doc - Fix and re-enable s390x GoVMM tests - runtime: fix handling container spec's memory limit - ci: Pass function arguments in static-checks.sh - docs: Remove docker run and sysctl limitation - runtime: update runc and image-spec dependencies - agent: resolve unused variables in tests - Upgrade to Cloud Hypervisor v21.0 - runtime: rectify passing empty options to -ldflags - osbuilder: Remove libseccomp from Dockerfile - agent: fix the issue of creating new namespaces for agent - docs: Remove kata-pkgsync reference - docs: Redirect glossary to the wiki - workflows: Use base instead of head ref for kata-deploy-test - govmm: Use it from our own repo - tools: Fix groupname if it differs from username - workflows: Fix typo in kata-deploy-push action - release: Escape backticks in Libseccomp Notices - packaging: Remove kata-pkgsync tool - govmm: Bring the project in - version: bump to kubernetes 1.23 - vendor: update govmm - workflows: Ensure force-skip-ci skips all actions - runtime: -Wl,--s390-pgste for s390x - workflows: Use the correct branch ref on test kata-deploy - update apiVersion - scripts: Use shebang /usr/bin/env bash - packaging: Make kernel config accessible to guest - docs: fix a typo in host-cgroups.md doc - qemu: add support for SGX - experimentally enable the vcpu-hotplug for arm in qemu side - Remove all the non-tested rootfs - docs: Remove ccloudvm reference - runtime: Provide protection for shared data - kata-deploy: validate conf file can be created - runtime: it should rollback when failed in Sandbox AddInterface - libs: add some generated files to .gitignore - runtime: close span before return from function in case of error - packaging: Remove ccloudvm instructions and script 
- docs: Default machine type is q35 meanwhile - CI: Revert "CI: Switch to a mirror as gnu.org is down" - agent: fix the broken protobuf generation code - packaging: Remove obs packages testing for kata 2.0 - runtime: Remove docker comments for kata 2.0 configuration.tomls - docs: fix agent proto file path - qemu: update readonly flag for block devices - qemu: only set wait parameter for server mode socket based char device - qemu: Fix 32 bit int overflow in test file - qemu: Add support for legacy serial device - qemu: Remove -realtime in favor of -overcommit - Add clean shutdown support - govmm/qemu: Let IO/memory reservations be specified for bridge devices - QMP: Add ExecuteBlockdevAddWithDriverCache - qemu: Fix iommu_platform for CCW - qemu: Add credentials to qemu Cmd - Don't use deprecated 'props' argument to QMP 'object-add' - Use 'host_device' driver for blockdev backends - add support for "sandbox" feature to qemu - qemu: support read-only nvdimm - Support golang 1.16 - qemu: Consistent parameter building - qemu: Allow hot-plugging memory devices on PCI bridges - qemu: Add support for PEF - qemu: Add support for Secure Execution - qemu: VhostUserDevice CCW device numbers - qmp: remove chatty log - Fix qemu commandline issue with empty romfile - qemu: add support for tdx-guest object - qemu: Append memory backend for non-DIMM setups - qemu: add support for device loaders - qemu: support QEMU 6 - qmp: Add ro argument for block-device hotplug funcs - qemu: add arm64 to support list of dimm - qemu: enable "-pflash" - qemu: add pvpanic and dump guest memory support - Add serial ID to blk device - Make fw_cfg a slice - contributors: remove CONTRIBUTORS.md file - misc: Update for new GitHub organisation name - qemu: add fw_cfg flag to config - Add qom-get function - typo fix - Add support for hot-plugging IBM Adjunct Processor (AP) devices - github: enable github actions - travis: Run coveralls after success - qemu: add iommu_platform knob for qemuParams - qemu: 
Add NoReboot config Knob for qemuParams - Add multidevs option to fsdev - qemu/qmp: use boolean type for the vhost - qemu: add IOMMU Device - Enable Numa support for Power (ppc64le) architecture - qemu: Add max_ports option to virtio-serial device - Add rt clock definition for rtc clock in qemu - qemu: Add microvm machine type support - qemu: add pmem flag to memory-backend-file - Refactor code to support multiple virtio transports at runtime - qemu: Don't set ".cache-size=" when CacheSize is 0 - qemu: Add pcie-root-port device support. - qmp: Add ExecMemdevAdd and ExecQomSet API - qmp: add ExecutePCIVhostUserDevAdd and ExecuteChardevDel to hotplug vhost-user device - s390x: add s390x travis support - virtio-blk: Add support for share-rw flag - s390x: dimm not supported - improve qemu interaction - qmp: support command 'query-qmp-schema' - qmp: add checks for the CPU toplogy - qemu: support x86 SMP die - Support x-pci-vendor-id and x-pci-device-id pass to qemu - Support for virtio-blk-ccw - Allow sharing of memory backend file - qemu: add migration incoming defer support - qmp: add virtio-blk multiqueue - qemu: fix the issue of wrong driver for VirtioBlock - qemu: use MiB instead of Gib for virtio-fs cache size - qemu/qmp: re-implement mainLoop - qemu/qmp: fix readLoop() reuse scanner.Bytes() underlying array problem - govmm: add VhostUserFS vhost-user device type - qmp: Conditionally pass threadID and socketID when CPU device add - Fix travis - qmp: Add nvdimm support - qemu: Allow disable-modern option from QMP - qmp: Output error detail when execute QMP command failed - Run tests for the s390x build - Contributors: Add Clare Chen to CONTRIBUTORS.md - Verify govmm builds on s390x - Contributors: Add my name - qemu: Add s390x support - Update file headers , CONTRIBUTING.md and add CONTRIBUTORS.md - qmp: fix mem-path properties for hotplug memory. 
- qemu: change Context ID for Vsock to uint64 - qemu/qmp: preparation for s390x support - qemu/qmp: add new function ExecuteBlockdevAddWithCache - qemu: add support for pidfile option - qemu: Fix virtio-net-pci QMP command - qemu: Add support for romfile option - Update guidelines on security issue reporting - qemu: Add virtio-balloon device suppport. - qemu: Show full path to qemu binary at launch time - qemu: Fix the support of PCIe bridge - qmp: add ExecuteQueryMigration - qemu: skip setting system memory if it is set via dimm device - qmp: add "query-cpus" support - qemu/qmp: add vfio mediated device support on root bus - qemu/image: Reduce permissions of .iso creation dir - qemu/qmp: nic can works without vhost - qemu: Add rng device . - qemu/qmp: support query-memory-devices qmp command. - govmm: modify govmm to be compatible with qemu 2.8 - qemu/qmp: support hotplug a nic whose qdisc is mq - qmp: Remind users that you must first call ExecuteQMPCapabilities() - qemu/qmp: Add netdev_add with chardev support - Add some negative test cases - qemu: Use the supplied context.Context for launching - disk: Add --share-rw option for hotplugging disks - qemu/qmp: add vfio mediated device support - qemu: Do not try and generate invalid RTC parameters - qemu/qmp: add addr and bus to hotplug vsock devices - qemu/qmp: add function for hotplug network by fds - qemu/qmp: implement functions to hotplug chardevs and serial ports - qemu: add vhostfd and disable-modern to vsock hotplug - Add two additional static analysis tools to the travis builds - qemu/qmp: implement function for hotplug network - qemu: add vhostfd and disable-modern to vhost-vsock-pci - qemu/qmp: implement function to hotplug vsock-pci - Add APIs to enable vm templating - qemu: Add qemu parameter for PCI address for a bridge. 
- Add ability to associate a SCSI controller device with an iothread - qemu: add initrd support - qemu: add DisableModern to SCSIController - qemu: add extra options for the machine type - scsi: Add function to send device_add qmp command for a scsi device - Compute coverage statistics for unit tests in Travis builds - scsi: Add a scsi controller device - qemu: Add VSOCK support - Vhost-user: add block device support - qemu: Add maxcpus attribute to -smp - Add badges to the README.md file - Enable Travis builds - qemu: introduce vhost-user handling `bcce1a19` versions: update Rust to 1.58.1 `7c956e0d` virtcontainers: Enable initrd for Cloud Hypervisor `aa3fae13` kata-deploy: Fix the tag replacement logic `8cde5413` runtime: introduce static sandbox resource management `13eb1f81` docs: describe vCPU handling when hotplug is unavailable `c3e97a0a` config: updates to configuration clh, fc toml template `75ae5361` docs: Update networking details in the architecture doc `fc0e0951` runtime: fix handling container spec's memory limit `7af40fbc` docs: Remove docker run, sysctl and docker daemon limitations `17211979` ci: Pass function arguments in static-checks.sh `5643c6dc` runtime: update runc and image-spec dependencies `2f37165f` govmm: Unite VirtioNet tests `4a428fd1` govmm: readonly=on in s390x blkdev test `79ecebb2` govmm: TestAppendPCIBridgeDevice et al. 
on !s390x `dc285ab1` govmm: Remove unnecessary comma in iommu_platform `d23f2eb0` govmm: Revert "govmm: s390x: Skip broken tests" `f52ce302` runtime: rectify passing empty options to -ldflags `2d799cbf` virtcontainers: clh: Re-generate the client code `7e15e99d` versions: Upgrade to Cloud Hypervisor v21.0 `9c2f1de1` docs: Remove kata-pkgsync reference `df6ae1e7` osbuilder: Remove libseccomp from Dockerfile `0338fc65` docs: Redirect glossary to the wiki `3924470c` workflows: Use base instead of head ref for kata-deploy-test `5ce9011a` govmm: s390x: Skip broken tests `8bcaed0b` govmm: Adapt license headers to kata-containers `6dd65779` govmm: Ignore govet checks, at least for now `de678a3a` govmm: Remove non-relevant top files `ec6655af` govmm: Use govmm from our own pkg `8cc088b5` packaging: Remove kata-pkgsync tool `a8b66de5` release: Escape backticks in Libseccomp Notices `c3785f66` workflows: Fix typo in kata-deploy-push action `f4a4c3c7` version: bump to kubernetes 1.23 `49223e67` runtime: remove enable_swap option `7a879164` workflows: Ensure a label change re-triggers the actions `d87ab14f` workflows: Ensure force-skip-ci skips all actions `5285ac2b` runtime: -Wl,--s390-pgste for s390x `fc646434` workflows: Use the correct branch ref on test kata-deploy `e347694f` tools: Fix groupname if it differs from username `41e0c414` vendor: update govmm `a5829a29` docs: fix a typo in host-cgroups.md doc `92773170` agent: resolve unused variables in tests `8939b0f8` qemu: add support for SGX `2d0ec00a` Qemu: Enable the vcpu-hotplug for arm `e22a4e2a` packaging: Make kernel config accessible to guest `adffd3f8` scripts: Use shebang /usr/bin/env bash `e4b7a12b` qat: Add Debian to the distro examples `6979d5be` osbuilder: Remove gentoo rootfs-builder `22c1a093` osbuilder: Remove suse rootfs-builder `85dd5873` osbuilder: Remove fedora rootfs-builder `06fae29f` osbuilder: Remove centos rootfs-builder `01005c5a` docs: Remove ccloudvm reference `878ab93c` runtime: Provide 
protection for shared data `ac7acbf8` kata-deploy: validate conf file can be created `7e2bc4d7` packaging: Remove ccloudvm instructions and script `85f5ae19` runtime: close span before return from function in case of error `106df33f` libs: add some generated files to .gitignore `b133a236` runtime: it should rollback when failed in Sandbox AddInterface `7f546748` CI: Revert "CI: Switch to a mirror as gnu.org is down" `c486c2ca` agent: fix the broken protobuf generation code `f6cdf464` docs: Default machine type is q35 meanwhile `b48322d4` packaging: Remove obs packages testing for kata 2.0 `ad16d75c` runtime: Remove docker comments for kata 2.0 configuration.tomls `905e124b` docs: fix agent proto file path `ea1a1738` agent: fix the issue of creating new namespaces for agent `b17f0739` qemu: update readonly flag for block devices `b5b9de1d` kata-deploy: Update API Version of RuntimeClass to v1 `f971801b` qemu: only set wait parameter for server mode socket based char device `82cc01d2` qemu: Fix 32 bit int overflow in test file `1d1a2313` qemu: Add support for legacy serial device `9a2bbeda` qemu: Remove -realtime in favor of -overcommit `fe83c208` qemu: Add support for --no-shutdown Knob `1ed52714` qmp: wait for POWERDOWN event in ExecuteSystemPowerdown() `de039da2` govmm/qemu: Let IO/memory reservations be specified for bridge devices `5c7998db` QMP: Add ExecuteBlockdevAddWithDriverCache `3a9a6749` qemu: Add credentials to qemu Cmd `d27256f8` qmp: Don't use deprecated 'props' field for object-add `d8cdf9aa` qemu: Drop support for versions older than 5.0 `18352c36` qemu: Fix iommu_platform for vhost user CCW `1b021929` Use 'host_device' driver for blockdev backends `9518675e` add support for "sandbox" feature to qemu `335fa816` qemu: fix golangci-lint errors `61b63787` .github/workflows: reimplement github actions CI `9d6e7970` go: support go modules `0d21263a` qemu: support read-only nvdimm `ff34d283` qemu: Consistent parameter building `0e19ffb6` qemu: Allow 
hot-plugging memory devices on PCI bridges `c135681d` qemu: Add support for PEF `03b55ea5` qemu: Add support for Secure Execution `7a367dc0` qemu: Simplify (Object).Valid() `a6cec2d3` qemu: add support for SevGuest object `abd3c7ea` qemu: VhostUserDevice CCW device numbers `3eaeda7f` qemu: Refactor vhostuserDev.QemuParams `511cf58b` Fix qemu commandline issue with empty romfile `b3eac95b` qmp: remove frequent, chatty log `31418940` qemu: add support for tdx-guest object `4b136f3f` qemu: Append memory backend for non-DIMM setups `6213dea4` qemu: support QEMU 6 `0d47025d` qemu: add support for device loaders `e2eb549f` qmp: Add ro argument for block-device hotplug funcs `0592c825` qemu: add arm64 to support list of dimm `2079c15c` qemu: enable "-pflash" `b8cd7059` qmp: add dump-guest-memory support `d7836877` qemu: add pvpanic device to get GUEST_PANICKED event `43d774d2` Add serial to blk device `8cb8b24c` Make fw_cfg a slice `cb0d3391` contributors: remove CONTRIBUTORS.md file `29ba5a90` qemu: add fw_cfg flag to config `9f309c2a` misc: Update for new GitHub organisation name `3d46d08a` Add qom-get function `39c372a2` Add support for hot-plugging IBM VFIO-AP devices `f5bdd53c` travis: disable amd64 jobs `1af1c0d7` github: enable github actions `4831c6e0` travis: Run coveralls after success `cf0f05d2` qemu: add iommu_platform knob for qemuParams `6645baf2` qemu: Add NoReboot config Knob for qemuParams `abca6f3c` Add multidevs option to fsdev `cc538766` qemu/qmp: use boolean type for the vhost `e57e86e2` qemu: add IOMMU Device `b2aa0225` Enable Numa support for Power (ppc64le) architecture `29529a5d` Add rt clock definition for rtc clock in qemu `0e98b613` qemu: Add max_ports option to virtio-serial device `787c86b7` qemu: Add microvm machine type support `5378725f` qemu: add pmem flag to memory-backend-file `3700c55d` qemu: add block device readonly support `88a25a2d` Refactor code to support multiple virtio transports at runtime `2ee53b00` qemu: Don't set 
".cache-size=" when CacheSize is 0 `f1252f6e` qemu: Add pcie-root-port device support. `6667f4e9` qmp_test: Add TestExecMemdevAdd and TestExecQomSet `201fd0ae` qmp: Add ExecMemdevAdd and ExecQomSet API `e04be2cc` qmp: add ExecutePCIVhostUserDevAdd API `13aeba09` qmp: support command 'chardev-remove' `6d6b2d88` s390x: add s390x travis support `175ac499` typo fix `cb9f640b` virtio-blk: Add support for share-rw flag `9463486d` s390x: dimm not supported `164bd8cd` test/fmt: drop extra newlines `73555a40` qmp: add query-status API `234e0edf` qemu: fix memory prealloc handling `30bfcaaa` qemu: add debug logfile `79e0d533` qmp: support command 'query-qmp-schema' `68cdf64f` test: add cpu topology tests `e0cf9d5c` qmp: add checks for the CPU toplogy `a5c11908` qemu: support x86 SMP die `8fd28e23` Support x-pci-vendor-id and x-pci-device-id pass to qemu `713d0d94` s390x: add virtio-blk-ccw type `65cc343f` test: add devno in the tests for s390x `9cf98da0` s390x: add devno support `0c900f59` Allow sharing of memory backend file `f695ddf8` qemu: add migration incoming defer support `f0f18dd0` qmp: add virtio-blk multiqueue `7d3deea4` qemu: Add a virtio-blk-pci device driver support `058cda06` qemu: use MiB instead of Gib for virtio-fs cache size `694a7b1c` qemu/qmp: re-implement mainLoop `5712b119` qemu/qmp: fix readLoop() reuse scanner.Bytes() underlying array problem `3c84b1da` govmm: add VhostUserFS vhost-user device type `4692f6b9` qmp: Conditionally pass threadID and socketID when CPU device add `1f51b438` Update the versions of Go used to build GoVMM `ad310f9f` Fix staticcheck S1023 `932fdc7f` Fix staticcheck S1023 `cb2ce933` Fix staticcheck S1008 `f0172cd2` Fix staticcheck (S1002) `5f2e630b` Fix staticcheck (S1025) `4beea513` Fix staticcheck (ST1005) errors `97fc3435` contributors: add my name `c891f5f8` qmp: Add nvdimm support `f9b31c0f` qemu: Allow disable-modern option from QMP `d6173077` Run tests for the s390x build `b36b5a8f` Contributors: Add Clare Chen to 
CONTRIBUTORS.md `b41939c6` Contributors: Add my name `dab4cf1d` qmp: Add tests `5ea6da14` Verify govmm builds on s390x `ee75813a` contributors: add my name `c80fc3b1` qemu: Add s390x support `ca477a18` Update source file headers `e68e0056` Update the CONTRIBUTING.md `2b7db547` Add the CONTRIBUTORS.md file `b3b765cb` qemu: test Valid for Vsock for Context ID `3becff5f` qemu: change of ContextID from uint32 to uint64 `f30fd135` qmp: Output error detail when execute QMP command failed `7da6a4c7` qmp: fix mem-path properties for hotplug memory. `e4892e33` qemu/qmp: preparation for s390x support `110d2fa0` qemu/qmp: add new function ExecuteBlockdevAddWithCache `a0b0c86e` qmp_test: Change QMP version from 2.6 to 2.9 `10c36a13` qemu: add support for pidfile option `9c819db5` qemu: Fix virtio-net-pci QMP command `7fdfc6a4` qemu: Add support for romfile option `e74de3c7` Update guidelines on security issue reporting `ec83abe6` qemu: Add virtio-balloon device suppport. `46970781` qemu: Show full path to qemu binary at launch time `ef725050` qemu: Fix the support of PCIe bridge `56f645ea` qmp: add ExecuteQueryMigration `a429677a` govmm: fix memory prealloc `1130aab8` qmp: add "query-cpus" support `de5d2788` qemu/qmp: add vfio mediated device support on root bus `de00d7a6` qemu/image: Reduce permissions of .iso creation dir `1a1fee75` qemu/qmp: nic can works without vhost `6c3d84ea` qemu: Add virtio RNG device. `b16291cf` qemu/qmp: support query-memory-devices qmp command. 
`ce070d11` govmm: modify govmm to be compatible with qemu 2.8 `0286ff9e` qemu/qmp: support hotplug a nic whose qdisc is mq `8515ae48` qmp: Remind users that you must first call ExecuteQMPCapabilities() `21504d31` qemu/qmp: Add netdev_add with chardev support `ed34f616` Add some negative test cases for qmp.go `17cacc72` Add negative test cases for qemu.go `2706a07b` qemu: Use the supplied context.Context for launching `e46092e0` qemu: Do not try and generate invalid RTC parameters `fcaf61dc` qemu/qmp: add vfio mediated device support `4461c459` disk: Add --share-rw option for hotplugging disks `68519998` qemu/qmp: add addr and bus to hotplug vsock devices `10efa841` qemu/qmp: add function for hotplug network by fds `80ed88ed` qemu/qmp: implement function to hotplug serial ports `ca46f21f` qemu/qmp: implement function to hotplug character devices `03f1a1c3` qemu/qmp: implement getfd `84b212f1` qemu: add vhostfd and disable-modern to vsock hotplug `12dfa872` qemu/qmp: implement function for hotplug network `3830b441` qemu: add vhostfd and disable-modern to vhost-vsock-pci `f700a97b` qemu/qmp: implement function to hotplug vsock-pci `4ca232ec` qmp_test: Fix Warning and Error level logs `430e72c6` qemu,qmp: Enable gas security checker `ffc06e6b` qemu,qmp: Add staticcheck to travis and fix errors `54caf781` qmp: add hotplug memory `e66a9b48` qemu: add appendMemoryKnobs helper `8aeca153` qmp: add migrate set arguments `a03d4968` qmp: add set migration capabilities `0ace4176` qemu: allow to set migration incoming `723bc5f3` qemu: allow to create a stopped guest `283d7df9` qemu: add file backed memory device support `30aeacb8` qemu: Add qemu parameter for PCI address for a bridge. `9130f375` scsi: Allow scsi controller to associate with an IO thread. 
`a54de183` iothread: Add ability to configure iothreads `0c0ec8f3` qemu: add initrd support `68f30718` qemu: add DisableModern to SCSIController `693d9548` qemu: add options for the machine type `3273aafd` scsi: Add function to send device_add qmp command for a scsi device `6d198b8a` Compute coverage statistics for unit tests in Travis builds `3a31da32` scsi: Add a scsi controller device `5316779d` qemu: Add VSOCK support `f5655366` vhost-user: add blk device support `e9e27673` vhost-user: updating comments for accuracy, rename device field `8fe57236` qemu: Add maxcpus attribute to -smp `3baa7765` Add badges to the README.md file `d74e3b66` Fix errcheck failures in the unit tests `db60e32f` Enable Travis builds `9cb47fc0` Add .gitignore file. `a8aaf534` Add project documentation `57aafb56` Remove all references to and dependencies on ciao `27709fce` Move files to the qemu folder `48feb29f` qemu: introduce vhost-user handling `b8ddd244` qemu: Add function to list hotpluggable CPUs `8c428ed7` qemu: Add function to hotplug CPUs `24b14059` qemu: Add functions to process QMP response `e39da6ca` qmp: Add support for hot plugging VFIO devices on PCI(E) bridges `bc030d13` qemu: Add a SysProcAttr parameter to CreateCloudInitISO `11977072` qemu: Add a SysProcAttr parameter to LaunchCustomQemu `b639da45` qemu: Add function to hotplug vfio device `7e5614b8` Networking: Add vhost fd support `14316ce0` qemu/qmp: Implement function to hot plug PCI devices `83485dc9` qemu: Implement Bridge struct `cfa8a995` Networking: Add support for handling macvtap interfaces `83126d3e` bios: add support for custom bios `3da2ef9d` QEMU: Knobs: Huge Page Support: Add support for huge pages `9bfa7927` vfio: Add ability to pass VFIO devices to qemu `a70ffd19` Build: Fix the build after repo move. `0c206170` Knobs: Modify the behaviour of the Mlock knob. 
`ddee41d5` QEMU: Enable realtime options `4ecb9de5` qemu: Add support for memory pre-allocation `1fbe6c5d` qmp: Update block device deletion for newer versions of qemu `e74aeef1` qemu: Add disable-modern option for virtio devices `8d617ff5` qemu: Update virtio-net-pci command line `25a2dc8f` qemu: Update blockdev-add qmp command to support newer qemu versions `d4f77103` misc: Remove some of the code flagged by unused linter `a1600dc1` misc: Remove unused fields identified by structcheck `58a835e6` misc: Remove unused variables identified by varcheck `d48b5b5f` qemu: Add PCI option to the NetDevice `a84228ae` qemu: Document how cancelling works. `1e7202a5` qemu: Fix spelling error in qmp_test.go `c6f33453` qemu: Fix command cancelling. `a8a798b0` qemu, ciao-launcher: Move ConfigDrive ISO creation code to qemu `30cf1163` Add missing bus parameter for a CharDevice `2aa5f5a3` qemu: Add support for serial port addition `6fe338d6` qemu: Support creating multiple QMP sockets `992b861e` qemu: Add the daemonize qemu option to the Knobs structure `997cb233` qemu: Remove dead code `e555f565` qemu: Add support for socket based consoles `eae8fae0` qemu: Fix security model typo `db067857` qemu: Make Config's FDs field private `12f6ebe3` qemu: Embed the qemu parameters into the Config structure `e193a77b` qemu: Add support for block devices `3908185c` qemu: Add MACVTAP support `6d7dfa04` qemu: Get rid of the Driver structure `cc9cb33a` qemu: Add QMPSocket specific type `2d736d71` qemu: Add RTC specific types `e543c338` qemu: Probe each qemu device with a driver `eda8607c` qemu: Add netdev options to the Device structure `4780e237` qemu: Add multi-queue and vhost definitions to NetDevice `137e7c72` qemu: Add a NetDevice slice to the Config structure `c0e2aaca` qemu: Add one unit test for the Config strings `5ba8ef79` qemu: Add QMP socket unit tests `7b2f7eb5` qemu: Add Memory and SMP unit tests `2ea9b9a3` qemu: Add a Kernel unit test `8e495f6e` qemu: Add a Knobs unit test 
`8aeb3d45` qemu: Add an Object unit test `38e041dc` qemu: Add Device unit tests `54d32c24` qemu: Add parameters adding unit tests `ebfa382d` qemu: Add a Knobs field to the Config structure `fe1bdcd2` qemu: Remove the extra parameters field from the Config structure `15bce61a` qemu: Group all machine configurations into one structure `d94b5af8` qemu: Add a VGA parameter field to the Config structure `4892d041` qemu: Add a Global parameter field to the Config structure `612a5a9e` qemu: Add a RTC field to the Config structure `c63ec096` qemu: Add a SMP field to the Config structure `7cf386a8` qemu: Add a Memory field to the Config structure `b198bc67` qemu: Add a UUID field to the Config structure `6239e846` qemu: Add a Character Devices slice field to the Config structure `73e2d53c` qemu: Add a Filesystem Devices slice field to the Config structure `518ba627` qemu: Add a Kernel field to the Config structure `b973bc59` qemu: Add an Object slice field to the Config structure `8744dfe8` qemu: Add a Device slice field to the Config structure `5458de70` qemu: Add a QMP socket field to the Config structure `17118270` qemu: Add qemu's name to the Config structure `37a1f500` qemu: Add configuration structure to simplify LaunchQemu `5ccbaf2b` ciao-launcher, qemu: Upgrade to new context package. `f5720198` qemu: Use null QMP logger when the logger parameter is nil `7d4199a4` qemu: Fix ineffassign error `7f50a415` qemu: Fix a silly bug in LaunchQemu `fc6bf8cf` qemu: Add package documentation `306f54a9` ciao-launcher, qemu: Move launchQemu to qemu `344aa22b` qemu: Add the qemu package Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-02-01 11:36:28 +02:00
Peng Tao	732c45de94	Merge pull request #3567 from jodh-intel/ch-enable-initrd virtcontainers: Enable initrd for Cloud Hypervisor	2022-01-29 14:23:32 +08:00
Peng Tao	86d418251e	Merge pull request #3571 from liubin/fix/2570-update-rust-version versions: update Rust to 1.58.1	2022-01-29 14:17:56 +08:00
Tim Zhang	5083ae65a0	workflows: stop checking revert commit The commit message of a revert commit is usually generated by `git revert`, so we should consider it legal. Consider the commit a revert commit if the subject starts with 'Revert "'. Following the PR kata-containers/tests/#3938, the subtle difference is that we skip all commit checks for revert commits, including the Fixes check and the subsystem check. Because the reverted commit must already have passed the checks, the revert commit is treated as having the Fixes and Subsystem. Fixes: #3568 Fixes: kata-containers/tests#3934 Signed-off-by: Tim Zhang <tim@hyper.sh>	2022-01-29 11:45:20 +08:00
bin	bcce1a1911	versions: update Rust to 1.58.1 Update Rust to 1.58.1 to fix CVE-2022-21658. Fixes: #3570 Signed-off-by: bin <bin@hyper.sh>	2022-01-29 11:35:56 +08:00
Samuel Ortiz	14e7f52a91	virtcontainers: Split the rootless package into OS specific parts Move the netns specific bits into a Linux specific file. Fixes: #3607 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com> Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-01-28 16:20:28 -08:00
James O. D. Hunt	7c956e0d27	virtcontainers: Enable initrd for Cloud Hypervisor Since CH has supported booting with an initramfs since version 0.7.0 [1], allow an `initrd=` to be specified. Fixes: #3566. [1] - https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v0.7.0 Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-01-28 10:49:10 +00:00
Eric Ernst	a5ebeb96c1	Merge pull request #2941 from egernst/sandbox-sizing-feature Sandbox sizing feature	2022-01-27 09:37:57 -08:00
snir911	7ac0fcb9e0	Merge pull request #3560 from fidencio/fix-kata-deploy-tag-replacement kata-deploy: Fix the tag replacement logic	2022-01-27 15:48:20 +02:00
Francesco Giudici	25b2bc713e	Merge pull request #3548 from amshinde/update-network-arch-doc docs: Update networking details in the architecture doc	2022-01-27 09:18:54 +01:00
Fabiano Fidêncio	aa3fae1397	kata-deploy: Fix the tag replacement logic When building a non-stable release, the tag is always "latest", instead of the version. The same magic done for setting the correct tags up should be done for replacing the tag on the kata-deploy and kata-cleanup yaml files, as part of the kata-deploy test. Fixes: #3559 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-26 20:42:48 +01:00
Eric Ernst	8cde54131a	runtime: introduce static sandbox resource management There are software and hardware architectures which do not support dynamically adjusting the CPU and memory resources associated with a sandbox. Today, these rely on the "default CPU" and "default memory" configuration options for the runtime, either set by annotation or by the configuration toml on disk. In the case of a single container (launched by ctr, or something like "docker run"), we could allow for sizing the VM correctly, since all of the information is already available to us at creation time. In the sandbox / pod container case, the upper layer container runtime (i.e., containerd or crio) could send a specific annotation indicating the total workload resource requirements associated with the sandbox creation request. If sizing information is not provided, we will follow the same behavior as today: start the VM with (just) the default CPU/memory. If this information is provided, we'll track it as Workload specific resources, and track default sizing information as Base resources. We will update the hypervisor configuration to utilize Base+Workload resources, thus starting the VM with the appropriate amount of CPU and memory. In this scenario (we start the VM with the "right" amount of CPU/Memory), we do not want to update the VM resources when containers are added, or adjusted in size. This functionality is introduced behind a configuration flag, `static_sandbox_resource_mgmt`. This is defaulted to false for all configurations except Firecracker, which is set to true. This'll greatly improve UX for folks who are utilizing Kata with a VMM or hardware architecture that doesn't support hotplug. Note, users will still be unable to do in place vertical pod autoscaling or other dynamic container/pod sizing with this enabled. Fixes: #3264 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-01-26 09:04:38 -08:00
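The sizing rule described in commit 8cde54131a can be sketched as follows. This is a minimal illustration under stated assumptions, not the runtime's actual code: the `Resources` type, the `initial_vm_size` function, and the choice of units are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resources:
    """CPU and memory amounts (hypothetical type, for illustration only)."""
    vcpus: int
    memory_mib: int


def initial_vm_size(base: Resources, workload: Optional[Resources]) -> Resources:
    """Return the resources the VM is started with.

    Without sizing information the VM gets only the Base (default) resources.
    With it, the VM is started with Base + Workload and is not resized later.
    """
    if workload is None:
        return base
    return Resources(
        vcpus=base.vcpus + workload.vcpus,
        memory_mib=base.memory_mib + workload.memory_mib,
    )
```

For example, a Base of 1 vCPU and 2048 MiB plus a Workload of 4 vCPUs and 4096 MiB would start the VM with 5 vCPUs and 6144 MiB.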
Eric Ernst	13eb1f81b9	docs: describe vCPU handling when hotplug is unavailable Describe the static_sandbox_resource_mgmt flag, and how this applies to configurations that do not utilize hotplug. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-01-26 09:52:42 -08:00
Eric Ernst	c3e97a0a22	config: updates to configuration clh, fc toml template There's some cruft -- let's update to reflect reality, and ensure that we match what is expected. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-01-26 09:45:50 -08:00
Francesco Giudici	ab447285ba	kata-monitor: add kubernetes pod metadata labels to metrics Add the POD metadata we get from the container manager to the metrics by adding more labels. Fixes: #3551 Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-01-26 13:48:45 +01:00
Francesco Giudici	834e199eee	kata-monitor: drop unused functions Drop the functions we are not using anymore. Update the tests too. Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-01-26 13:48:45 +01:00
Francesco Giudici	7516a8c51b	kata-monitor: rework the sandbox cache sync with the container manager Kata-monitor detects started and terminated kata pods by monitoring the vc/sbs fs (this makes sense since we will have to access that path to access the sockets there to get the metrics from the shim). While kata-monitor updates its sandbox cache based on the sbs fs events, it also schedules a sync with the container manager via the CRI in order to sync the list of sandboxes there. The container manager is currently the ultimate source of truth, so we stick with its response and remove the sandboxes it does not report. However, it may happen that when we check the container manager the new kata pod is not reported yet, so we remove it from the kata-monitor pod cache. If no other kata pod is then added or removed, we do not check with the container manager again and never report metrics for that kata pod. Let's stick with the sbs fs as the source of truth instead: we will update the cache just by following what happens on the sbs fs. At this point we could also have decided to drop the container manager connection, but it is better to keep it in order to get the kube pod metadata from it, i.e., the kube UID, Name and Namespace associated with the sandbox. Every time we get a new sandbox from the sbs fs we will try to retrieve the pod metadata associated with it. Right now we just attach the container manager sandbox id as a label to the exposed metrics, which makes it hard to link the metrics to the running pod in the kubernetes cluster. With the kubernetes pod metadata we will be able to add them as labels to map the metrics explicitly to the kubernetes workloads. Fixes: #3550 Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-01-26 13:48:45 +01:00
Francesco Giudici	e78d80ea0d	kata-monitor: silently ignore CHMOD events on the sandboxes fs We currently WARN about unexpected fs events, which includes CHMOD operations (which should actually be expected...). Just ignore all the fs events we don't care about without any warning. We dump all the events with debug log in any case. Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-01-26 13:48:45 +01:00
Francesco Giudici	e9eb34cea8	kata-monitor: improve debug logging Improve debug log formatting of the sandbox cache update process. Move raw and tracing logs from the DEBUG to the TRACE log level. Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-01-26 13:48:45 +01:00
Fabiano Fidêncio	f7c7dc8d33	Merge pull request #3504 from Jakob-Naucke/s390x-govmm-tests Fix and re-enable s390x GoVMM tests	2022-01-26 12:57:38 +01:00
Archana Shinde	081a235efe	Merge pull request #3540 from bradenrayhorn/fix-negative-memory-limit runtime: fix handling container spec's memory limit	2022-01-25 05:17:05 -08:00
Archana Shinde	75ae536196	docs: Update networking details in the architecture doc Updated the doc to clarify certain networking details and external links to some of the networking terms used. Fixes #3308 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-01-25 17:04:27 +05:30
Bin Liu	905b4b09d2	Merge pull request #3543 from Jakob-Naucke/fwdport-static-args ci: Pass function arguments in static-checks.sh	2022-01-25 14:07:32 +08:00
GabyCT	0fa7814c21	Merge pull request #3546 from GabyCT/topic/removesystcl docs: Remove docker run and sysctl limitation	2022-01-24 15:41:23 -06:00
Braden Rayhorn	fc0e095180	runtime: fix handling container spec's memory limit The OCI container spec specifies a limit of -1 signifies unlimited memory. Update the sandbox memory calculator to reflect this part of the spec. Fixes: #3512 Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>	2022-01-24 13:30:32 -06:00
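The rule from commit fc0e095180 can be illustrated with a short sketch. The function name and the decision to skip unlimited containers when summing are assumptions made for illustration. The commit only states that a limit of -1 signifies unlimited memory.

```python
from typing import Iterable, Optional


def total_memory_limit(limits: Iterable[Optional[int]]) -> int:
    """Sum per-container memory limits in bytes.

    None models a container without a limit. A negative value models the
    "unlimited" setting (-1); treating every negative value the same way
    as -1 is an assumption of this sketch.
    """
    total = 0
    for limit in limits:
        if limit is None or limit < 0:
            # Unset or unlimited: contributes nothing to the explicit total.
            continue
        total += limit
    return total
```

The point of the sketch is that -1 must never be added to the total as if it were a byte count.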
Gabriela Cervantes	7af40fbc66	docs: Remove docker run, sysctl and docker daemon limitations This PR removes the docker run and sysctl limitation reference for kata 2.0 as well as docker daemon limitation as currently for kata we are not supporting docker and this reference belonged to kata 1.0 Fixes #3545 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-01-24 18:11:54 +00:00
Jakob Naucke	016569fd8e	Merge pull request #3476 from bergwolf/runtime-dep runtime: update runc and image-spec dependencies	2022-01-24 15:53:43 +01:00
Jakob Naucke	1721197934	ci: Pass function arguments in static-checks.sh e.g. when called from the tests repo Fixes: #3525 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-01-24 12:05:10 +01:00
Binbin Zhang	4fc4c76b87	agent: Fix execute_hook() args error 1. The hook.args[0] is the hook binary name which shouldn't be included in the Command.args. 2. Add new unit tests Fixes: #2610 Signed-off-by: Binbin Zhang <binbin36520@gmail.com>	2022-01-24 14:13:24 +08:00
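The argument handling fixed in commit 4fc4c76b87 can be sketched like this. The agent itself is not written in Python and `split_hook_command` is a hypothetical helper. It only shows that `args[0]` names the hook binary and must not be passed again as an argument.

```python
from typing import List, Tuple


def split_hook_command(path: str, args: List[str]) -> Tuple[str, List[str]]:
    """Return (program, extra_args) for a hook.

    args[0] plays the role of argv[0] (the binary name), so only args[1:]
    are handed over as additional command-line arguments.
    """
    return path, list(args[1:])
```

With `path="/usr/bin/hook"` and `args=["hook", "--flag", "x"]`, the extra arguments are `["--flag", "x"]`.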
Peng Tao	5643c6dcae	runtime: update runc and image-spec dependencies To address two depbot security warnings. Fixes: #3475 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-01-24 11:49:05 +08:00
Fabiano Fidêncio	8a8ae8aae7	Merge pull request #3531 from egernst/test-lint agent: resolve unused variables in tests	2022-01-21 21:57:13 +01:00
Bo Chen	94b343492d	Merge pull request #3520 from likebreath/0120/clh_v21.0 Upgrade to Cloud Hypervisor v21.0	2022-01-21 08:08:13 -08:00
Jakob Naucke	918dcd5f69	Merge pull request #3522 from Amulyam24/runtime-build runtime: rectify passing empty options to -ldflags	2022-01-21 15:54:38 +01:00
Jakob Naucke	2f37165f46	govmm: Unite VirtioNet tests no explicit PCI test, just switch path depending on architecture (CCW for s390x, PCI for others). Also fixes an unknown variable error. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-01-21 13:00:05 +01:00
Jakob Naucke	4a428fd1c5	govmm: readonly=on in s390x blkdev test Forgotten in `b17f07395c`, also fixes a test. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-01-21 13:00:05 +01:00
Jakob Naucke	79ecebb280	govmm: TestAppendPCIBridgeDevice et al. on !s390x s390x uses CCW, also fixes a lint failure about undeclared variables on s390x. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-01-21 13:00:05 +01:00
Jakob Naucke	dc285ab1d7	govmm: Remove unnecessary comma in iommu_platform in FSDevice.QemuParams for VirtioCCW. Forgotten in `ff34d283db`, also fixes a test. Fixes: #3500 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-01-21 13:00:05 +01:00
Jakob Naucke	d23f2eb0f0	govmm: Revert "govmm: s390x: Skip broken tests" This reverts commit `5ce9011a36`. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-01-21 13:00:05 +01:00
Amulya Meka	f52ce302bc	runtime: rectify passing empty options to -ldflags When no options are passed to -ldflags, it passes incorrect values(in this case, $BUILDFLAGS) to it. Fix passing empty values by passing $KATA_LDFLAGS in quotes. Fixes: #3521 Signed-off-by: Amulya Meka <amulmek1@in.ibm.com>	2022-01-21 06:57:52 +00:00
Fabiano Fidêncio	618aa659d6	Merge pull request #3509 from ManaSugi/remove-libseccomp-from-dockerfile osbuilder: Remove libseccomp from Dockerfile	2022-01-21 06:50:53 +01:00
Tim Zhang	eac003462d	Merge pull request #3370 from lifupan/fix_namespace agent: fix the issue of creating new namespaces for agent	2022-01-21 10:25:43 +08:00
Bo Chen	2d799cbfa3	virtcontainers: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v21.0. Note: The client code of cloud-hypervisor's (CLH) OpenAPI is automatically generated by openapi-generator [1-2]. [1] https://github.com/OpenAPITools/openapi-generator [2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-01-20 17:48:10 -08:00
Bo Chen	7e15e99d5f	versions: Upgrade to Cloud Hypervisor v21.0 Highlights from the Cloud Hypervisor release v21.0: 1) Efficient Local Live Migration (for Live Upgrade); 2) Recommended Kernel is Now 5.15; 3) Bug fixes on OpenAPI yaml spec file, avoid deadlock for live-migration, etc. Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v21.0 Fixes: #3519 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-01-20 17:43:14 -08:00
Eric Ernst	25aa2e8578	Merge pull request #3514 from GabyCT/topic/removekatapkg docs: Remove kata-pkgsync reference	2022-01-20 13:04:37 -08:00
Gabriela Cervantes	9c2f1de16d	docs: Remove kata-pkgsync reference Now that kata-pkgsync has been removed, this PR removes the reference in the documentation. Fixes #3513 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-01-20 18:00:58 +00:00
James O. D. Hunt	16418be3c3	Merge pull request #3506 from jodh-intel/docs-glossary-wiki-redirect docs: Redirect glossary to the wiki	2022-01-20 17:00:58 +00:00
Fabiano Fidêncio	b964bfc97d	Merge pull request #3503 from fidencio/wip/kata-deploy-use-base-ref workflows: Use base instead of head ref for kata-deploy-test	2022-01-20 17:02:03 +01:00
Manabu Sugimoto	df6ae1e789	osbuilder: Remove libseccomp from Dockerfile Remove the libseccomp package from Dockerfile of `alpine` and `clearlinux` because the libseccomp library is installed by the `ci/install_libseccomp.sh` script when building the kata-agent. Fixes: #3508 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-01-21 00:32:57 +09:00
James O. D. Hunt	0338fc657f	docs: Redirect glossary to the wiki Whilst we work to update the [copy of the glossary currently hosted in the wiki](https://github.com/kata-containers/kata-containers/wiki/Glossary), update the in-tree glossary doc to refer to that wiki version. Fixes: #3505. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-01-20 14:01:24 +00:00
Binbin Zhang	168fadf1de	ci: Weekly check whether the docs url is alive Weekly check(at 23:00 every Sunday) whether the docs url is ALIVE, so that we can find the failed url in time Fixes #815 Signed-off-by: Binbin Zhang <binbin36520@gmail.com>	2022-01-20 19:56:15 +08:00
Fabiano Fidêncio	3924470c8f	workflows: Use base instead of head ref for kata-deploy-test Although I've done tests on my own fork using `head_ref` and those worked, it seems those only worked as the PR was coming from exactly the same repository as the target one. Let's switch to base_ref, instead, which we for sure have as part of our repo. The downside of this is that we run the test with the last merged PR, rather than with the "to-be-approved" PR, but that's a limitation we've always had. Fixes: #3482 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-20 11:04:14 +01:00
Fabiano Fidêncio	1a59c5743e	Merge pull request #3496 from fidencio/wip/use-govmm-from-kata govmm: Use it from our own repo	2022-01-20 09:47:32 +01:00
Archana Shinde	f71eedf3a0	Merge pull request #3437 from haslersn/un-gn tools: Fix groupname if it differs from username	2022-01-19 22:25:59 -08:00
Archana Shinde	f29f04e1e0	Merge pull request #3486 from fidencio/wip/fix-kata-deploy-push-workflow workflows: Fix typo in kata-deploy-push action	2022-01-19 19:42:37 -08:00
Archana Shinde	1c3f8c708e	Merge pull request #3488 from ManaSugi/fix-seccomp-notice-in-release-page release: Escape backticks in Libseccomp Notices	2022-01-19 19:40:54 -08:00
Fabiano Fidêncio	5ce9011a36	govmm: s390x: Skip broken tests For now a bunch of tests are simply not working. Let's skip them all, and re-enable them once kata-containers/kata-containers/issues/3500 gets fixed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-20 01:04:35 +01:00
Fabiano Fidêncio	0570317e7b	Merge pull request #3494 from GabyCT/topic/removeobsremains packaging: Remove kata-pkgsync tool	2022-01-19 19:59:25 +01:00
Fabiano Fidêncio	8bcaed0b4f	govmm: Adapt license headers to kata-containers Both projects follow the same license, Apache-2.0, but the header saying that comes from govmm is different from the one expected for the tests present on the kata-containers repo. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-19 18:02:46 +01:00
Fabiano Fidêncio	6dd6577986	govmm: Ignore govet checks, at least for now govet checks have been ignored on govmm repo, but those are enabled on kata-containers one. So, in order to avoid failing our CIs let's just keep ignoring the checks for the govmm structs and have an issue opened for fixing it whenever someone has cycles to do it. The important bit here is, we're not making anything worse than it already is. :-) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-19 18:02:46 +01:00
Fabiano Fidêncio	de678a3aaa	govmm: Remove non-relevant top files govmm, from now on, should follow the same guidelines from contributing, copying, and etc as kata-containers does. The go.mod is not needed anymore as the project lives inside the runtime. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-19 18:02:46 +01:00
Fabiano Fidêncio	ec6655af87	govmm: Use govmm from our own pkg Let's stop using govmm from kata-containers/govmm and let's start using it from our own repo. Fixes: #3495 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-19 18:02:46 +01:00
Fabiano Fidêncio	c9c1aab97b	Merge pull request #3468 from fidencio/wip/bring-govmm-in govmm: Bring the project in	2022-01-19 18:00:09 +01:00
Gabriela Cervantes	8cc088b540	packaging: Remove kata-pkgsync tool This PR removes the kata-pkgsync tool that is mainly used for OBS packages, currently for kata 2.0 we do not have OBS packages and this tool is not being used for kata 2.0 Fixes #3493 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-01-19 15:53:37 +00:00
Manabu Sugimoto	a8b66de5e8	release: Escape backticks in Libseccomp Notices Escape (with backslash) backticks (`) to prevent them from being evaluated by the shell. Fixes: #3487 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-01-19 19:45:05 +09:00
Fabiano Fidêncio	c3785f6665	workflows: Fix typo in kata-deploy-push action A `:` was missed when `d87ab14fa7` was introduced. Fixes: #3485 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-19 11:05:58 +01:00
Fabiano Fidêncio	b8421fb72b	Merge pull request #3478 from egernst/bump-k8s version: bump to kubernetes 1.23	2022-01-19 09:53:46 +01:00
Fabiano Fidêncio	fb7f98bd2e	Merge govmm into kata-containers	2022-01-19 09:40:15 +01:00
Eric Ernst	f4a4c3c76a	version: bump to kubernetes 1.23 Current latest release is 1.23.1. Let's update to this version for our integration testing. Fixes: #3477 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-01-18 21:34:24 -08:00
Julio Montes	c0e28b54a1	Merge pull request #3460 from devimc/2021-01-17/vendorGovmm vendor: update govmm	2022-01-18 15:54:11 -06:00
Wainer Moschetta	b9876d9078	Merge pull request #3472 from fidencio/wip/force-skip-ci-should-skip-all-github-actions workflows: Ensure force-skip-ci skips all actions	2022-01-18 18:00:50 -03:00
Jakob Naucke	f5f036247d	Merge pull request #3470 from Jakob-Naucke/pgste runtime: -Wl,--s390-pgste for s390x	2022-01-18 18:59:15 +01:00
Julio Montes	49223e67af	runtime: remove enable_swap option The `enable_swap` option was added a long time ago to add `-realtime mlock=off` to QEMU's command line. Kata now supports QEMU 6, where the `-realtime` option has been deprecated, and `mlock=on` is causing unexpected behaviors in kata. This patch removes support for `enable_swap`, `-realtime` and `mlock=` since they are causing bugs in kata. Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-01-18 11:12:29 -06:00
Fabiano Fidêncio	7a879164bd	workflows: Ensure a label change re-triggers the actions This is needed in order to ensure that, for instance, if `force-skip-ci` label is either added or removed later, the jobs related to the actions will be restarted and accordingly checked. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-18 14:39:01 +01:00
Fabiano Fidêncio	d87ab14fa7	workflows: Ensure force-skip-ci skips all actions Before this change it was only applied to the static-checks, but if we're already taking the extreme path of skipping the CI, we better ensure we skip all the actions and not just a few of them. Fixes: #3471 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-18 14:37:32 +01:00
Jakob Naucke	5285ac2b57	runtime: -Wl,--s390-pgste for s390x for linking. Required for basic KVM checks on some kernels (e.g. the one RHEL is currently shipping), cf. `6621441db5/target/s390x/kvm/meson.build (L15-L16)`. Fixes: #3469 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-01-18 11:32:03 +01:00
Fabiano Fidêncio	db451f3c27	Merge pull request #3463 from fidencio/wip/fix-kata-deploy-ref-branch workflows: Use the correct branch ref on test kata-deploy	2022-01-18 09:31:51 +01:00
Fabiano Fidêncio	fc64643437	workflows: Use the correct branch ref on test kata-deploy The action used for testing kata-deploy is entirely based on the action used to build the kata-deploy tarball, but while the latter is able to use the correct branch, the former always uses `main`. This happens because the `issue_comment` event, from GitHub actions, passes the "default branch" as the GITHUB_REF. As we're not the first ones to face such an issue, I've decided to take one of the approaches suggested at one of the checkout's issues, https://github.com/actions/checkout/issues/331, and take advantage of a new action provided by the community, which will get the PR where the comment was made, give us that ref, and that then can be used with the checkout action, resulting in what we originally wanted. Fixes: #3443 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-17 23:23:26 +01:00
Fabiano Fidêncio	0b5c0ae2ae	Merge pull request #3188 from weiyuanke/fix_version update apiVersion	2022-01-17 23:20:06 +01:00
Carlos Venegas	5f41e199dd	Merge pull request #3438 from haslersn/usr-bin-env-bash scripts: Use shebang /usr/bin/env bash	2022-01-17 15:39:42 -06:00
Carlos Venegas	5a55313431	Merge pull request #3446 from jodh-intel/kernel-proc-config packaging: Make kernel config accessible to guest	2022-01-17 15:37:34 -06:00
Sebastian Hasler	e347694fff	tools: Fix groupname if it differs from username The script `tools/packaging/static-build/qemu/build-base-qemu.sh` previously failed on systems where the user's groupname differs from the username Fixes: #3461 Signed-off-by: Sebastian Hasler <sebastian.hasler@stuvus.uni-stuttgart.de>	2022-01-17 16:52:39 +01:00
Julio Montes	41e0c414a4	vendor: update govmm bring SGX support and other fixes shortlog: `8939b0f` qemu: add support for SGX `b17f073` qemu: update readonly flag for block devices `f971801` qemu: only set wait parameter for server mode socket based char device `82cc01d` qemu: Fix 32 bit int overflow in test file `1d1a231` qemu: Add support for legacy serial device `9a2bbed` qemu: Remove -realtime in favor of -overcommit `fe83c20` qemu: Add support for --no-shutdown Knob `1ed5271` qmp: wait for POWERDOWN event in ExecuteSystemPowerdown() fixes #3080 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-01-17 09:20:47 -06:00
Fabiano Fidêncio	7120c78946	Merge pull request #3432 from Kvasscn/kata_dev_fix_host-cgroups_typo docs: fix a typo in host-cgroups.md doc	2022-01-17 15:34:09 +01:00
Julio Montes	0781a21804	Merge pull request #208 from devimc/2022-01-12/supportSGX qemu: add support for SGX	2022-01-17 07:19:32 -06:00
zhanghj	a5829a294e	docs: fix a typo in host-cgroups.md doc Container1's cgroupsPath in pod2 should be /kubepods/pod2/container1. Fixes: #3431 Signed-off-by: zhanghj <zhanghj.lc@inspur.com>	2022-01-17 09:17:01 +08:00
Eric Ernst	9277317098	agent: resolve unused variables in tests A few tests have unused or unread variables. Let's clean these up... Fixes: #3530 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-01-16 14:09:03 -08:00
Julio Montes	8939b0f8e0	qemu: add support for SGX Define and implement memory-backend-epc object Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-01-14 13:11:03 -06:00
Jianyong Wu	d370604fa5	Merge pull request #3292 from zyzii/vcpu-hotplug2 experimentally enable the vcpu-hotplug for arm in qemu side	2022-01-14 18:10:40 +08:00
Huang Shijie	2d0ec00aff	Qemu: Enable the vcpu-hotplug for arm Initially enable vcpu hotplug in qemu for arm based on Salil's work[1]. Fixes: #3280 Signed-off-by: Huang Shijie <shijie8@gmail.com> [1] https://github.com/salil-mehta/qemu/tree/virt-cpuhp-armv8/rfc-v1	2022-01-14 13:27:17 +00:00
James O. D. Hunt	e22a4e2a0a	packaging: Make kernel config accessible to guest Provide the `/proc/config.gz` file in guest kernels, which allows the guest to determine the kernel configuration used to build the running kernel. Note that since `gunzip` expects to rename the gzip'ed file it operates on, to use this feature you need to run something like the following in the container environment: ```bash # cat /proc/config.gz | gunzip -c ``` Fixes: #3445. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-01-14 08:50:34 +00:00
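A guest could also read `/proc/config.gz` programmatically rather than through `gunzip -c`. The following is a minimal sketch using Python's standard `gzip` module. The function name and the returned dictionary layout are assumptions made for illustration.

```python
import gzip
from typing import Dict


def read_kernel_config(raw: bytes) -> Dict[str, str]:
    """Decompress gzip'ed kernel config bytes and parse KEY=value lines."""
    options: Dict[str, str] = {}
    for line in gzip.decompress(raw).decode("utf-8").splitlines():
        line = line.strip()
        # Skip blank lines and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            options[key] = value
    return options
```

Inside a guest this could be called as `read_kernel_config(open("/proc/config.gz", "rb").read())`. The data is decompressed in memory, so no file has to be renamed.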
Fabiano Fidêncio	e10fd32a88	Merge pull request #3420 from fidencio/wip/remove-non-tested-rootfs Remove all the non-tested rootfs	2022-01-14 07:45:40 +01:00
Sebastian Hasler	adffd3f8b6	scripts: Use shebang /usr/bin/env bash Not all distros have `/bin/bash`, e.g. NixOS. Fixes: #3450 Signed-off-by: Sebastian Hasler <sebastian.hasler@stuvus.uni-stuttgart.de>	2022-01-13 22:53:28 +01:00
Fabiano Fidêncio	e4b7a12bf3	qat: Add Debian to the distro examples Debian is a supported rootfs that uses systemd as init, thus, it should be mentioned in the QAT README document. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-13 21:54:05 +01:00
Fabiano Fidêncio	6979d5be69	osbuilder: Remove gentoo rootfs-builder As the gentoo rootfs is not tested in our CI, we can't guarantee it actually works as expected. Whenever we have someone willing to maintain this rootfs we can have it added back, and also add a CI job to test it altogether, avoiding then any possible regression. Fixes: #2144 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-13 21:54:05 +01:00
Fabiano Fidêncio	22c1a093d7	osbuilder: Remove suse rootfs-builder As the suse rootfs is not tested in our CI, we can't guarantee it actually works as expected. Whenever we have someone willing to maintain this rootfs we can have it added back, and also add a CI job to test it altogether, avoiding then any possible regression. Fixes: #2145 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-13 21:54:05 +01:00
Fabiano Fidêncio	85dd587382	osbuilder: Remove fedora rootfs-builder As the fedora rootfs is not tested in our CI, we can't guarantee it actually works as expected. Whenever we have someone willing to maintain the rootfs we can have it added back, and also add a CI job to test it altogether, avoiding then any possible regression. Fixes: #2143 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-13 21:54:05 +01:00
Fabiano Fidêncio	06fae29f49	osbuilder: Remove centos rootfs-builder As the centos rootfs is not tested in our CI, we can't guarantee it actually works as expected. Whenever we have someone willing to maintain the rootfs we can have it added back, and also add a CI job to test it altogether, avoiding then any possible regression. Fixes: #2140 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-13 21:54:05 +01:00
Fabiano Fidêncio	0917addea7	Merge pull request #3449 from GabyCT/topic/removeccloudvmref docs: Remove ccloudvm reference	2022-01-13 21:43:23 +01:00
Gabriela Cervantes	01005c5a9c	docs: Remove ccloudvm reference This PR removes the ccloudvm reference at the README document as the setup of scripts of ccloudvm were removed. Fixes #3448 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-01-13 18:22:26 +00:00
James O. D. Hunt	6387a3d323	Merge pull request #3441 from liangxianlong/main runtime: Provide protection for shared data	2022-01-13 17:46:35 +00:00
snir911	cf464668ff	Merge pull request #3433 from snir911/fix-kata-deploy-2 kata-deploy: validate conf file can be created	2022-01-13 15:16:25 +02:00
liangxianlong	878ab93c15	runtime: Provide protection for shared data The k.reqHandlers should be protected by locks when used Fixes #3440 Signed-off-by: liangxianlong <liang.xianlong@zte.com.cn>	2022-01-13 14:48:10 +08:00
James O. D. Hunt	ef835b5948	Merge pull request #3418 from yangfeiyu20102011/main runtime: it should rollback when failed in Sandbox AddInterface	2022-01-12 10:22:36 +00:00
Snir Sheriber	ac7acbf87b	kata-deploy: validate conf file can be created As containerd doesn't exist at cleanup Fixes: #3429 Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-01-12 10:12:46 +02:00
Bin Liu	a561159f7b	Merge pull request #3423 from liubin/fix/3422-ignore-some-generated-files libs: add some generated files to .gitignore	2022-01-12 15:46:21 +08:00
Bin Liu	0bd2cc5a93	Merge pull request #3425 from liubin/fix/3424-close-span-before-return runtime: close span before return from function in case of error	2022-01-12 10:52:53 +08:00
GabyCT	08d8402e98	Merge pull request #3428 from GabyCT/topic/removeccloudvm packaging: Remove ccloudvm instructions and script	2022-01-11 13:25:57 -06:00
Carlos Venegas	43d8ccdb3e	Merge pull request #3409 from haslersn/design-docs-q35 docs: Default machine type is q35 meanwhile	2022-01-11 11:00:54 -06:00
GabyCT	493d3f50e4	Merge pull request #3421 from jodh-intel/ci-revert-gnu-mirror CI: Revert "CI: Switch to a mirror as gnu.org is down"	2022-01-11 10:36:34 -06:00
Gabriela Cervantes	7e2bc4d764	packaging: Remove ccloudvm instructions and script This PR removes ccloudvm for kata 2.0. ccloudvm was used in kata 1.x and we are no longer using it for kata 2.0. Fixes #3427 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-01-11 15:41:16 +00:00
bin	85f5ae190e	runtime: close span before return from function in case of error Return before closing span will cause invalid spans, so span should be closed before function return. Fixes: #3424 Signed-off-by: bin <bin@hyper.sh>	2022-01-11 19:45:41 +08:00
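The pattern behind commit 85f5ae190e, ending a span on every exit path including the error path, can be sketched as follows. The `Span` class is a hypothetical stand-in for a tracing span and `run_traced` is an illustrative helper. Neither reflects the runtime's actual tracing API.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class Span:
    """Hypothetical stand-in for a tracing span."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ended = False

    def end(self) -> None:
        self.ended = True


def run_traced(span: Span, operation: Callable[[], T]) -> T:
    """Run operation and end the span whether it returns or raises."""
    try:
        return operation()
    finally:
        # Executed on success and on error, so the span is never left open.
        span.end()
```

The `finally` clause plays the role that closing the span before each early return plays in the commit.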
bin	106df33ff8	libs: add some generated files to .gitignore Generated protocol files should not be included in the Git repo. Also add Cargo.lock in the oci/protocols directory to .gitignore. Fixes: #3422 Signed-off-by: bin <bin@hyper.sh>	2022-01-11 19:29:27 +08:00
yangfeiyu	b133a2368a	runtime: it should rollback when failed in Sandbox AddInterface When Sandbox AddInterface() is called, it may fail after endpoint.HotAttach, we'd better rollback and call save() in the end. Fixes: #3419 Signed-off-by: yangfeiyu <yangfeiyu20102011@163.com>	2022-01-11 18:43:43 +08:00
James O. D. Hunt	7d1a956471	Merge pull request #3415 from fengwang666/protogen-bug-fix agent: fix the broken protobuf generation code	2022-01-11 09:45:24 +00:00
James O. D. Hunt	7f54674834	CI: Revert "CI: Switch to a mirror as gnu.org is down" This reverts commit `321995b7df`. Now that gnu.org is back online, we don't need to use a mirror. Fixes: #3313. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-01-11 09:22:58 +00:00
Feng Wang	c486c2ca18	agent: fix the broken protobuf generation code After the protocols are moved to upper libs (PR3355), the runtime protocol generation is broken. This fixes it. Fixes: #3414 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-01-10 15:37:00 -08:00
Sebastian Hasler	f6cdf46496	docs: Default machine type is q35 meanwhile Fixes: #3412 Signed-off-by: Sebastian Hasler <sebastian.hasler@stuvus.uni-stuttgart.de>	2022-01-10 11:19:35 +01:00
Bin Liu	97e18cf2d0	Merge pull request #3405 from GabyCT/topic/removeobs packaging: Remove obs packages testing for kata 2.0	2022-01-10 11:18:24 +08:00
Gabriela Cervantes	b48322d44e	packaging: Remove obs packages testing for kata 2.0 This PR removes the scripts and the dockerfiles that were used in kata 1.x to test the different kata components for different distributions in OBS. For kata 2.0 we are not generating packages in OBS, so these scripts are no longer used. Fixes #3404 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-01-07 17:06:20 +00:00
GabyCT	e6e5d2593a	Merge pull request #3401 from GabyCT/topic/removedockercomments runtime: Remove docker comments for kata 2.0 configuration.tomls	2022-01-06 11:43:07 -06:00
Gabriela Cervantes	ad16d75c07	runtime: Remove docker comments for kata 2.0 configuration.tomls This PR removes the description of how to use the disable_new_netns configuration with docker: kata 2.0 does not support docker, and this information only applied to kata 1.x. Fixes #3400 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-01-06 16:08:10 +00:00
James O. D. Hunt	66510b977d	Merge pull request #3392 from zhsj/fix-doc docs: fix agent proto file path	2022-01-06 14:31:34 +00:00
snir911	3704f2aadf	Merge pull request #3398 from snir911/2.4.0-alpha1-branch-bump # Kata Containers 2.4.0-alpha1	2022-01-06 11:24:29 +02:00
Snir Sheriber	117fc9c9e9	release: Kata Containers 2.4.0-alpha1 - kata-deploy: fix tar command in dockerfile - vendor: update to containerd v1.6.0-beta.4 - versions: Upgrade to Cloud Hypervisor v20.2 - vc: remove swagger binary - agent: Refactor command line parsing to use a framework - move the oci and protocols crates from agent to upper libs - docs: Remove word duplication - osbuilder: Restore Debian as a rootfs - runtime: fix a typo in kata-collect-data.sh - agent: return detail error message for RPC calls from shim - use-cases: clarify SPDK vhost-user-nvme target status in using-spdk-v… - Delint dockerfiles - Makefile: update `make go-test` call - docs: add how-to on DinD in Kata - agent: Ignore unknown seccomp system calls - agent: mount: Remove unneeded mount_point local variable - docs: Fix outdated links - docs: Fix kernel configs README spelling errors - security: Update rust crate versions - kata-manager: Retrieve static tarball - osbuilder: avoid to copy versions.txt which already deprecated - qemu: Disable libudev for QEMU 5.2 and newer - osbuilder: Add protoc to the alpine container - docs: Clarify where to run agent API generation commands - packaging/qemu: partial git clone - docs: Fix arch doc formatting - CI: Switch to a mirror as gnu.org is down - Split architecture doc into separate files - docs: Update the stable branch strategy - tracing: Add span name to logging error - docs: Update code PR advice document - agent: Add config file option to cli - update container type handling - docs: Update architecture document - runtime: update golang to 1.16 and remove ioutil package - kata-deploy: Deal with empty containerd conf file - src: reorg source code directory - osbuilder: show usage if no options/arguments specified - Upgrade to Cloud Hypervisor v20.1 - image_build: add help info for '-f' option and 'BLOCK_SIZE' env. 
- osbuilder: be runtime consistent with podman build - osbuilder: Revert to using apk.static for Alpine - runtime/template: Handling new attributes for hypervisor config - docs: fix check-markdown test - runtime: correct span name for stopSandbox function - runtime: only call stopVirtiofsd when shared_fs is virtio-fs - snap: read initrd and image distros from version.yaml - versions: Use Ubuntu initrd for non-musl archs - packaging: Fix missing commit message in building kata-runtime - virtcontainers: clh: Upgrade to openapi-generator v5.3.0 - agent: user container ID as watchable storage key for hashmap - runtime: enable vhost-net for rootless hypervisor - packaging: add help information for '-f' option in install_go.sh - Cleanup some unused variables, definitions - Upgrade to Cloud Hypervisor v20.0 - docs: Update limitation document regarding docker swarm - runtime: Enable FUSE_DAX kernel config for DAX - agent: copy empty directories for watchable-bind mounts - runtime: Update comments for virtcontainers to use kata 2.0 - Update rust crate versions - osbuilder: Remove debian as a rootfs `e2c1e65e` kata-deploy: fix tar command in dockerfile `615224e9` agent: move the protocols to upper libs `330e3dcc` agent: move the oci crate to upper libs `7b03d78f` vendor: update to containerd v1.6.0-beta.4 `1f581a04` versions: Upgrade to Cloud Hypervisor v20.2 `623d8f08` docs: Remove word duplication `1c4edb96` agent: Refactor arg parsing to use clap `3093f93a` osbuilder: Restore Debian as a rootfs `073a3459` use-cases: clarify vhost-user-nvme status in using-spdk-vhost-user `2254fa86` runtime: fix a typo in kata-collect-data.sh `2d0f9d2d` vc: remove swagger binary `cf91307c` agent: return detail error message for rpc calls from shim `137e217b` docs: Fix outdated k8s link `55bac67a` docs: Fix kernel configs README spelling errors `205420d2` docs: Replicate branch rename on runtime-spec `91abebf9` agent: mount: Remove unneeded mount_point local variable `b1f4e945` security: 
Update rust crate versions `d79268ac` tools/packaging: add copyright to kata-monitor's Dockerfile `428cf0a6` packaging: delint tests dockerfiles `1ea9b703` packaging: delint kata-deploy dockerfiles `3669e1b6` ci/openshift-ci: delint dockerfiles `aeb2b673` osbuilder: delint dockerfiles `bc120289` packaging: delint kata-monitor dockerfiles `bc71dd58` packaging: delint static-build dockerfiles `99ef52a3` osbuilder: Add protoc to the alpine container `c2578cd9` docs: Clarify where to run agent API generation commands `321995b7` CI: Switch to a mirror as gnu.org is down `fb1989b2` docs: Fix arch doc formatting `2938bb7f` packaging/qemu: Use QEMU script to update submodules `5d49ccd6` packaging/qemu: Use partial git clone `87a219a1` docs: Update the stable branch strategy `d1bc409d` osbuilder: avoid to copy versions.txt which already deprecated `1653dd4a` tracing: Add span name to logging error `12c8e41c` qemu: Disable libudev for QEMU 5.2 and newer `233015a6` docs: Split guest assets details out of arch doc `db411c23` docs: Split k8s info out of arch doc `7ac619b2` docs: Split networking out of arch doc `5df0cb64` docs: Split storage out of arch doc `7229b7a6` docs: Split background and example out of arch doc `283d7d52` docs: Split history out of arch doc `6f9efb40` docs: Move arch doc to separate directory `02608e13` docs: Update code PR advice document `cb5c948a` kata-manager: Retrieve static tarball `51bf9807` docs: Update architecture document `f3a97e94` docs: add how-to on Docker in Kata `7a989a83` runtime: api-test: fixup `52f79aef` utils: update container type handling `5b002f3c` docs: change io/ioutil to io/os packages `03546f75` runtime: change io/ioutil to io/os packages `24a530ce` versions: bump minimum golang version to 1.16.10 `7c4263b3` src: reorg source directories `1a34fbcd` agent: Add config file option to cli `bbfb10e1` versions: Upgrade to Cloud Hypervisor v20.1 `84571506` kata-deploy: Deal with empty containerd conf file `3f7cf7ae` osbuilder: show 
usage if no options/arguments specified `2ebaaac7` osbuilder: be runtime consistent also with podman build `f3103696` docs: fix check-markdown test `2204ecac` versions: Upgrade Alpine, using minor version `dfd0732f` osbuilder: Revert to using apk.static for Alpine `6b3e4c21` image_build: add help info for '-f' option and 'BLOCK_SIZE' env. `b92babf9` runtime/template: Handling new attributes for hypervisor config `40bd34ca` runtime: only call stopVirtiofsd when shared_fs is virtio-fs `33f343ee` runtime: correct span name for stopSandbox function `d7cc952c` versions: Use Ubuntu initrd for non-musl archs `ff929fc0` snap: read initrd and image distros from version.yaml `8fae2631` packaging: Fix missing commit message in building kata-runtime `99530026` virtcontainers: clh: Upgrade to openapi-generator v5.3.0 `b3bcb7b2` runtime: enable vhost-net for rootless hypervisor `7cb7b9d5` agent: remove unused field in mount handling `f6ae1582` agent: drop unused fields from network `4756a04b` virtcontainers: clh: Re-generate the client code `0bf4d257` versions: Upgrade to Cloud Hypervisor v20.0 `647082b2` docs: Update limitation document regarding docker swarm `39b35d00` agent: user container ID as watchable storage key for hashmap `1e6f58e5` packaging: add help information for '-f' option in install_go.sh `2af95bc5` agent: create directories for watchable-bind mounts `6105e3ee` runtime: enable FUSE_DAX kernel config for DAX `591d4af1` runtime: Update comments for virtcontainers to use kata 2.0 `923e098d` osbuilder: Remove debian as a rootfs `afb96c00` agent: Wrap remaining nix errors with anyhow `aba572e0` rustjail: Wrap remaining nix errors with anyhow `30d60078` uevent: Fix clippy issue in test code `4a2be13c` agent: Upgrade nix version for security fix `256d5008` agent: Update crate versions `13257986` agent-ctl: Update rust lockfile `4ebdd424` forwarder: Update rust lockfile `6007322d` agent: Fixed invalid error message `7b356151` agent: Log unknown seccomp system calls 
`7304e52a` Makefile: update `make go-test` call `c66b5668` agent: Ignore unknown seccomp system calls Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-01-06 08:37:28 +02:00
Fabiano Fidêncio	f9b4d0b60e	Merge pull request #3395 from snir911/fix_kata_deploy kata-deploy: fix tar command in dockerfile	2022-01-05 23:42:26 +01:00
Eric Ernst	e073c0936b	Merge pull request #3279 from egernst/containerd-vendor-bump vendor: update to containerd v1.6.0-beta.4	2022-01-05 11:13:05 -08:00
Bo Chen	dca220ad4d	Merge pull request #3384 from likebreath/0104/clh_v20.2 versions: Upgrade to Cloud Hypervisor v20.2	2022-01-05 10:51:55 -08:00
Snir Sheriber	e2c1e65e27	kata-deploy: fix tar command in dockerfile tar params are passed wrongly Fixes: #3394 Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-01-05 20:07:52 +02:00
Shengjing Zhu	905e124b77	docs: fix agent proto file path Fixes: #3391 Signed-off-by: Shengjing Zhu <zhsj@debian.org>	2022-01-06 00:22:49 +08:00
Bin Liu	94f14cf6f7	Merge pull request #3363 from zhsj/remove-binary vc: remove swagger binary	2022-01-05 20:40:33 +08:00
Bin Liu	f622d9491f	Merge pull request #3253 from stevenhorsman/agent-config-cmdline agent: Refactor command line parsing to use a framework	2022-01-05 20:25:57 +08:00
Bin Liu	59ec112337	Merge pull request #3355 from lifupan/main move the oci and protocols crates from agent to upper libs	2022-01-05 20:19:59 +08:00
Fupan Li	615224e993	agent: move the protocols to upper libs move the protocols to upper libs thus it can be shared between agent and other rust runtime. Depends-on: github.com/kata-containers/tests#4306 Fixes: #3348 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2022-01-05 16:58:06 +08:00
Fupan Li	330e3dcc93	agent: move the oci crate to upper libs Move the oci crate to upper libs thus it can be shared between agent and other rust runtimes. Fixes: #3348 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2022-01-05 16:58:06 +08:00
Bin Liu	3339ba90cf	Merge pull request #3382 from GabyCT/topic/updateupgradingdoc docs: Remove word duplication	2022-01-05 14:50:26 +08:00
Bin Liu	b2166560fa	Merge pull request #3375 from zhaojizhuang/debianrootfs osbuilder: Restore Debian as a rootfs	2022-01-05 10:27:47 +08:00
Eric Ernst	7b03d78f15	vendor: update to containerd v1.6.0-beta.4 Update our containerd vendoring. In particular, we're interested in grabbing the updated annotation definitions for defining sandbox sizing. - go get github.com/containerd/containerd@v1.6.0-beta.4 - edit go.mod to remove containerd v1.5.8 replacement directive - go mod vendor - go mod tidy Fixes: #3276 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-01-04 17:15:17 -08:00
GabyCT	caa4e89dfc	Merge pull request #3366 from Kvasscn/kata_dev_fix_kata-collect-data_typo runtime: fix a typo in kata-collect-data.sh	2022-01-04 17:03:34 -06:00
Bo Chen	1f581a0405	versions: Upgrade to Cloud Hypervisor v20.2 This is a bug release from Cloud Hypervisor addressing the following issues: 1) Don't error out when setting up the SIGWINCH handler (for console resize) when this fails due to older kernel; 2) Seccomp rules were refined to remove syscalls that are now unused; 3) Fix reboot on older host kernels when SIGWINCH handler was not initialised; 4) Fix virtio-vsock blocking issue. Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v20.2 Fixes: #3383 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-01-04 14:37:35 -08:00
Gabriela Cervantes	623d8f086a	docs: Remove word duplication This PR removes a word duplication in the Upgrading documentation. Fixes #3381 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-01-04 15:58:50 +00:00
James O. D. Hunt	a838a598ef	Merge pull request #3354 from liubin/fix/3353-return-error-details agent: return detail error message for RPC calls from shim	2022-01-04 14:06:25 +00:00
stevenhorsman	1c4edb9619	agent: Refactor arg parsing to use clap Fixes: #3284 Co-authored-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2022-01-04 09:14:08 +00:00
zhaojizhuang	3093f93a6f	osbuilder: Restore Debian as a rootfs Restore Debian as a rootfs. 1. revert of #3154, but some change 2. update debian version to 10.11 3. update `libstdc++-6-dev` to `libstdc++-8-dev` 4. changes discarded in QAT are not restored Fixes: #3372 Signed-off-by: zhaojizhuang <571130360@qq.com>	2022-01-04 11:54:34 +08:00
Bin Liu	883b0d1dc3	Merge pull request #2840 from optimistyzy/1014_fix_vhost_nvme use-cases: clarify SPDK vhost-user-nvme target status in using-spdk-v…	2022-01-04 11:42:15 +08:00
Ziye Yang	073a345908	use-cases: clarify vhost-user-nvme status in using-spdk-vhost-user The SPDK vhost-user-nvme target was removed in the SPDK 21.07 release because upstream QEMU does not support it. Fix the documented usage accordingly. Fixes #3371 Signed-off-by: Ziye Yang <ziye.yang@intel.com>	2021-12-31 02:24:59 +00:00
Fupan Li	ea1a173854	agent: fix the issue of creating new namespaces for agent Tokio's spawn only creates a future (an async task), not a new real thread, so calling unshare to create a new namespace inside a tokio async task would make the agent process join the newly created namespace, which is not expected. It is therefore better to do the unshare in a real thread, to prevent moving the agent process into a new namespace. Fixes: #3369 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2021-12-30 13:32:22 +08:00
Wainer Moschetta	820dc930db	Merge pull request #3109 from wainersm/delint_dockerfiles Delint dockerfiles	2021-12-28 10:11:51 -03:00
zhanghj	2254fa8657	runtime: fix a typo in kata-collect-data.sh Fix a typo in the check for whether the mountpoint exists. Fixes: #3365 Signed-off-by: zhanghj <zhanghj.lc@inspur.com>	2021-12-28 10:03:18 +08:00
Shengjing Zhu	2d0f9d2d06	vc: remove swagger binary Fixes: #3362 Signed-off-by: Shengjing Zhu <zhsj@debian.org>	2021-12-25 22:41:29 +08:00
bin	cf91307c66	agent: return detail error message for rpc calls from shim For calls from shim to agent, the returned error is processed like this: match self.do_start_container(req).await { Err(e) => Err(ttrpc_error(ttrpc::Code::INTERNAL, e.to_string())), Ok(_) => Ok(Empty::new()), } The e.to_string() call returns only part of the error (for example, the message set by context()), which may lead to a lack of information. Using `format!("{:?}", err)` returns more info. Fixes: #3353 Signed-off-by: bin <bin@hyper.sh>	2021-12-24 17:17:29 +08:00
Fupan Li	0fe20854e7	Merge pull request #2481 from Bevisy/main-1494 Makefile: update `make go-test` call	2021-12-24 09:57:06 +08:00
James O. D. Hunt	302c7c34f3	Merge pull request #3137 from t3hmrman/docs/2474-add-dind-how-to docs: add how-to on DinD in Kata	2021-12-23 12:24:36 +00:00
James O. D. Hunt	ba22a04265	Merge pull request #2958 from ManaSugi/ignore-unknown-systemcall agent: Ignore unknown seccomp system calls	2021-12-23 12:12:47 +00:00
Peng Tao	8b6fbf9108	Merge pull request #3331 from dubek/mount-remove-var agent: mount: Remove unneeded mount_point local variable	2021-12-23 11:53:14 +08:00
Peng Tao	65343b3fdc	Merge pull request #3337 from Jakob-Naucke/cgroups-main docs: Fix outdated links	2021-12-23 11:40:32 +08:00
Peng Tao	08367643dc	Merge pull request #3339 from Jakob-Naucke/spell-kernel-readme docs: Fix kernel configs README spelling errors	2021-12-23 11:40:09 +08:00
Jakob Naucke	137e217b85	docs: Fix outdated k8s link in virtcontainers readme Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-12-22 19:40:25 +01:00
Jakob Naucke	55bac67ac6	docs: Fix kernel configs README spelling errors - `fragments` in backticks - s/perfoms/performs/ Fixes: #3338 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-12-22 18:57:47 +01:00
Jakob Naucke	205420d21b	docs: Replicate branch rename on runtime-spec renamed branch `master` to `main` Fixes: #3336 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-12-22 18:15:01 +01:00
Fabiano Fidêncio	562fc73769	Merge pull request #3297 from jodh-intel/cargo-audit-fixes security: Update rust crate versions	2021-12-22 16:10:10 +01:00
Dov Murik	91abebf92e	agent: mount: Remove unneeded mount_point local variable We already have a `mount_path` local Path variable which holds the mount point. Use it instead of creating a new `mount_point` variable with identical type and content. Fixes: #3332 Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>	2021-12-22 14:11:50 +02:00
James O. D. Hunt	b1f4e945b3	security: Update rust crate versions Update the rust dependencies that have upstream security fixes. Issues fixed by this change: - [`RUSTSEC-2020-0002`](https://rustsec.org/advisories/RUSTSEC-2020-0002) (`prost` crate) - [`RUSTSEC-2020-0036`](https://rustsec.org/advisories/RUSTSEC-2020-0036) (`failure` crate) - [`RUSTSEC-2021-0073`](https://rustsec.org/advisories/RUSTSEC-2021-0073) (`prost-types` crate) - [`RUSTSEC-2021-0119`](https://rustsec.org/advisories/RUSTSEC-2021-0119) (`nix` crate) This change also includes: - Minor code changes for the new version of `prometheus` for the agent. - A downgrade of the version of the `futures` crate to the (new) latest version (`0.3.17`) since version `0.3.18` was removed [1]. Fixes: #3296. [1] - See https://crates.io/crates/futures/versions Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-12-22 07:41:16 +00:00
Fabiano Fidêncio	ee66155a72	Merge pull request #3271 from Jakob-Naucke/kata-manager-static kata-manager: Retrieve static tarball	2021-12-21 16:09:50 +01:00
Fabiano Fidêncio	67f0ab4092	Merge pull request #3294 from Kvasscn/kata_dev_osbuilder_makefile osbuilder: avoid to copy versions.txt which already deprecated	2021-12-21 16:07:01 +01:00
Wainer dos Santos Moschetta	d79268ac65	tools/packaging: add copyright to kata-monitor's Dockerfile The kata-monitor's Dockerfile was added by Eric Ernst in commit `2f1cb7995f`, but for some reason the static checker did not catch that the file was missing the copyright statement at the time it was added. It is now complaining about it, so this assigns the copyright to him to make the static checker happy. Fixes #3329 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2021-12-21 10:01:11 -05:00
Fabiano Fidêncio	79153c3845	Merge pull request #3288 from gkurz/qemu-disable-libudev qemu: Disable libudev for QEMU 5.2 and newer	2021-12-21 15:56:16 +01:00
Wainer dos Santos Moschetta	428cf0a685	packaging: delint tests dockerfiles Removed all errors/warnings pointed out by hadolint version 2.7.0, except for the following ignored rules: - "DL3008 warning: Pin versions in apt get install" - "DL3041 warning: Specify version with `dnf install -y <package>-<version>`" - "DL3033 warning: Specify version with `yum install -y <package>-<version>`" - "DL3048 style: Invalid label key" - "DL3003 warning: Use WORKDIR to switch to a directory" - "DL3018 warning: Pin versions in apk add. Instead of apk add <package> use apk add <package>=<version>" - "DL3037 warning: Specify version with zypper install -y <package>[=]<version>" Fixes #3107 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2021-12-21 09:54:44 -05:00
Wainer dos Santos Moschetta	1ea9b70383	packaging: delint kata-deploy dockerfiles Removed all errors/warnings pointed out by hadolint version 2.7.0, except for the following ignored rules: - "DL3008 warning: Pin versions in apt get install" - "DL3041 warning: Specify version with `dnf install -y <package>-<version>`" - "DL3033 warning: Specify version with `yum install -y <package>-<version>`" - "DL3048 style: Invalid label key" - "DL3003 warning: Use WORKDIR to switch to a directory" - "DL3018 warning: Pin versions in apk add. Instead of apk add <package> use apk add <package>=<version>" - "DL3037 warning: Specify version with zypper install -y <package>[=]<version>" Fixes #3107 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2021-12-21 09:54:44 -05:00
Wainer dos Santos Moschetta	3669e1b6d9	ci/openshift-ci: delint dockerfiles Removed all errors/warnings pointed out by hadolint version 2.7.0, except for the following ignored rules: - "DL3008 warning: Pin versions in apt get install" - "DL3041 warning: Specify version with `dnf install -y <package>-<version>`" - "DL3033 warning: Specify version with `yum install -y <package>-<version>`" - "DL3048 style: Invalid label key" - "DL3003 warning: Use WORKDIR to switch to a directory" - "DL3018 warning: Pin versions in apk add. Instead of apk add <package> use apk add <package>=<version>" - "DL3037 warning: Specify version with zypper install -y <package>[=]<version>" Fixes #3107 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2021-12-21 09:54:44 -05:00
Wainer dos Santos Moschetta	aeb2b673b3	osbuilder: delint dockerfiles Removed all errors/warnings pointed out by hadolint version 2.7.0, except for the following ignored rules: - "DL3008 warning: Pin versions in apt get install" - "DL3041 warning: Specify version with `dnf install -y <package>-<version>`" - "DL3033 warning: Specify version with `yum install -y <package>-<version>`" - "DL3048 style: Invalid label key" - "DL3003 warning: Use WORKDIR to switch to a directory" - "DL3018 warning: Pin versions in apk add. Instead of apk add <package> use apk add <package>=<version>" - "DL3037 warning: Specify version with zypper install -y <package>[=]<version>" Fixes #3107 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2021-12-21 09:54:44 -05:00
Wainer dos Santos Moschetta	bc120289ec	packaging: delint kata-monitor dockerfiles Removed all errors/warnings pointed out by hadolint version 2.7.0, except for the following ignored rules: - "DL3008 warning: Pin versions in apt get install" - "DL3041 warning: Specify version with `dnf install -y <package>-<version>`" - "DL3033 warning: Specify version with `yum install -y <package>-<version>`" - "DL3048 style: Invalid label key" - "DL3003 warning: Use WORKDIR to switch to a directory" - "DL3018 warning: Pin versions in apk add. Instead of apk add <package> use apk add <package>=<version>" - "DL3037 warning: Specify version with zypper install -y <package>[=]<version>" Fixes #3107 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2021-12-21 09:54:44 -05:00
Wainer dos Santos Moschetta	bc71dd5812	packaging: delint static-build dockerfiles Removed all errors/warnings pointed out by hadolint version 2.7.0, except for the following ignored rules: - "DL3008 warning: Pin versions in apt get install" - "DL3041 warning: Specify version with `dnf install -y <package>-<version>`" - "DL3033 warning: Specify version with `yum install -y <package>-<version>`" - "DL3048 style: Invalid label key" - "DL3003 warning: Use WORKDIR to switch to a directory" - "DL3018 warning: Pin versions in apk add. Instead of apk add <package> use apk add <package>=<version>" - "DL3037 warning: Specify version with zypper install -y <package>[=]<version>" Fixes #3107 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2021-12-21 09:54:41 -05:00
Fabiano Fidêncio	aa7ba1741b	Merge pull request #3324 from fidencio/wip/add-protoc-to-alpine-image osbuilder: Add protoc to the alpine container	2021-12-21 15:52:25 +01:00
Fabiano Fidêncio	99ef52a35d	osbuilder: Add protoc to the alpine container It seems the lack of protoc in the alpine containers is causing issues with some of our CIs, such as the VFIO one. Fixes: #3323 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-12-21 13:57:18 +01:00
Archana Shinde	ae271a7e7b	Merge pull request #3318 from jodh-intel/docs-agent-protoc docs: Clarify where to run agent API generation commands	2021-12-21 00:28:01 -08:00
Peng Tao	b990868b11	Merge pull request #3302 from wainersm/static_qemu-partial_clone packaging/qemu: partial git clone	2021-12-21 10:52:49 +08:00
James O. D. Hunt	c2578cd9a1	docs: Clarify where to run agent API generation commands Make it clear when reading the table in the agent's "Change the agent API" documentation that the commands in the "Generation method" column should be run in the agent repo. Fixes: #3317. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-12-20 15:45:36 +00:00
James O. D. Hunt	464d1a653e	Merge pull request #3312 from jodh-intel/docs-arch-fix-formatting docs: Fix arch doc formatting	2021-12-20 14:04:36 +00:00
James O. D. Hunt	cd20bf95e9	Merge pull request #3315 from jodh-intel/ci-use-mirror-for-gnu.org CI: Switch to a mirror as gnu.org is down	2021-12-20 11:53:14 +00:00
James O. D. Hunt	321995b7df	CI: Switch to a mirror as gnu.org is down All CI jobs are failing as www.gnu.org is down, so switch to a mirror for the time being. Fixes: #3314. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-12-20 11:22:56 +00:00
James O. D. Hunt	fb1989b27a	docs: Fix arch doc formatting PR #3298 failed to move the named link for the debug console to the `guest-assets.md` meaning the debug console cells in the "User accessible" column in the table in the "Root filesystem image" section do not work as a link. Fixes: #3311. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-12-20 10:33:48 +00:00
James O. D. Hunt	2ebae2d279	Merge pull request #3287 from jodh-intel/docs-split-arch-doc Split architecture doc into separate files	2021-12-20 10:11:30 +00:00
Julio Montes	e329dcf2ff	Merge pull request #3299 from fidencio/wip/update-stable-branch-strategy docs: Update the stable branch strategy	2021-12-17 13:29:10 -06:00
Chelsea Mafrica	e4c0b71e40	Merge pull request #3290 from cmaf/tracing-span-logging-error tracing: Add span name to logging error	2021-12-17 11:13:41 -08:00
Jakob Naucke	7fdb425918	Merge pull request #3286 from zmlcc/pr-advice-expect-211216 docs: Update code PR advice document	2021-12-17 15:35:05 +01:00
Wainer dos Santos Moschetta	2938bb7f89	packaging/qemu: Use QEMU script to update submodules Currently QEMU's submodules are git cloned but there is the scripts/git-submodule.sh which is meant for that. Let's use that script. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2021-12-17 10:20:59 -03:00
Wainer dos Santos Moschetta	5d49ccd613	packaging/qemu: Use partial git clone The static build of QEMU spends a good amount of time cloning the source tree because we do a full git clone. To speed up that operation, this changes the Dockerfile so that it carries out a partial clone by using the --depth=1 argument. Fixes #3291 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2021-12-17 10:20:29 -03:00
Fabiano Fidêncio	87a219a1c9	docs: Update the stable branch strategy At the last architecture committee meeting, held on December 14th 2021, we reached the agreement that minor releases will be cut once every 16 weeks (instead of 12), and that patch releases will be cut every 4 weeks (instead of 3). Fixes: #3298 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-12-17 13:48:26 +01:00
zhanghj	d1bc409d57	osbuilder: avoid to copy versions.txt which already deprecated Currently the versions.txt in rootfs-builder dir is already removed, so avoid to copy it in list of helper files. Fixes: #3267 Signed-off-by: zhanghj <zhanghj.lc@inspur.com>	2021-12-17 17:23:05 +08:00
Chelsea Mafrica	1653dd4a30	tracing: Add span name to logging error Add span name to logging error to help with debugging when the context is not set before the span is created. Fixes #3289 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2021-12-16 12:44:42 -08:00
Greg Kurz	12c8e41c75	qemu: Disable libudev for QEMU 5.2 and newer Commit `112ea25859` disabled libudev for static builds because it was breaking snap. It turns out that the only users of libudev in QEMU are qemu-pr-helper and USB. Kata already disables USB and doesn't use qemu-pr-helper. Disable libudev for all builds if QEMU supports it, i.e. version 5.2 or newer. Fixes #3078 Signed-off-by: Greg Kurz <groug@kaod.org>	2021-12-16 16:12:02 +01:00
James O. D. Hunt	233015a6d9	docs: Split guest assets details out of arch doc Move the guest assets details out of the architecture doc and into a separate file. Fixes: #3246. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-12-16 14:18:49 +00:00
James O. D. Hunt	db411c23e8	docs: Split k8s info out of arch doc Move the Kubernetes information out of the architecture doc and into a separate file. Partially fixes: #3246. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-12-16 14:18:47 +00:00
James O. D. Hunt	7ac619b24e	docs: Split networking out of arch doc Move the networking details out of the architecture doc and into a separate file. Partially fixes: #3246. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-12-16 14:18:45 +00:00
James O. D. Hunt	5df0cb6420	docs: Split storage out of arch doc Move the storage details in the architecture doc to a separate file. Partially fixes: #3246. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-12-16 14:18:41 +00:00
James O. D. Hunt	7229b7a69d	docs: Split background and example out of arch doc Move the background and example command details out of the architecture doc and into separate files. Partially fixes: #3246. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-12-16 14:18:38 +00:00
James O. D. Hunt	283d7d52c8	docs: Split history out of arch doc Move the historical details out of the architecture doc and into a separate file. Partially fixes: #3246. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-12-16 14:17:59 +00:00
James O. D. Hunt	6f9efb4043	docs: Move arch doc to separate directory Move the architecture document into a new `docs/design/architecture/` directory in preparation for splitting it into more manageable pieces. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-12-16 12:26:17 +00:00
Zack	02608e13ab	docs: Update code PR advice document Allow using `expect()` for `Mutex.lock()` because a failure to acquire the lock is almost always unrecoverable. Fixes: #3285 Signed-off-by: Zack <zmlcc@linux.alibaba.com>	2021-12-16 19:23:17 +08:00
Steve Horsman	39cf2b27c1	Merge pull request #3261 from stevenhorsman/native-agent-config-opt agent: Add config file option to cli	2021-12-16 10:00:56 +00:00
Eric Ernst	3865a1bcf6	Merge pull request #2918 from egernst/update-container-type-handling update container type handling	2021-12-15 10:41:23 -08:00
Eric Ernst	32d62c85c2	Merge pull request #3195 from jodh-intel/docs-update-architecture docs: Update architecture document	2021-12-15 09:25:20 -08:00
Jakob Naucke	cb5c948a0a	kata-manager: Retrieve static tarball In `utils/kata-manager.sh`, we download the first asset listed for the release, which used to be the static x86_64 tarball. If that happened to not match the system architecture, we would abort. Besides that logic being invalid for !x86_64 (despite not distributing other tarballs at the moment), the first asset listed is also not the static tarball any more, it is the vendored source tarball. Retrieve all _static_ tarballs and select the appropriate one depending on architecture. Fixes: #3254 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-12-15 14:34:14 +01:00
James O. D. Hunt	51bf98073d	docs: Update architecture document Refresh the content and formatting of the architecture document. Out of scope of these changes: - Diagram updates. - Updates to the Networking section. Fixes: #3190. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-12-15 10:46:46 +00:00
Jakob Naucke	a40e4877e9	Merge pull request #3266 from liubin/fix/3265-update-golang-to-1.16-and-remove-ioutil runtime: update golang to 1.16 and remove ioutil package	2021-12-15 10:09:23 +01:00
vados	f3a97e94b2	docs: add how-to on Docker in Kata Add documentation on how to use Docker in Docker Fixes: #2474 Signed-off-by: vados <vados@vadosware.io>	2021-12-15 12:43:58 +09:00
Eric Ernst	7a989a8333	runtime: api-test: fixup not clear why this was commented out before -- ensure that we set appropriate annotation on the sandbox container's annotations to indicate this is a sandbox. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-12-14 18:55:18 -08:00
Eric Ernst	52f79aef91	utils: update container type handling Today we assume that if the CRI/upper layer doesn't provide a container type annotation, it should be treated as a sandbox. Up to this point, a sandbox with a pause container in CRI context and a single container (ala ctr run) are treated the same. For VM sizing and container constraining, it'll be useful to know if this is a sandbox or if this is a single container. In updating this, we clean up the type handling tests and we update the containerd annotations vendoring. Fixes: #2926 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-12-14 17:59:19 -08:00
bin	5b002f3c88	docs: change io/ioutil to io/os packages Change io/ioutil to io/os packages because io/ioutil package is deprecated from 1.16: TempDir => os.MkdirTemp Details: https://go.dev/doc/go1.16#ioutil Fixes: #3265 Signed-off-by: bin <bin@hyper.sh>	2021-12-15 07:31:57 +08:00
bin	03546f75a6	runtime: change io/ioutil to io/os packages Change io/ioutil to io/os packages because io/ioutil package is deprecated from 1.16: Discard => io.Discard NopCloser => io.NopCloser ReadAll => io.ReadAll ReadDir => os.ReadDir ReadFile => os.ReadFile TempDir => os.MkdirTemp TempFile => os.CreateTemp WriteFile => os.WriteFile Details: https://go.dev/doc/go1.16#ioutil Fixes: #3265 Signed-off-by: bin <bin@hyper.sh>	2021-12-15 07:31:48 +08:00
Julio Montes	aaac742762	Merge pull request #207 from devimc/2021-12-14/fixBlockdevReadonly qemu: update readonly flag for block devices	2021-12-14 13:30:47 -06:00
Jakob Naucke	70274b9d39	Merge pull request #3258 from fidencio/wip/kata-deploy-count-with-a-non-existend-containerd-config-file kata-deploy: Deal with empty containerd conf file	2021-12-14 20:14:41 +01:00
Julio Montes	b17f07395c	qemu: update readonly flag for block devices since qemu 6.0, the readonly flag for block devices must be enabled or disabled with `on` or `off` respectively. Signed-off-by: Julio Montes <julio.montes@intel.com>	2021-12-14 11:55:19 -06:00
Bin Liu	6c34446f49	Merge pull request #3244 from bergwolf/reorg-code src: reorg source code directory	2021-12-14 21:57:07 +08:00
bin	24a530ced1	versions: bump minimum golang version to 1.16.10 According to https://endoflife.date/go golang 1.11.10 is not supported anymore, 1.16.10 is the minimum supported version. Fixes: #3265 Signed-off-by: bin <bin@hyper.sh>	2021-12-14 17:03:53 +08:00
Tim Zhang	4f96ea4e2b	Merge pull request #3257 from liubin/fix/3256-show-usage-if-no-arguments-specified osbuilder: show usage if no options/arguments specified	2021-12-14 11:41:06 +08:00
Peng Tao	7c4263b3e1	src: reorg source directories To make the code directory structure more clear: └── src ├── agent ├── libs │ └── logging ├── runtime ├── runtime-rs (to be added) └── tools ├── agent-ctl └── trace-forwarder Fixes: #3204 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2021-12-14 10:30:08 +08:00
stevenhorsman	1a34fbcdbd	agent: Add config file option to cli - Add option to pass in config with -c/--config Fixes: #3252 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2021-12-13 21:57:23 +00:00
Bo Chen	9d13d1b208	Merge pull request #3263 from likebreath/1213/clh_v20.1 Upgrade to Cloud Hypervisor v20.1	2021-12-13 12:51:27 -08:00
Bo Chen	bbfb10e169	versions: Upgrade to Cloud Hypervisor v20.1 This is a bug release from Cloud Hypervisor addressing the following issues: 1) Networking performance regression with virtio-net; 2) Limit file descriptors sent in vfio-user support; 3) Fully advertise PCI MMIO config regions in ACPI tables; 4) Set the TSS and KVM identity maps so they don't overlap with firmware RAM; 5) Correctly update the DeviceTree on restore. Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v20.1 Fixes: #3262 Signed-off-by: Bo Chen <chen.bo@intel.com>	2021-12-13 10:09:44 -08:00
Fabiano Fidêncio	8457150684	kata-deploy: Deal with empty containerd conf file As containerd can properly run without having an existing `/etc/containerd/config.toml` file (it'd run using the default configuration), let's explicitly create the file in those cases. This will avoid issues on amending runtime classes to a non-existent file. Fixes: #3229 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Tested-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-12-13 11:20:22 +01:00
bin	3f7cf7ae67	osbuilder: show usage if no options/arguments specified Now if no options/arguments specified, the shell scripts will return an error: ERROR: Invalid rootfs directory: '' This commit will show usage if no options/arguments specified. Fixes: #3256 Signed-off-by: bin <bin@hyper.sh>	2021-12-13 16:10:55 +08:00
Bin Liu	978b13c9e8	Merge pull request #3235 from Kvasscn/kata_dev_image_builer_help image_build: add help info for '-f' option and 'BLOCK_SIZE' env.	2021-12-09 22:55:24 +08:00
Julio Montes	70062e1563	Merge pull request #3238 from snir911/wip/build_with_runtime osbuilder: be runtime consistent with podman build	2021-12-09 08:06:00 -06:00
Fabiano Fidêncio	c868172510	Merge pull request #3222 from Jakob-Naucke/apk-static osbuilder: Revert to using apk.static for Alpine	2021-12-09 13:33:35 +01:00
Fabiano Fidêncio	602d87295b	Merge pull request #3226 from liubin/fix/3193-fill-hypervisorconfig runtime/template: Handling new attributes for hypervisor config	2021-12-09 13:29:23 +01:00
Snir Sheriber	2ebaaac73d	osbuilder: be runtime consistent also with podman build Use the same runtime used for podman run also for the podman build cmd Additionally remove "docker" from the docker_run_args variable Fixes: #3239 Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2021-12-09 11:28:16 +02:00
Fabiano Fidêncio	251be90dc0	Merge pull request #3241 from devimc/2021-12-06/fixCheckMarkdown docs: fix check-markdown test	2021-12-09 08:16:57 +01:00
Julio Montes	f310369698	docs: fix check-markdown test Unit-Test-Advice.md was moved to kata-containers repo but URLs pointing to that document were not updated. This patch updates these URLs. Depends-on: github.com/kata-containers/tests#4273 fixes #3240 Signed-off-by: Julio Montes <julio.montes@intel.com>	2021-12-08 14:38:12 -06:00
Jakob Naucke	2204ecac39	versions: Upgrade Alpine, using minor version - Upgrade Alpine guest rootfs to 3.15 - Specify a minor version rather than patch level as the Alpine repositories use that. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-12-08 15:18:44 +01:00
Jakob Naucke	dfd0732ff9	osbuilder: Revert to using apk.static for Alpine #2399 partially reverted #418, missing on returning to bootstrapping a rootfs with `apk.static` instead of copying the entire root, which can result in drastically larger (more than 10x) images. Revert this as well (requires some updates to URL building). Fixes: #3216 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-12-08 15:18:43 +01:00
zhanghj	6b3e4c212c	image_build: add help info for '-f' option and 'BLOCK_SIZE' env. The help information of '-f' option is missing, and same issue with 'BLOCK_SIZE' env variables, fix it in usage() function. Fixes: #3231 Signed-off-by: zhanghj <zhanghj.lc@inspur.com>	2021-12-08 17:33:07 +08:00
yuanke wei	b5b9de1de9	kata-deploy: Update API Version of RuntimeClass to v1 API Version of node.k8s.io/v1beta1 is deprecated in v1.22+, unavailable in v1.25+ Fixes: #3185 Signed-off-by: yuanke wei <yuanke.wyk@alibaba-inc.com>	2021-12-08 14:18:57 +08:00
Chelsea Mafrica	7522109abc	Merge pull request #3218 from liubin/fix/3217-fix-span-name runtime: correct span name for stopSandbox function	2021-12-07 16:36:14 -08:00
Julio Montes	712c5ac6ba	Merge pull request #3220 from liubin/fix/3219-stop-virtiofsd-when-needed runtime: only call stopVirtiofsd when shared_fs is virtio-fs	2021-12-07 07:51:08 -06:00
bin	b92babf91b	runtime/template: Handling new attributes for hypervisor config Some new attributes are added to hypervisor config: - VMStorePath - RunStorePath - SharedPath These attributes should be handled in two places: - reset when check the new hypervisor's config is suitable to the base config. - copy from new hypervisor's config when create new VM Fixes: #3193 Signed-off-by: bin <bin@hyper.sh>	2021-12-07 19:31:03 +08:00
Fabiano Fidêncio	1a7fcd0583	Merge pull request #3211 from devimc/2021-11-06/snap/readVerFromYaml snap: read initrd and image distros from version.yaml	2021-12-07 09:07:10 +01:00
bin	40bd34caaf	runtime: only call stopVirtiofsd when shared_fs is virtio-fs If shared_fs is set to virtio-9p, the virtiofsd is not started, so there is no need to stop it. Fixes: #3219 Signed-off-by: bin <bin@hyper.sh>	2021-12-07 16:06:26 +08:00
bin	33f343ee08	runtime: correct span name for stopSandbox function Normally the span name should be the same as function name, so change `StopVM` to `stopSandbox`. Fixes: #3217 Signed-off-by: bin <bin@hyper.sh>	2021-12-07 15:59:18 +08:00
Fabiano Fidêncio	e091409404	Merge pull request #3213 from Jakob-Naucke/ppc64le-s390x-ubuntu-initrd versions: Use Ubuntu initrd for non-musl archs	2021-12-06 22:52:53 +01:00
Jakob Naucke	d7cc952cb1	versions: Use Ubuntu initrd for non-musl archs ppc64le & s390x have no (well supported) musl target for Rust, therefore, the agent must use glibc and cannot use Alpine. Specify Ubuntu as the distribution to be used for initrd. Fixes: #3212 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-12-06 17:13:38 +01:00
Julio Montes	ff929fc081	snap: read initrd and image distros from version.yaml Build initrd or image rootfs using the distro name specified in the versions.yaml fixes #3208 Signed-off-by: Julio Montes <julio.montes@intel.com>	2021-12-06 08:42:07 -06:00
Bin Liu	ce75785d87	Merge pull request #3197 from Bevisy/main-3196 packaging: Fix missing commit message in building kata-runtime	2021-12-06 11:37:29 +08:00
Binbin Zhang	8fae263170	packaging: Fix missing commit message in building kata-runtime add `git` package to the shim-v2 build image Fixes: #3196 Signed-off-by: Binbin Zhang <binbin36520@gmail.com>	2021-12-04 11:59:59 +08:00
Eric Ernst	c14080fd08	Merge pull request #3200 from likebreath/1203/upgrade_openapi_generator virtcontainers: clh: Upgrade to openapi-generator v5.3.0	2021-12-03 14:15:51 -08:00
Bo Chen	995300260e	virtcontainers: clh: Upgrade to openapi-generator v5.3.0 The latest release of openapi-generator v5.3.0 contains the fix for `dropping err` bug [1]. This patch also re-generated the client code of Cloud Hypervisor to have the bug fixed. [1] https://github.com/OpenAPITools/openapi-generator/pull/10275 Fixes: #3201 Signed-off-by: Bo Chen <chen.bo@intel.com>	2021-12-03 08:55:38 -08:00
Carlos Venegas	d02a0932d6	Merge pull request #3173 from liubin/fix/3172 agent: user container ID as watchable storage key for hashmap	2021-12-03 09:35:32 -06:00
Fabiano Fidêncio	3fdc97e110	Merge pull request #3183 from fengwang666/nonroot-vhost-bug-fix runtime: enable vhost-net for rootless hypervisor	2021-12-03 10:42:50 +01:00
Bin Liu	86d9d2eed5	Merge pull request #3169 from Kvasscn/kata_dev_add_install_go_help packaging: add help information for '-f' option in install_go.sh	2021-12-03 14:39:05 +08:00
Feng Wang	b3bcb7b251	runtime: enable vhost-net for rootless hypervisor vhost-net is disabled in the rootless kata runtime feature, which has been abandoned since kata 2.0. I reused the rootless flag for nonroot hypervisor and would like to enable vhost-net. Fixes #3182 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2021-12-02 21:55:31 -08:00
Bin Liu	4b57548838	Merge pull request #3181 from egernst/topic/clean-lint Cleanup some unused variables, definitions	2021-12-03 11:06:42 +08:00
Eric Ernst	7cb7b9d5ba	agent: remove unused field in mount handling In our parsing of mountinfo, majority of the fields are unused. Let's stop saving these. Fixes: #3180 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-12-02 17:03:46 -08:00
Eric Ernst	f6ae15826e	agent: drop unused fields from network We don't utilize routes or interface vectors. Let's drop them. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-12-02 17:03:41 -08:00
Chelsea Mafrica	cb4bf486ef	Merge pull request #3179 from likebreath/1202/clh_v20.0 Upgrade to Cloud Hypervisor v20.0	2021-12-02 15:31:14 -08:00
Bo Chen	4756a04b2d	virtcontainers: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v19.0. Note: The client code of cloud-hypervisor's (CLH) OpenAPI is automatically generated by openapi-generator [1-2]. [1] https://github.com/OpenAPITools/openapi-generator [2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md Signed-off-by: Bo Chen <chen.bo@intel.com>	2021-12-02 12:09:12 -08:00
Bo Chen	0bf4d2578a	versions: Upgrade to Cloud Hypervisor v20.0 Highlights from the Cloud Hypervisor release v20.0: 1) Multiple PCI segments support (now support up to 496 PCI devices); 2) CPU pinning; 3) Improved VFIO support; 4) Safer code; 5) Extended documentation; 6) Bug fixes. Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v20.0 Fixes: #3178 Signed-off-by: Bo Chen <chen.bo@intel.com>	2021-12-02 12:09:05 -08:00
GabyCT	6edddcced9	Merge pull request #3175 from GabyCT/topic/limitations docs: Update limitation document regarding docker swarm	2021-12-02 12:03:36 -06:00
Gabriela Cervantes	647082b2c8	docs: Update limitation document regarding docker swarm This PR removes the information about docker swarm and docker compose, as kata 2.0 currently has no support for docker swarm and docker compose, and the links and references that the document refers to are currently not part of kata 1.0 Fixes #3174 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2021-12-02 16:38:13 +00:00
bin	39b35d0073	agent: use container ID as watchable storage key for hashmap Using the sandbox ID as the key causes the storage of failed containers to leak. Fixes: #3172 Signed-off-by: bin <bin@hyper.sh>	2021-12-02 23:28:25 +08:00
Bin Liu	4895015eac	Merge pull request #3166 from fengwang666/dax-bug-fix runtime: Enable FUSE_DAX kernel config for DAX	2021-12-02 16:08:06 +08:00
zhanghj	1e6f58e562	packaging: add help information for '-f' option in install_go.sh add help info for force install, and remove unused '-p' option. Fixes: #3168 Signed-off-by: zhanghj <zhanghj.lc@inspur.com>	2021-12-02 02:58:12 -05:00
Bin Liu	3992d28f00	Merge pull request #3152 from liubin/fix/3140-create-empty-dir agent: copy empty directories for watchable-bind mounts	2021-12-02 14:46:25 +08:00
bin	2af95bc536	agent: create directories for watchable-bind mounts In function `update_target`, if the updated source is a directory, we should create the corresponding directory. Fixes: #3140 Signed-off-by: bin <bin@hyper.sh>	2021-12-02 06:31:03 +08:00
Feng Wang	6105e3ee85	runtime: enable FUSE_DAX kernel config for DAX Otherwise DAX device cannot be set up. Fixes #3165 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2021-12-01 13:38:57 -08:00
GabyCT	45854147d0	Merge pull request #3164 from GabyCT/topic/fixconfigtoml runtime: Update comments for virtcontainers to use kata 2.0	2021-12-01 12:19:26 -06:00
Gabriela Cervantes	591d4af1ea	runtime: Update comments for virtcontainers to use kata 2.0 This PR updates the comments in the configuration.toml to point to the current kata containers repository instead of the kata 1.x. Fixes #3163 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2021-12-01 16:16:46 +00:00
Fupan Li	87f350db53	Merge pull request #3125 from jodh-intel/update-rust-crate-versions Update rust crate versions	2021-12-01 18:00:33 +08:00
James O. D. Hunt	bc7fde2096	Merge pull request #3154 from GabyCT/topic/removedebian osbuilder: Remove debian as a rootfs	2021-12-01 09:29:02 +00:00
Gabriela Cervantes	923e098db6	osbuilder: Remove debian as a rootfs Currently we do not have debian as part of the kata CI as we do not have a maintainer, this PR removes debian as a supported rootfs in order to have only the distros that we are supporting and maintaining. Fixes #3153 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2021-11-30 19:31:33 +00:00
James O. D. Hunt	afb96c0044	agent: Wrap remaining nix errors with anyhow Wrap `nix` `Error`'s in an `anyhow` error for consistency with the way `rustjail` handles errors. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-30 13:26:15 +00:00
James O. D. Hunt	aba572e01d	rustjail: Wrap remaining nix errors with anyhow Replace `Result` values that use a "bare" `nix` `Error` like this: ```rust return Err(nix::Error::EINVAL.into()); ``` ... to the following which wraps the `nix` error in an `anyhow` call for consistency with the other errors returned by `rustjail`: ```rust return Err(anyhow!(nix::Error::EINVAL)); ``` Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-30 13:24:04 +00:00
James O. D. Hunt	30d6007893	uevent: Fix clippy issue in test code Remove a bare `return` from a test function. This looks wrong but isn't because the callers are all tests that just wait for a state change caused by this test function. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-30 12:58:15 +00:00
James O. D. Hunt	4a2be13c60	agent: Upgrade nix version for security fix Running `cargo audit` showed that the `nix` package for the agent and the `rustjail` and `vsock-exporter` local crates need to be updated to resolve rust security issue [RUSTSEC-2021-0119](https://rustsec.org/advisories/RUSTSEC-2021-0119). Hence, bumped `nix` to the latest version (which required changes to work with the new, simpler `errno` handling). Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-30 12:58:15 +00:00
James O. D. Hunt	256d5008dc	agent: Update crate versions Run `cargo update` to update to the latest crate dependency versions. The agent is an application so this includes expanding the partially specified semvers to full semver values for the following crates, which makes those crates consistent with the other agent dependencies: - `futures` - `regex` - `scan_fmt` - `tokio` Fixes: #3124. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-30 12:58:15 +00:00
James O. D. Hunt	13257986ae	agent-ctl: Update rust lockfile Ran `cargo update` to bump crate versions. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-30 12:58:15 +00:00
James O. D. Hunt	4ebdd424de	forwarder: Update rust lockfile Ran `cargo update` to bump crate versions. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-30 12:58:15 +00:00
James O. D. Hunt	6007322daa	agent: Fixed invalid error message Remove the format specifier in the `"failed to get VFIO group"` error returned by `vfio_device_handler()`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-30 12:58:15 +00:00
Fabiano Fidêncio	3e3e3a0253	Merge pull request #3149 from fidencio/2.4.0-alpha0-branch-bump # Kata Containers 2.4.0-alpha0	2021-11-29 20:24:19 +01:00
Fabiano Fidêncio	72b8144b56	release: Kata Containers 2.4.0-alpha0 - osbuilder: fix missing cpio package when building rootfs-initrd image - osbuilder: add coreutils to guest rootfs - workflows: only allow org members to run `/test_kata_deploy` - agent: use temp directory for test containers - tools/osbuilder: build QAT kernel in fedora 34 - agent: refactor find_process function and add test cases - Hypervisor cleanup, refactoring - agent: clear cargo test warnings - docs: Add a code PR advice document - tools: Automatically revert kata-deploy changes - runtime: delete netmon - agent: Remove some unwrap and expect calls - agent: fixed the `make optimize` bug - docs: make kata-deploy more visible - workflows: Add back the checks for running test-kata-deploy - kata-deploy: Ensure we test HEAD with `/test_kata_deploy` - docs: update using-SPDK-vhostuser-and-kata.md - Update k8s SR-IOV plugin environment variables to work properly with Kata - watchers: don't dereference symlinks when copying files - kata-deploy: Add back stable & latest tags - agent: fix the issue of missing create a new session for container - runtime: Update containerd to 1.5.8 - qemu: fix snap build on ppc64le - virtcontainers: fix failing template test on ppc64le - agent: Update README - Remove cruft, do some simple non-functional cleanup in the runtime - macvlan: drop bridged part of name - clh: Fix race condition that prevent start pods - Update CRI-O documentation - cgroups: Fix systemd cgroup support - runtime: merge virtcontainers/pkg/types into virtcontainers/types - workflows: Remove non-used main.yaml - agent/src: improve unit test coverage for src/namespace.rs - doc: update kata metrics documentation - runtime: delete not used codes - versions: bump golang to 1.17.x - release: Use ${GOPATH}/bin/yq for upload-libseccomp-tarball action - agent-ctl: Allow API specification in JSON format - virtcontainers: Lint protection types - agent: check environment variables if empty or invalid - runtime: 
Revert "runtime: use containerd package instead of cri-containerd" - rustjail: Fix created time of container - agent: Remove dynamic tracing APIs - kernel: add VFIO kernel dependencies for ppc64le - logging: Always run crate tests `8ee67aae` osbuilder: fix missing cpio package when building rootfs-initrd image `f59d3ff6` osbuilder: add coreutils to guest rootfs `5e7c1a29` workflows: only allow org members to run `/test_kata_deploy` `857501d8` tools/osbuilder: build QAT kernel in fedora 34 `a32e02a1` agent: use temp directory as root of test containers `f0734f52` docs: Remove extraneous whitespace `aff32756` docs: Add a code PR advice document `d41c375c` docs: Add more advice to the UT advice doc `baf4f76d` docs: More detail on running tests as different users `fcf45b0c` docs: Use more idiomatic rust string check `9fed7d0b` docs: Mention anyhow for error handling in UT doc `318b3f18` docs: No present continuous in UT advice doc `e8bb6b26` docs: Correct repo name usage `c1111a1d` docs: Use leading caps for lang names in UT advice doc `597b239e` docs: Remove TOC in UT advice doc `cf360fad` docs: Move unit test advice doc from tests repo `bc955814` docs: Move doc requirements section higher `6a0b7165` agent: refactor find_process function and add test cases `5ba2f52c` tools: Quote functions arguments in the update repos script `5dbd752f` tools: Remove the check for the VERSION file `85eb743f` tools: Make hub usage slightly less fragile `76540dbd` tools: Automatically revert kata-deploy changes `36d73c96` tools: Do the kata-deploy changes on its own commit `c8e22daf` tools: Use vars for the registry in the update repo script `ac958a30` tools: Use vars for the yaml files used in the update repo script `edca8292` tools: Rewrite the logic around kata-deploy changes `31f6c2c2` tools: Update comments about the kata-deploy yaml changes `75bb3401` shimv2/service: fix defer funtions never run with os.Exit() `bd3217da` agent: Remove redundant returns `adab6434` agent: Remove 
some unwrap and expect calls `351cef7b` agent: Remove unwrap from verify_cid() `a7d1c70c` agent: Improve baremount `09abcd4d` agent-ctl: Remove some unwrap and expect calls `35db75ba` agent-ctl: Remove redundant returns `46e45958` agent-ctl: Simplify main `c7349d0b` agent-ctl: Simplify error handling `ddc68131` runtime: delete netmon `705687dc` docs: Add kata-deploy as part of the install docs `acece849` docs: Use the default notation for "Note" on install README `143fb278` kata-deploy: Use the default notation for "Note" `45d76407` kata-deploy: Don't mention arch specific binaries in the README `0c6c0735` agent: fixed the `make optimize` bug `a7c08aa4` workflows: Add back the checks for running test-kata-deploy `ce0693d6` agent: clear cargo test warnings `ce92cadc` vc: hypervisor: remove setSandbox `2227c46c` vc: hypervisor: use our own logger `4c2883f7` vc: hypervisor: remove dependency on persist API `34f23de5` vc: hypervisor: Remove need to get shared address from sandbox `c28e5a78` acrn: remove dependency on sandbox, persistapi datatypes `a0e0e186` hypervisors: introduce pkg to unbreak vc/persist dependency `b5dfcf26` watcher: tests: ensure there is 20ms delay between fs writes `78dff468` agent/device: Adjust PCIDEVICE_* container environment variables for VM `4530e7df` agent/device: Use simpler structure in update_spec_devices() `b6062278` agent/device: Correct misleading comment on test case `89ff7000` agent/device: Remove unnecessary check for empty container_path `c855a312` agent/device: Make DevIndex local to update_spec_devices() `084538d3` agent/device: Change update_spec_device to handle multiple devices at once `d6a3ebc4` agent/device: Obtain guest major/minor numbers when creating DevNumUpdate `f4982130` agent/device: Check for conflicting device updates `f10e8c81` agent/device: Batch changes to the OCI specification `46a4020e` agent/device: Types to represent update for a device in the OCI spec `e7beed54` agent/device: Remove unneeded clone() from 
several device handlers `2029eeeb` agent/device: Improve update_spec_device() final_path handling `57541315` agent/device: Correct misleading parameter name in update_spec_device() `0c51da3d` agent/device: Correct misleading error message in update_spec_device() `94b7936f` agent/device: Use nix::sys::stat::{major,minor} instead of libc::* `296e76f8` watchers: handle symlinked directories, dir removal `2b6dfe41` watchers: don't dereference symlinks when copying files `3c9ae7fb` kata-deploy: Ensure we test HEAD with `/test_kata_deploy` `0380b9bd` runtime: Update containerd to 1.5.8 `112ea258` qemu: fix snap build by disabling libudev `d5a18173` virtcontainers: fix failing template test on ppc64le `6955d144` kata-deploy: Add back stable & latest tags `bbaf57ad` agent: fix the issue of missing create a new session for container `46fd5069` docs: update using-SPDK-vhostuser-and-kata.md `7e6f2b8d` vc-utils: don't export unused function `860f3088` virtcontainers: move oci, uuid packages top level `8acb3a32` virtcontainers: remove unused package nsenter `4788cb82` vc-network: remove unused functions `b6ebddd7` oci: remove unused function GetContainerType `599bc0c2` agent: Update README `1e7cb4bc` macvlan: drop bridged part of name `55412044` monitor: Fix monitor race condition doing hypervisor.check() `eb11d053` cri-o: Update deployment documentation `92e3a140` cri-o: Update links for the CRI-O github page `0a19340a` cri-o: Remove outdated documentation `a3b3c85e` workflows: Remove non-used main.yaml `09f7962f` runtime: merge virtcontainers/pkg/types into virtcontainers/types `6acedc25` runtime: delete not used codes `395638c4` versions: bump golang to 1.17.x `570915a8` docs: update kata 2.0 metrics documentation `bcf181b7` cgroups: Fix systemd cgroup support `34307235` release: Use ${GOPATH}/bin/yq for upload-libseccomp-tarball action `6339fdd1` docs: update kata metrics architecture image `57bb7ffa` agent: check environment variables if empty or invalid `8ab90e10` 
agent-ctl: Allow API specification in JSON format `eacfcdec` runtime: Revert "runtime: use containerd package instead of cri-containerd" `e7856ff1` rustjail: Fix created time of container `b7b89905` virtcontainers: Lint protection types `7566b736` kernel: add VFIO kernel dependencies for ppc64le `87f67606` agent: Remove dynamic tracing APIs `b09dd7a8` docs: Fix typo `d47484e7` logging: Always run crate tests `5c9c0b6e` build: Fix default target `b34ed403` cgroups: pass vhost-vsock device to cgroup `7362e1e8` runtime: remove prefix when cgroups are managed by systemd `1b1790fd` agent/src: improve unit test coverage for src/namespace.rs Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-29 18:34:45 +01:00
Fabiano Fidêncio	f8aaefc919	Merge pull request #3147 from Bevisy/main-3144 osbuilder: fix missing cpio package when building rootfs-initrd image	2021-11-29 18:27:49 +01:00
Binbin Zhang	8ee67aae4f	osbuilder: fix missing cpio package when building rootfs-initrd image 1. install cpio package before building rootfs-initrd image 2. add `pipefail;errexit` check to the scripts Fixes: #3144 Signed-off-by: Binbin Zhang <binbin36520@gmail.com>	2021-11-29 23:42:44 +08:00
Fabiano Fidêncio	879ec4e0e9	Merge pull request #3139 from bergwolf/coreutils osbuilder: add coreutils to guest rootfs	2021-11-29 10:19:39 +01:00
Fabiano Fidêncio	a6219cb5e0	Merge pull request #3134 from fidencio/wip/only-allow-users-who-are-part-of-the-org-to-run-test-kata-deploy workflows: only allow org members to run `/test_kata_deploy`	2021-11-29 07:55:40 +01:00
Peng Tao	f59d3ff600	osbuilder: add coreutils to guest rootfs So that the debug console is more useful. In the meantime, remove iptables as it is not used by kata-agent any more. Fixes: #3138 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2021-11-29 11:22:07 +08:00
Fabiano Fidêncio	7364cd4983	Merge pull request #3129 from liubin/fix/3122-use-tempdir-for-test-container agent: use temp directory for test containers	2021-11-26 23:11:27 +01:00
Fabiano Fidêncio	5e7c1a290f	workflows: only allow org members to run `/test_kata_deploy` Let's take advantage of the "is-organization-member" action and only allow members who are part of the `kata-containers` organization to trigger `/test_kata_deploy`. One caveat with this approach is that for the user to be considered as part of an organization, they must have their "Organization Visibility" configured as Public (and I think the default is Private). This was found out and suggested by @jcvenegas! Fixes: #3130 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-26 23:02:51 +01:00
Julio Montes	06d28d50ed	Merge pull request #3136 from devimc/2021-11-26/fixQATci tools/osbuilder: build QAT kernel in fedora 34	2021-11-26 15:38:57 -06:00
Julio Montes	857501d8dd	tools/osbuilder: build QAT kernel in fedora 34 A kernel compiled in fedora 35 (latest) does not work; the following error is reported: ``` qemu-system-x86_64: Error loading uncompressed kernel without PVH ELF Note ``` Build the QAT kernel in a fedora 34 container to fix it. fixes #3135 Signed-off-by: Julio Montes <julio.montes@intel.com>	2021-11-26 13:56:43 -06:00
bin	a32e02a1ee	agent: use temp directory as root of test containers Some tests in sandbox.rs need the root user to run because they need to create directories under /run/agent. This is a limitation that shouldn't be there. Using a temp directory for test containers removes the need to run the tests as the root user. Fixes: #3122 Signed-off-by: bin <bin@hyper.sh>	2021-11-26 15:18:38 +08:00
Manabu Sugimoto	7b35615191	agent: Log unknown seccomp system calls Kata agent logs unknown system calls given by seccomp profiles in advance before the log file descriptor closes. Fixes: #2957 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2021-11-26 15:10:04 +09:00
Peng Tao	c3de161168	Merge pull request #3118 from liubin/fix/3117-refactor-find_process agent: refactor find_process function and add test cases	2021-11-26 10:22:48 +08:00
Peng Tao	01b6ffc0a4	Merge pull request #3028 from egernst/hypervisor-hacking Hypervisor cleanup, refactoring	2021-11-26 10:21:49 +08:00
James O. D. Hunt	9412be39ba	Merge pull request #3092 from liubin/fix/3091-fix-test-warnings agent: clear cargo test warnings	2021-11-25 17:22:27 +00:00
James O. D. Hunt	a813378ac5	Merge pull request #3100 from jodh-intel/docs-code-pr-advice docs: Add a code PR advice document	2021-11-25 15:46:13 +00:00
James O. D. Hunt	f0734f52c1	docs: Remove extraneous whitespace Remove trailing whitespace in the unit test advice doc. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-25 14:44:49 +00:00
James O. D. Hunt	aff3275608	docs: Add a code PR advice document Add a document giving advice to code PR authors. Fixes: #3099. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-25 14:44:46 +00:00
James O. D. Hunt	d41c375c4f	docs: Add more advice to the UT advice doc Add information to the unit test advice document on test strategies and the test environment. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-25 14:44:40 +00:00
James O. D. Hunt	baf4f76d97	docs: More detail on running tests as different users Add some more detail to the unit test advice document about running tests as different users. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-25 14:44:40 +00:00
James O. D. Hunt	fcf45b0c92	docs: Use more idiomatic rust string check Rather than comparing a string to a literal in the rust example, use `.is_empty()` as that approach is more idiomatic and preferred. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-25 14:44:40 +00:00
James O. D. Hunt	9fed7d0bde	docs: Mention anyhow for error handling in UT doc Add a comment stating that `anyhow` and `thiserror` should be used in real rust code, rather than the unwieldy default `Result` handling shown in the example. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-25 14:44:40 +00:00
James O. D. Hunt	318b3f187b	docs: No present continuous in UT advice doc Change some headings to avoid using the present continuous tense which should not be used for headings. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-25 14:44:40 +00:00
James O. D. Hunt	e8bb6b2666	docs: Correct repo name usage Change reference from "runtime repo" to "main repo" in unit test advice document. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-25 14:44:40 +00:00
James O. D. Hunt	c1111a1d2d	docs: Use leading caps for lang names in UT advice doc Use a capital letter when referring to Golang and Rust (and remove unnecessary backticks for Rust). > Note: > > We continue to refer to "Go" as "Golang" since it's a common alias, > but, crucially, familiarity with this name makes searching for > information using this term possible: "Go" is too generic a word. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-25 14:44:40 +00:00
James O. D. Hunt	597b239ef3	docs: Remove TOC in UT advice doc Remove the table of contents in the Unit Test Advice document since GitHub auto-generates these now. See: https://github.com/kata-containers/kata-containers/pull/2023 Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-25 14:44:40 +00:00
James O. D. Hunt	cf360fad92	docs: Move unit test advice doc from tests repo Unit tests necessarily need to be maintained with the code they test so it makes sense to keep the Unit Test Advice document in the main repo since that is where the majority of unit tests reside. Note: The [`Unit-Test-Advice.md` file](https://github.com/kata-containers/tests/blob/main/Unit-Test-Advice.md) was copied from the `tests` repo when its `HEAD` was `38855f1f40`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-25 14:44:40 +00:00
James O. D. Hunt	bc9558149c	docs: Move doc requirements section higher Move the documentation requirements document link up so that it appears immediately below the "How to Contribute" section. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-25 14:44:40 +00:00
Fabiano Fidêncio	abf39ddef0	Merge pull request #3089 from fidencio/wip/kata-deploy-remove-files-and-revert-removal-as-part-of-the-release-scripts tools: Automatically revert kata-deploy changes	2021-11-25 15:23:52 +01:00
Chelsea Mafrica	ed7eb26bff	Merge pull request #3113 from liubin/fix/3112-delete-netmon runtime: delete netmon	2021-11-24 17:58:13 -08:00
bin	6a0b7165ba	agent: refactor find_process function and add test cases Delete redundant parameter init in find_process function and add test case for it. Fixes: #3117 Signed-off-by: bin <bin@hyper.sh>	2021-11-25 09:47:25 +08:00
Fupan Li	2938f60abb	Merge pull request #3012 from jodh-intel/agent-rm-unwraps agent: Remove some unwrap and expect calls	2021-11-25 09:37:39 +08:00
Fabiano Fidêncio	5ba2f52c73	tools: Quote functions arguments in the update repos script Although this is not strictly needed, better be safe than sorry on those cases. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-24 22:09:58 +01:00
Fabiano Fidêncio	5dbd752f8f	tools: Remove the check for the VERSION file All repos we release (https://github.com/kata-containers/kata-containers and https://github.com/kata-containers/tests) have a VERSION file. Keeping a check for it, although useful for a new repo, just complicates the use-case we currently deal with. While here, let's also anchor the '#' and potentially exclude blank lines, following James' suggestion. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-24 22:09:49 +01:00
Fabiano Fidêncio	85eb743f46	tools: Make hub usage slightly less fragile `grep`ing by a specific output, in a specific language, is quite fragile and could easily break `hub`. For now, let's work this around following James' suggestion of setting `LC_ALL=C LANG=C` when calling `hub`. > Note: I don't think we should invest much time on fixing `hub` > usage, as it'll be soon replaced by `gh`, see: > https://github.com/kata-containers/kata-containers/issues/3083 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-24 22:09:30 +01:00
Fabiano Fidêncio	76540dbdd1	tools: Automatically revert kata-deploy changes When branching the "stable-x.y" branch, we need to do some quite specific changes to kata-deploy / kata-cleanup files, such as: * changing the tags from "latest" to "stable-x.y". * removing the kata-deploy / kata-cleanup stable files. However, after the branching is done, we need to get the `main` repo to its original state, with the kata-deploy / kata-cleanup using the "latest" tag, and with the stable files present there, and this commit ensures that, during the release process, a new PR is automatically created with these changes. Fixes: #3069 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-24 22:07:53 +01:00
Fabiano Fidêncio	36d73c96c8	tools: Do the kata-deploy changes on its own commit Rather than doing the kata-deploy changes as part of the release bump commit, let's split those on its own changes, as it will both make the life of the reviewer less confusing and also allows us to start preparing the field for a possible automated revert of these changes, whenever it becomes needed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-24 22:07:52 +01:00
Fabiano Fidêncio	c8e22daf67	tools: Use vars for the registry in the update repo script Similarly to what was done for the yaml files, let's use a var for representing the registry where our images will be pushed to and avoid repetition and too long lines. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-24 22:07:03 +01:00
Fabiano Fidêncio	ac958a3073	tools: Use vars for the yaml files used in the update repo script Instead of always writing the full path of some files, let's just create some vars and avoid both repetition (which is quite error prone) and too long lines (which makes the file not so easy to read). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-24 22:06:45 +01:00
Fabiano Fidêncio	edca829242	tools: Rewrite the logic around kata-deploy changes We can simplify the code a little bit, as at least now we group common operations together. Hopefully this will improve the maintainability and the readability of the code. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-24 22:05:35 +01:00
Fabiano Fidêncio	31f6c2c2ea	tools: Update comments about the kata-deploy yaml changes The comments were mentioning kata-deploy-base files while it really should mention kata-deploy-stable files. While here, I've also added a missing '"' to one of the tags. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-24 21:17:40 +01:00
Binbin Zhang	75bb340137	shimv2/service: fix defer functions never run with os.Exit() os.Exit() will terminate program immediately, the defer functions won't be executed, so we add defer functions again before os.Exit(). Refer to https://pkg.go.dev/os#Exit Fixes: #3059 Signed-off-by: Binbin Zhang <binbin36520@gmail.com>	2021-11-24 15:59:59 +01:00
James O. D. Hunt	bd3217daeb	agent: Remove redundant returns Remove an unnecessary `return` statement identified by clippy. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-24 11:43:49 +00:00
James O. D. Hunt	adab64349c	agent: Remove some unwrap and expect calls Replace some `unwrap()` and `expect()` calls with code to return the error to the caller. Fixes: #3011. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-24 11:43:49 +00:00
James O. D. Hunt	351cef7b6a	agent: Remove unwrap from verify_cid() Improved the `verify_cid()` function that validates container IDs by removing the need for an `unwrap()`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-24 11:43:49 +00:00
James O. D. Hunt	a7d1c70c4b	agent: Improve baremount Change `baremount()` to accept `Path` values rather than string values since: - `Path` is more natural given the function deals with paths. - This minimises the caller having to convert between string and `Path` types, which simplifies the surrounding code. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-24 11:43:49 +00:00
James O. D. Hunt	09abcd4dc6	agent-ctl: Remove some unwrap and expect calls Replace some `unwrap()` and `expect()` calls with code to return the error to the caller. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-24 11:43:49 +00:00
James O. D. Hunt	35db75baa1	agent-ctl: Remove redundant returns Remove a number of redundant `return`'s. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-24 11:43:49 +00:00
James O. D. Hunt	46e459584d	agent-ctl: Simplify main Make the `main()` function simpler. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-24 11:43:49 +00:00
James O. D. Hunt	c7349d0bf1	agent-ctl: Simplify error handling Replace `ok_or().map_err()` combinations with the simpler `ok_or_else()` construct. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-24 11:43:49 +00:00
bin	ddc68131df	runtime: delete netmon Netmon is not used anymore. Fixes: #3112 Signed-off-by: bin <bin@hyper.sh>	2021-11-24 15:08:18 +08:00
Carlos Venegas	ac058b3897	Merge pull request #3105 from YchauWang/wyc-agent-make-02 agent: fixed the `make optimize` bug	2021-11-23 13:17:05 -06:00
Fabiano Fidêncio	181f876fdb	Merge pull request #3098 from fidencio/wip/move_kata-deploy-install-instruction_to_docs docs: make kata-deploy more visible	2021-11-23 18:32:42 +01:00
João Vanzuita	705687dc42	docs: Add kata-deploy as part of the install docs This PR links the kata-deploy installation instructions to the docs/install folder. Fixes: #2450 Signed-off-by: João Vanzuita <joao.vanzuita@de.bosch.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-23 13:57:22 +01:00
Fabiano Fidêncio	acece84906	docs: Use the default notation for "Note" on install README Let's use the default GitHub notation for notes in documentation, as described here: https://github.com/kata-containers/kata-containers/blob/main/docs/Documentation-Requir Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Suggested-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-23 13:27:35 +01:00
Fabiano Fidêncio	143fb27802	kata-deploy: Use the default notation for "Note" Let's use the default GitHub notation for notes in documentation, as described here: https://github.com/kata-containers/kata-containers/blob/main/docs/Documentation-Requirements.md#notes Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Suggested-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-23 13:24:42 +01:00
Fabiano Fidêncio	45d76407aa	kata-deploy: Don't mention arch specific binaries in the README Although the binary name of the shipped binary is `qemu-system-x86_64`, and we only ship kata-deploy for `x86_64`, we'd better leave the architecture-specific name out of our README file. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-23 13:21:37 +01:00
wangyongchao.bj	0c6c0735ec	agent: fixed the `make optimize` bug The unrecognized option 'deny-warnings' caused `make optimize` to fail. Fix the Makefile of the agent project so that the `make optimize` command executes correctly. This PR changes the rustc args from '--deny-warnings' to '--deny warnings'. Fixes: #3104 Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>	2021-11-23 09:44:05 +08:00
Fabiano Fidêncio	0ae77e1232	Merge pull request #3102 from fidencio/wip/add-back-wrongly-removed-check-for-test-kata-deploy workflows: Add back the checks for running test-kata-deploy	2021-11-22 22:36:03 +01:00
Fabiano Fidêncio	a7c08aa4b6	workflows: Add back the checks for running test-kata-deploy Commit `3c9ae7f` made /test_kata_deploy run against HEAD, but it also mistakenly removed all the checks that ensure /test_kata_deploy only runs when explicitly called. Mea culpa on this, and let's add the tests back. Fixes: #3101 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-22 18:33:10 +01:00
Carlos Venegas	3be15aed1c	Merge pull request #3071 from fidencio/wip/test-kata-deploy-should-use-the-latest-builds kata-deploy: Ensure we test HEAD with `/test_kata_deploy`	2021-11-22 10:48:35 -06:00
bin	ce0693d6dc	agent: clear cargo test warnings Function parameters in the test config are not used. This commit adds an underscore before the variable names in the test config. Fixes: #3091 Signed-off-by: bin <bin@hyper.sh>	2021-11-22 20:45:46 +08:00
Tim Zhang	cad279b37d	Merge pull request #3055 from liubin/fix/3054-update-spdk-doc docs: update using-SPDK-vhostuser-and-kata.md	2021-11-22 15:47:02 +08:00
Binbin Zhang	7304e52a59	Makefile: update `make go-test` call 1. use ci/go-test.sh to replace the direct call to go test 2. fix data race test 3. install hook whether it is root or not Fixes #1494 Signed-off-by: Binbin Zhang <binbin36520@gmail.com>	2021-11-22 13:59:22 +08:00
David Gibson	1b28d7180f	Merge pull request #2927 from dgibson/vfio-env-mangling Update k8s SR-IOV plugin environment variables to work properly with Kata	2021-11-22 13:44:19 +11:00
Eric Ernst	a0919b0865	Merge pull request #2998 from egernst/fix-symlinks watchers: don't dereference symlinks when copying files	2021-11-19 12:43:22 -08:00
Eric Ernst	ce92cadc7d	vc: hypervisor: remove setSandbox The hypervisor interface implementation should not know a thing about sandboxes. Fixes: #2882 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-19 12:20:41 -08:00
Eric Ernst	2227c46c25	vc: hypervisor: use our own logger This'll end up moving to hypervisors pkg, but let's stop using virtLog, instead introduce hvLogger. Fixes: #2884 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-19 12:20:41 -08:00
Eric Ernst	4c2883f7e2	vc: hypervisor: remove dependency on persist API Today the hypervisor code in vc relies on persist pkg for two things: 1. To get the VM/run store path on the host filesystem, 2. For type definition of the Load/Save functions of the hypervisor interface. For (1), we can simply remove the store interface from the hypervisor config and replace it with just the path, since this is all we really need. When we create a NewHypervisor structure, outside of the hypervisor, we can populate this path. For (2), rather than have the persist pkg define the structure, let's let the hypervisor code (soon to be pkg) define the structure. persist API already needs to call into hypervisor anyway; let's allow us to define the structure. We'll probably want to look at following similar pattern for other parts of vc that we want to make independent of the persist API. In doing this, we started an initial hypervisors pkg, to hold these types (avoid a circular dependency between virtcontainers and persist pkg). Next step will be to remove all other dependencies and move the hypervisor specific code into this pkg, and out of virtcontainers. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-19 12:20:41 -08:00
Eric Ernst	34f23de512	vc: hypervisor: Remove need to get shared address from sandbox Add shared path as part of the hypervisor config Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-19 12:20:41 -08:00
Eric Ernst	c28e5a7807	acrn: remove dependency on sandbox, persistapi datatypes Today, acrn relies on sandbox level information, as well as a store provided by common parts of the hypervisor. As we cleanup the abstractions within our runtime, we need to ensure that there aren't cross dependencies between the sandbox, the persistence logic and the hypervisor. Ensure that ACRN still compiles, but remove the setSandbox usage as well as persist driver setup. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-19 12:20:41 -08:00
Eric Ernst	a0e0e18639	hypervisors: introduce pkg to unbreak vc/persist dependency Initial hypervisors pkg, with just basic state types defined. Fixes: #2883 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-19 12:20:41 -08:00
Eric Ernst	b5dfcf2653	watcher: tests: ensure there is 20ms delay between fs writes We noticed s390x test failures on several of the watcher unit tests. Discovered that on s390 in particular, if we update a file in quick succession, the timestamp on the file would not be unique between the writes. Through testing, we observe that a 20 millisecond delay is very reliable for being able to observe the timestamp update. Let's ensure we have this delay between writes for our tests so our tests are more reliable. In "the real world" we'll be polling for changes every 2 seconds, and frequency of filesystem updates will be on order of minutes and days, rather than microseconds. Fixes: #2946 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-19 11:33:36 -08:00
Fabiano Fidêncio	d08bcde7aa	Merge pull request #3068 from fidencio/wip/kata-deploy-re-add-latest-and-stable-tags kata-deploy: Add back stable & latest tags	2021-11-19 15:58:55 +01:00
David Gibson	78dff468bf	agent/device: Adjust PCIDEVICE_* container environment variables for VM The k8s SR-IOV plugin, when it assigns a VFIO device to a container, adds a variable of the form PCIDEVICE_<identifier> to the container's environment, so that the payload knows which device is which. The contents of the variable gives the PCI address of the device to use. Kata allows VFIO devices to be passed in to a Kata container, however it runs within a VM which has a different PCI topology. In order for the payload to find the right device, the environment variables therefore need to be converted to list the guest PCI addresses instead of the host PCI addresses. fixes #2897 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 17:44:05 +11:00
David Gibson	4530e7df29	agent/device: Use simpler structure in update_spec_devices() update_spec_devices() takes a bunch of updates for the device entries in the OCI spec and applies them, adjusting things in both the linux.devices and linux.resources.devices sections of the spec. It's important that each entry in the spec only be updated once. Currently we ensure this by first creating an index of where the entries are, then consulting that as we apply each update, so that earlier updates don't cause us to incorrectly detect an entry as being relevant to a later update. This method works, but it's quite awkward. This inverts the loop structure in update_spec_devices() to make this clearer. Instead of stepping through each update and finding the relevant entries in the spec to change, we step through each entry in the spec and find the relevant update. This makes it structurally clear that we're only updating each entry once. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 17:21:11 +11:00
Tim Zhang	653b461dc2	Merge pull request #3064 from lifupan/main agent: fix the issue of missing create a new session for container	2021-11-19 11:28:54 +08:00
David Gibson	b60622786d	agent/device: Correct misleading comment on test case We have a test case commented as testing the case where linux.devices is empty in the OCI spec. While it's true that linux.devices is empty in this example, the reason it fails isn't specifically because it's empty but because it doesn't contain a device for the update we're trying to apply. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 14:25:04 +11:00
David Gibson	89ff700038	agent/device: Remove unnecessary check for empty container_path update_spec_devices() explicitly checks for being called with an empty container path and fails. We have a unit test to verify this behaviour. But while an empty container_path probably does mean something has gone wrong elsewhere, that's also true of any number of other bad paths. Having an empty string here doesn't prevent what we're doing in this function making sense - we can compare it to the strings in the OCI spec perfectly well (though more likely we simply won't find it there). So, there's no real reason to check this one particular odd case. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 14:25:03 +11:00
David Gibson	c855a312f0	agent/device: Make DevIndex local to update_spec_devices() The DevIndex data structure keeps track of devices in the OCI specification. We used to carry it around to quite a lot of functions, but it's now used only within update_spec_devices(). That means we can simplify things a bit by just open coding the maps we need, rather than declaring a special type. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 14:24:47 +11:00
David Gibson	084538d334	agent/device: Change update_spec_device to handle multiple devices at once update_spec_device() adjusts the OCI spec for device differences between the host and guest. It is called repeatedly for each device we need to alter. These calls are now all in a single loop in add_devices(), so it makes more sense to move the loop into a renamed update_spec_devices() and process all the fixups in one call. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 14:23:58 +11:00
David Gibson	d6a3ebc496	agent/device: Obtain guest major/minor numbers when creating DevNumUpdate Currently the DevNumUpdate structure is created with a path to a device node in the VM, which is then used by update_spec_device(). However the only piece of information that update_spec_device() actually needs is the VM side major and minor numbers for the device. We can determine those when we create the DevNumUpdate structure. This means we detect errors earlier and as a bonus we don't need to make a copy of the vm path string. Since that change requires updating 2 of the log statements, we take the opportunity to update all the log statements to structured style. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 14:23:36 +11:00
David Gibson	f4982130e1	agent/device: Check for conflicting device updates For each device in the OCI spec we need to update it to reflect the guest rather than the host. We do this with additional device information provided by the runtime. There should only be one update for each device though, if there are multiple, something has gone horribly wrong. Detect and report this situation, for safety. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 14:23:34 +11:00
David Gibson	f10e8c8165	agent/device: Batch changes to the OCI specification As we process container devices in the agent, we repeatedly call update_spec_device() to adjust the OCI spec as necessary for differences between the host and the VM. This means that for the whole of a pretty complex call graph, the spec is in a partially-updated state - neither fully as it was on the host, nor fully as it will be for the container within the VM. Worse, it's not discernible from the contents itself which parts of the spec have already been updated and which have not. We used to have real bugs because of this, until the DevIndex structure was introduced, but that means a whole, fairly complex, parallel data structure needs to be passed around this call graph just to keep track of the state we're in. Start simplifying this by having the device handler functions not directly update the spec, but instead return an update structure describing the change they need. Once all the devices are added, add_devices() will process all the updates as a batch. Note that collecting the updates in a HashMap, rather than a simple Vec doesn't make a lot of sense in the current code, but will reduce churn in future changes which make use of it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 14:23:15 +11:00
David Gibson	46a4020e9e	agent/device: Types to represent update for a device in the OCI spec Currently update_spec_device() takes parameters 'vm_path' and 'final_path' to give it the information it needs to update a single device in the OCI spec for the guest. This bundles these parameters into a single structure type describing the updates to a single device. This doesn't accomplish much immediately, but will allow a number of further cleanups. At the same time we change the representation of vm_path from a Unicode string to a std::path::Path, which is a bit more natural since we are performing file operations on it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 12:27:52 +11:00
David Gibson	e7beed5430	agent/device: Remove unneeded clone() from several device handlers virtio_blk_device_handler(), virtio_blk_ccw_device_handler() and virtio_scsi_device_handler() all take a clone of their 'device' parameter. They appear to do this in order to get a mutable copy in which they can update the vm_path field. However, the copy is dropped at the end of the function, so the only thing that's used in it is the vm_path field passed to update_spec_device() afterwards. We can avoid the clone by just using a local variable for the vm_path. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 12:27:52 +11:00
David Gibson	2029eeebca	agent/device: Improve update_spec_device() final_path handling update_spec_device() takes a 'final_path' parameter which gives the name the device should be given in the "inner" OCI spec. We need this for VFIO devices where the name the payload sees needs to match the VM's IOMMU groups. However, in all other cases (for now, and maybe forever), this is the same as the original 'container_path' given in the input OCI spec. To make this clearer and simplify callers, make this parameter an Option, and only update the device name if it is non-None. Additionally, update_spec_device() needs to call to_string() on update_path to get an owned version. Rust convention[0] is to let the caller decide whether it should copy, or just give an existing owned version to the function. Change from &str to String to allow that; it doesn't buy us anything right now, but will make some things a little nicer in future. [0] https://rust-lang.github.io/api-guidelines/flexibility.html?highlight=clone#caller-decides-where-to-copy-and-place-data-c-caller-control Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 12:27:52 +11:00
David Gibson	57541315db	agent/device: Correct misleading parameter name in update_spec_device() update_spec_device() takes a 'host_path' parameter which it uses to locate the device to correct in the OCI spec. Although this will usually be the path of the device on the host, it doesn't have to be - a traditional runtime like runc would create a device node of that name in the container with the given (host) major and minor numbers. To clarify that, rename it to 'container_path'. We also update the block comment to explain the distinctions more carefully. Finally we update some variable names in tests to match. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 12:27:52 +11:00
David Gibson	0c51da3dd0	agent/device: Correct misleading error message in update_spec_device() This error is returned if we have information for a device from the runtime, but a matching device does not appear in the OCI spec. However, the name for the device we print is the name from the VM, rather than the name from the container which is what we actually expect in the spec. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 12:27:52 +11:00
David Gibson	94b7936f51	agent/device: Use nix::sys::stat::{major,minor} instead of libc::* update_spec_devices() includes an unsafe block, in order to call the libc functions to get the major and minor numbers from a device ID. However, the nix crate already has a safe wrapper for this function, which we use in other places in the file. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-19 12:27:52 +11:00
Eric Ernst	296e76f8ee	watchers: handle symlinked directories, dir removal - Even a directory could be a symlink - check for this. This is very common when using configmaps/secrets - Add unit test to better mimic a configmap, configmap update - We would never remove directories before. Let's ensure that these are added to the watched_list, and verify in unit tests - Update unit tests which exercise maximum number of files per entry. There's a change in behavior now that we consider directories/symlinks watchable as well. For these tests, it means we support one less file in a watchable mount. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-18 16:23:45 -08:00
Eric Ernst	2b6dfe414a	watchers: don't dereference symlinks when copying files The current implementation just copies the file, dereferencing any symlinks in the process. This results in symlinks not being preserved, and a change in layout relative to the mount that we are making watchable. What we want is something like "cp -d". This isn't available in a crate, so let's go ahead and introduce a copy function which will create a symlink with same relative path if the source file is a symlink. Regular files are handled with the standard fs::copy. Introduce a unit test to verify symlinks are now handled appropriately. Fixes: #2950 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-18 16:23:45 -08:00
Fabiano Fidêncio	3c9ae7fb4b	kata-deploy: Ensure we test HEAD with `/test_kata_deploy` In the past few releases we ended up hitting issues that could be easily avoided if `/test_kata_deploy` would use HEAD instead of a specific tarball. By the end of the day, we want to ensure kata-deploy works, but before we cut a release we also want to ensure that the binaries used in that release are in a good shape. If we don't do that we end up either having to roll a release back, or to cut a second release in a really short time (and that's time consuming). Note: there's code duplication here that could and should be avoided, but I sincerely would prefer treating it in a different PR. Fixes: #3001 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-18 23:38:55 +01:00
Greg Kurz	c01189d4a6	Merge pull request #3075 from c3d/bugs/3074-containerd-update runtime: Update containerd to 1.5.8	2021-11-18 22:42:05 +01:00
Christophe de Dinechin	0380b9bda7	runtime: Update containerd to 1.5.8 Release 1.5.8 of containerd contains fixes for two low-severity advisories: [GHSA-5j5w-g665-5m35](https://github.com/opencontainers/distribution-spec/security/advisories/GHSA-mc8v-mgrf-8f4m) [GHSA-77vh-xpmg-72qh](https://github.com/opencontainers/image-spec/security/advisories/GHSA-77vh-xpmg-72qh) Fixes: #3074 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2021-11-18 18:38:27 +01:00
Greg Kurz	bdde8beb52	Merge pull request #3003 from Amulyam24/snap_ppc qemu: fix snap build on ppc64le	2021-11-18 17:46:23 +01:00
Greg Kurz	f80ca66300	Merge pull request #2921 from Amulyam24/template_test virtcontainers: fix failing template test on ppc64le	2021-11-18 17:32:18 +01:00
Julio Montes	d432e21d6f	Merge pull request #206 from liubin/fix/205-fix-wait-parameter-for-client-socket qemu: only set wait parameter for server mode socket based char device	2021-11-18 09:56:43 -06:00
Amulyam24	112ea25859	qemu: fix snap build by disabling libudev While building snap, static qemu is considered. Disable libudev as it doesn't have static libraries on most of the distros of all archs. Fixes: #3002 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2021-11-18 18:50:19 +05:30
Amulyam24	d5a18173b9	virtcontainers: fix failing template test on ppc64le If a file/directory doesn't exist, os.Stat() returns an error. Assert the returned value with os.IsNotExist() to prevent it from failing. Fixes: #2920 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2021-11-18 15:37:40 +05:30
Fabiano Fidêncio	6955d1442f	kata-deploy: Add back stable & latest tags stable-2.3 was the first time we branched the repo since `43a72d76e2` was merged. One bit that I didn't notice while working on this, regardless of being warned by @amshinde (sorry!), was that the change would happen on `main` branch, rather than on the branched `stable-2.3` one. In my mind, the workflow was: * we branch. * we do the changes, including removing the files. * we tag a release. However, the workflow actually is: * we do the changes, including removing the files. * we branch. * we tag a release. A better way to deal with this has to be figured out before 2.4.0 is out, but for now let's just re-add the files back. Fixes: #3067 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-18 09:41:54 +01:00
James O. D. Hunt	7269352fd4	Merge pull request #3057 from jodh-intel/docs-update-agent-readme agent: Update README	2021-11-18 08:02:10 +00:00
bin liu	f971801b10	qemu: only set wait parameter for server mode socket based char device Now the `wait` is passed to qmp command, even at non-server mode. This will cause qemu return this error: 'wait' option is incompatible with socket in client connect mode Fixes: #205 Signed-off-by: bin liu <liubin0329@gmail.com>	2021-11-18 15:52:22 +08:00
Fupan Li	bbaf57adb0	agent: fix the issue of missing create a new session for container When the container didn't have a tty console, it would be in the same process group as the kata-agent, which wasn't expected. Thus, create a new session for the container process. Fixes: #3063 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2021-11-18 14:12:51 +08:00
bin	46fd5069c9	docs: update using-SPDK-vhostuser-and-kata.md Use `ctr` instead of `Docker`. Fixes: #3054 Signed-off-by: bin <bin@hyper.sh>	2021-11-18 09:41:12 +08:00
Eric Ernst	076dbe6cea	Merge pull request #2973 from egernst/remove-cruft Remove cruft, do some simple non-functional cleanup in the runtime	2021-11-17 15:26:12 -08:00
Eric Ernst	7e6f2b8d64	vc-utils: don't export unused function Many of these functions are just used in one place throughout the rest of the code base. If we create hypervisor package, network package, etc, we may want to parse this out. Fixes: #3049 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-17 14:12:57 -08:00
Eric Ernst	860f30882a	virtcontainers: move oci, uuid packages top level This will be useful at runtime level; no need for oci or uuid to be subpkg of virtcontainers. While at it, ensure we run gofmt on the changed files. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-17 14:12:57 -08:00
Eric Ernst	8acb3a32b6	virtcontainers: remove unused package nsenter Package is not utilized. Remove. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-17 14:12:57 -08:00
Eric Ernst	4788cb8263	vc-network: remove unused functions Unused functions -- let's clean up! Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-17 14:12:57 -08:00
Eric Ernst	b6ebddd7ef	oci: remove unused function GetContainerType This is unused - we utilize ContainerType directly. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-17 14:12:57 -08:00
James O. D. Hunt	599bc0c2a9	agent: Update README Update the agent README by removing the historical details about the conversion from golang to rust (which occurred at the start of Kata 2.x development) and replacing it with information that developers and testers should find more useful. Fixes: #3056. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-17 17:57:45 +00:00
Fabiano Fidêncio	e34893a0c4	Merge pull request #3051 from egernst/macvlan-rename macvlan: drop bridged part of name	2021-11-17 10:21:07 +01:00
Eric Ernst	1e7cb4bc3a	macvlan: drop bridged part of name The fact that we need to "bridge" the endpoint is a bit irrelevant. To be consistent with the rest of the endpoints, let's just call this "macvlan" Fixes: #3050 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-16 16:44:29 -08:00
Carlos Venegas	15b5d22e81	Merge pull request #2778 from jcvenegas/clh-race-condition-check clh: Fix race condition that prevent start pods	2021-11-16 14:15:06 -06:00
Carlos Venegas	55412044df	monitor: Fix monitor race condition doing hypervisor.check() The monitor thread checks every second, in a blocking thread, whether the agent and the VMM are alive. The Cloud Hypervisor API server is single-threaded: if the monitor does a `check()` while a slow request is still in progress, the monitor's `check()` call will time out, and the monitor thread will then stop all the shim-v2 execution. This commit modifies the monitor thread to make it check the status of the hypervisor after 5 seconds. Additionally, the `check()` method from cloud-hypervisor will use the method `clh.isClhRunning(timeout)` with a 10-second timeout. The monitor function itself has no timeout, so even if `hypervisor.check()` takes more than 10 seconds, the isClhRunning method handles errors from VmmPing and retries until the timeout is reached. Setting the time to the next check to 5 seconds should not affect any functionality, but it will reduce the overhead of polling the hypervisor. Fixes: #2777 Signed-off-by: Carlos Venegas <jose.carlos.venegas.munoz@intel.com>	2021-11-16 18:28:29 +00:00
James O. D. Hunt	480343671b	Merge pull request #3046 from fidencio/wip/update-crio-documentation Update CRI-O documentation	2021-11-16 08:33:29 +00:00
Fabiano Fidêncio	eb11d053d5	cri-o: Update deployment documentation CRI-O deployment documentation was quite outdated, giving info from the `1.x` era. Let's update this to reflect what we currently have. Fixes: #2498 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-15 18:30:40 +01:00
Fabiano Fidêncio	92e3a14023	cri-o: Update links for the CRI-O github page The links are either pointing to the not-used-anymore `master` branch, or to the kubernetes-incubator page. Let's always point to the CRI-O github page, using the `main` branch. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-15 11:39:09 +01:00
Fabiano Fidêncio	0a19340a93	cri-o: Remove outdated documentation Although the documentation removed is correct, it's not relevant to the current supported versions of CRI-O. Related: #2498 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-15 11:39:08 +01:00
snir911	b046c1ef6b	Merge pull request #2959 from snir911/wip/cgroups-systemd-fix cgroups: Fix systemd cgroup support	2021-11-15 10:44:45 +02:00
Eric Ernst	e89c06e68b	Merge pull request #3032 from liubin/fix/3031-merge-two-types-packages runtime: merge virtcontainers/pkg/types into virtcontainers/types	2021-11-12 14:23:21 -08:00
Chelsea Mafrica	b585264555	Merge pull request #3034 from fidencio/wip/remove-non-used-actions workflows: Remove non-used main.yaml	2021-11-12 11:25:47 -08:00
Chelsea Mafrica	d38135c93b	Merge pull request #2570 from YchauWang/wyc-agent-test agent/src: improve unit test coverage for src/namespace.rs	2021-11-12 11:24:13 -08:00
Fabiano Fidêncio	a3b3c85ec3	workflows: Remove non-used main.yaml The main.yaml workflow was created and used only on 1.x. We inherited it, but we didn't remove it after deprecating the 1.x repos. While here, let's also update the reference to the `main.yaml` file, and point to `release.yaml` (the file that's actually used for 2.x). Fixes: #3033 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-12 18:17:11 +01:00
Chelsea Mafrica	6b48d3754a	Merge pull request #3013 from fgiudici/kata_metrics_doc doc: update kata metrics documentation	2021-11-12 09:11:36 -08:00
Chelsea Mafrica	c8f2ef9488	Merge pull request #3030 from liubin/fix/3029-delete-codes runtime: delete not used codes	2021-11-12 08:53:20 -08:00
bin	09f7962ff1	runtime: merge virtcontainers/pkg/types into virtcontainers/types There are two types packages under virtcontainers, and virtcontainers/pkg/types contains only a little code; merging them into one makes the types package easier to understand and use. Fixes: #3031 Signed-off-by: bin <bin@hyper.sh>	2021-11-12 15:06:39 +08:00
bin	6acedc2531	runtime: delete not used codes Functions EnvVars and GetOCIConfig in runtime/virtcontainers/pkg/oci/utils.go are not used anymore. Fixes: #3029 Signed-off-by: bin <bin@hyper.sh>	2021-11-12 11:35:31 +08:00
Fabiano Fidêncio	c0aea3f662	Merge pull request #3017 from fidencio/wip/bump-golang versions: bump golang to 1.17.x	2021-11-11 16:57:50 +01:00
Fabiano Fidêncio	7c947357ad	Merge pull request #3015 from ManaSugi/fix-yq-path release: Use ${GOPATH}/bin/yq for upload-libseccomp-tarball action	2021-11-11 10:48:42 +01:00
Fabiano Fidêncio	395638c4bc	versions: bump golang to 1.17.x According to https://endoflife.date/go golang 1.15 is not supported anymore. Let's remove it from our tests, add 1.17.x, and bump the newest version known to work when building kata to 1.17.3. Fixes: #3016 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2021-11-11 10:43:18 +01:00
Bin Liu	bf24eb6b33	Merge pull request #2979 from jodh-intel/agent-ctl-json-api-spec agent-ctl: Allow API specification in JSON format	2021-11-11 16:45:30 +08:00
Francesco Giudici	570915a8c3	docs: update kata 2.0 metrics documentation We now support any CRI-compliant container engine in kata-monitor. Update documentation to reflect it. Fixes: #980 Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2021-11-11 09:33:01 +01:00
Snir Sheriber	bcf181b7ee	cgroups: Fix systemd cgroup support As github.com/containerd/cgroups doesn't support scope units, which are essential in some cases, let's create the cgroups manually and load them through the cgroups API. This is currently done only when there's a single sandbox cgroup (sandbox_cgroup_only=true); otherwise we set it as a static cgroup path as it used to be (until a proper solution for the overhead cgroup under systemd is suggested). Fixes: #2868 Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2021-11-11 08:51:45 +02:00
Manabu Sugimoto	3430723594	release: Use ${GOPATH}/bin/yq for upload-libseccomp-tarball action We need to explicitly call `${GOPATH}/bin/yq` that is installed by `ci/install_yq.sh`. Fixes: #3014 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2021-11-11 13:42:12 +09:00
Bin Liu	04185bd068	Merge pull request #2997 from Jakob-Naucke/lint-protection virtcontainers: Lint protection types	2021-11-11 08:34:48 +08:00
Fabiano Fidêncio	05cf7cdddb	Merge pull request #3007 from liubin/fix/3006-check-env-key-value agent: check environment variables if empty or invalid	2021-11-10 19:19:47 +01:00
Francesco Giudici	6339fdd1f6	docs: update kata metrics architecture image We now support any CRI container engine in kata-monitor, notably CRI-O. Add both containerd and CRI-O in the kata metrics architecture image. Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2021-11-10 18:58:15 +01:00
bin	57bb7ffae3	agent: check environment variables if empty or invalid Invalid environment variable key/value will cause set_env panic. Refer: https://doc.rust-lang.org/std/env/fn.set_var.html#panics Fixes: #3006 Signed-off-by: bin <bin@hyper.sh>	2021-11-10 20:54:21 +08:00
Fabiano Fidêncio	653976c0fd	Merge pull request #3000 from bergwolf/crioptions runtime: Revert "runtime: use containerd package instead of cri-containerd"	2021-11-10 13:41:24 +01:00
Tim Zhang	fbf3bb55c0	Merge pull request #2995 from Tim-Zhang/fix-container-created-time rustjail: Fix created time of container	2021-11-10 19:44:04 +08:00
James O. D. Hunt	8ab90e1068	agent-ctl: Allow API specification in JSON format Update the `agent-ctl` tool to allow API fields to be specified in JSON format, either directly on the command-line, or via a file URI. This feature is made possible by enabling `serde` support in the agent `protocols` crate. Careful use of the `serde` macros allows the `agent-ctl` tool to accept _partially_ specified API objects in JSON format; fields that are not specified are set to the default value for their respective types. `build.rs` changes based on work by Fupan. Fixes: #2978. Contributions-by: Fupan Li <lifupan@gmail.com> Contributions-by: Bin Liu <bin@hyper.sh> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-10 10:16:04 +00:00
James O. D. Hunt	18c47fe8f3	Merge pull request #2986 from jodh-intel/rm-dynamic-tracing-api agent: Remove dynamic tracing APIs	2021-11-10 10:10:14 +00:00
Peng Tao	eacfcdec19	runtime: Revert "runtime: use containerd package instead of cri-containerd" This reverts commit `76f16fd1a7` to bring back cri-containerd crioptions parsing so that kata works with older containerd versions like v1.3.9 and v1.4.6. Fixes: #2999 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2021-11-10 16:06:42 +08:00
Tim Zhang	e7856ff10c	rustjail: Fix created time of container After an exec, the created time of the container was wrong; this commit fixes the problem. Fixes: #2994 Signed-off-by: Tim Zhang <tim@hyper.sh>	2021-11-10 10:43:03 +08:00
Chelsea Mafrica	8b01666109	Merge pull request #2992 from Amulyam24/kernel_vfio kernel: add VFIO kernel dependencies for ppc64le	2021-11-09 15:22:16 -08:00
Jakob Naucke	b7b89905d4	virtcontainers: Lint protection types Protection types like tdxProtection or seProtection were marked nolint, remove this. As a side effect, ARM needs dummy tests for these. Fixes: #2801 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-11-09 18:36:32 +01:00
Amulyam24	7566b736ac	kernel: add VFIO kernel dependencies for ppc64le Recently added VFIO kernel configs require additional dependencies on ppc64le. Fixes: #2991 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2021-11-09 14:38:03 +05:30
James O. D. Hunt	87f676062c	agent: Remove dynamic tracing APIs Remove the `StartTracing` and `StopTracing` agent APIs that toggle dynamic tracing. This is not supported in Kata 2.x, as documented in the [tracing proposals document](https://github.com/kata-containers/kata-containers/pull/2062). Fixes: #2985. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-09 08:39:06 +00:00
James O. D. Hunt	b09dd7a883	docs: Fix typo Correct a typo identified by the static checker's spell checker. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-09 08:38:42 +00:00
James O. D. Hunt	b192d388c1	Merge pull request #2970 from jodh-intel/logging-create-tests-and-checks logging: Always run crate tests	2021-11-08 13:16:48 +00:00
Julio Montes	e438cc5d8c	Merge pull request #204 from zhsj/test-32 qemu: Fix 32 bit int overflow in test file	2021-11-08 07:09:15 -06:00
Shengjing Zhu	82cc01d24d	qemu: Fix 32 bit int overflow in test file Signed-off-by: Shengjing Zhu <zhsj@debian.org>	2021-11-07 03:00:27 +08:00
Manabu Sugimoto	c66b56683b	agent: Ignore unknown seccomp system calls If Kata agent cannot resolve the system calls given by seccomp profiles, the agent ignores the system calls and continues to run without an error. Fixes: #2957 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2021-11-05 21:00:41 +09:00
Eric Ernst	ab7aa42147	Merge pull request #203 from mcastelino/topic/legacy-serial qemu: Add support for legacy serial device	2021-11-04 16:15:28 -07:00
Manohar Castelino	1d1a23134a	qemu: Add support for legacy serial device - Add support for legacy serial device - Additionally add support for the file backend for chardev Legacy serial plus char backend file will allow us to support capture early boot messages. Signed-off-by: Manohar Castelino <mcastelino@apple.com>	2021-11-04 15:44:12 -07:00
James O. D. Hunt	d47484e7c1	logging: Always run crate tests Ensure the tests in the local `logging` crate are run for all consumers of it. Additionally, add a new test which checks that output is generated by a range of different log level `slog` macros. This is designed to ensure debug level output is always available for the consumers of the `logging` crate. Fixes: #2969. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-04 17:26:52 +00:00
James O. D. Hunt	5c9c0b6e62	build: Fix default target Fixed the top-level build which was broken: the kata deploy Makefile was being sourced, but it was defining the first target, which became the default. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-04 16:30:50 +00:00
Snir Sheriber	b34ed403c5	cgroups: pass vhost-vsock device to cgroup for the sandbox cgroup Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2021-11-04 10:59:10 +02:00
Snir Sheriber	7362e1e8a9	runtime: remove prefix when cgroups are managed by systemd as done previously in `9949daf4dc` Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2021-11-04 10:13:22 +02:00
Julio Montes	8eb2fe0d36	Merge pull request #190 from Jakob-Naucke/overcommit qemu: Remove -realtime in favor of -overcommit	2021-10-18 11:42:46 -05:00
Jakob Naucke	9a2bbedac7	qemu: Remove -realtime in favor of -overcommit as `-realtime` has been removed in QEMU 6. `-overcommit` has been supported since at least QEMU 3.1. Fixes: #189 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-09-22 11:24:15 +02:00
wangyongchao.bj	1b1790fdbc	agent/src: improve unit test coverage for src/namespace.rs Improve unit test coverage for src/namespace.rs for Kata 2.0 agent Fixes: #289 Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>	2021-09-17 14:15:14 +08:00
Eric Ernst	c4da1a902a	Merge pull request #202 from mcastelino/topic/fix-shutdown Add clean shutdown support	2021-09-16 14:20:51 -07:00
Manohar Castelino	fe83c208dc	qemu: Add support for --no-shutdown Knob Add support for --no-shutdown Knob. This allows us to shutdown the VM without quitting QEMU. Note: Also fix the comment around --no-reboot to be more accurate. Signed-off-by: Manohar Castelino <mcastelino@apple.com>	2021-09-16 13:07:48 -07:00
Manohar Castelino	1ed52714c0	qmp: wait for POWERDOWN event in ExecuteSystemPowerdown() ExecuteSystemPowerdown issues `system_powerdown` and waits for `SHUTDOWN`. The event emitted is `POWERDOWN` per spec. Without this we get an error even though the VM has shutdown gracefully. Per QEMU spec: ``` POWERDOWN (Event) Emitted when the virtual machine is powered down through the power control system, such as via ACPI. Since 0.12 Example <- { "event": "POWERDOWN", "timestamp": { "seconds": 1267040730, "microseconds": 682951 } } SHUTDOWN (Event) Emitted when the virtual machine has shut down, indicating that qemu is about to exit. Arguments guest: boolean If true, the shutdown was triggered by a guest request (such as a guest-initiated ACPI shutdown request or other hardware-specific action) rather than a host request (such as sending qemu a SIGINT). (since 2.10) reason: ShutdownCause The ShutdownCause which resulted in the SHUTDOWN. (since 4.0) Note If the command-line option “-no-shutdown” has been specified, qemu will not exit, and a STOP event will eventually follow the SHUTDOWN event Since 0.12 Example <- { "event": "SHUTDOWN", "data": { "guest": true }, "timestamp": { "seconds": 1267040730, "microseconds": 682951 } } ``` Signed-off-by: Manohar Castelino <mcastelino@apple.com>	2021-09-16 13:01:58 -07:00
Julio Montes	1b60b536f3	Merge pull request #201 from dgibson/bridge-reserve govmm/qemu: Let IO/memory reservations be specified for bridge devices	2021-09-09 10:50:07 -05:00
David Gibson	de039da2a9	govmm/qemu: Let IO/memory reservations be specified for bridge devices This adds fields to BridgeDevice struct to allow qemu's io-reserve, mem-reserve and pref64-reserve properties to be set for PCI bridges. This is needed for Kata's upcoming change to ACPI hotplug. fixes #200 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-09-09 11:47:50 +10:00
Julio Montes	2f8e417bb2	Merge pull request #199 from teawater/add_swap QMP: Add ExecuteBlockdevAddWithDriverCache	2021-08-31 07:48:34 -05:00
Hui Zhu	5c7998db04	QMP: Add ExecuteBlockdevAddWithDriverCache ExecuteBlockdevAddWithDriverCache has one more parameter, driver, than ExecuteBlockdevAddWithCache. The driver parameter sets the driver of the block device. Fixes: #198 Signed-off-by: Hui Zhu <teawater@antfin.com>	2021-08-31 16:34:33 +08:00
Julio Montes	68676b43a5	Merge pull request #179 from Jakob-Naucke/iommu-platform qemu: Fix iommu_platform for CCW	2021-08-19 07:52:15 -05:00
Fabiano Fidêncio	b681d61a37	Merge pull request #197 from fengwang666/non-root qemu: Add credentials to qemu Cmd	2021-08-17 13:06:15 +02:00
Feng Wang	3a9a67499f	qemu: Add credentials to qemu Cmd add credentials to the command attribute Fixes #2444 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2021-08-16 10:44:00 -07:00
David Gibson	3c64244cbb	Merge pull request #194 from dgibson/object-add-props Don't use deprecated 'props' argument to QMP 'object-add'	2021-08-04 13:57:56 +10:00
David Gibson	d27256f863	qmp: Don't use deprecated 'props' field for object-add Use of the 'props' argument to 'object-add' has been deprecated since QEMU 5.0 (commit 5f07c4d60d09) in favor of flattening the properties directly into the 'object-add' arguments. Support for 'props' is removed entirely in qemu 6.0 (commit 50243407457a). fixes #193 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-08-04 13:42:41 +10:00
David Gibson	d8cdf9aa2a	qemu: Drop support for versions older than 5.0 Kata requires version 5.2 (or 5.1 on ARM) anyway. Simplify code by dropping support for older versions. In any case explicit checks against version number aren't necessarily reliable for patched qemu versions. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-08-04 13:42:41 +10:00
Jakob Naucke	18352c36ec	qemu: Fix iommu_platform for vhost user CCW Enable iommu_platform for vhost user devices Fixes: #178 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-07-29 12:51:32 +02:00
David Gibson	40843efc26	Merge pull request #192 from dgibson/host-device Use 'host_device' driver for blockdev backends	2021-07-29 17:03:44 +10:00
David Gibson	1b02192986	Use 'host_device' driver for blockdev backends ExecuteBlockdevAdd() and ExecuteBlockdevAddWithCache() both appear to be intended to create block devices in the guest which backend onto a block device in the host. That seems to be the way that Kata always uses it. However blockdevAddBaseArgs(), used by both those functions always uses the "file" driver, which is only intended for use with regular file backends. Use of the "file" driver for host block devices was deprecated in qemu-3.0, and has been removed entirely in qemu-6.0 (commit 8d17adf34f5). We should be using the "host_device" driver instead. fixes #191 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-07-29 13:32:39 +10:00
Julio Montes	b507f32392	Merge pull request #186 from LiangZhou-CTY/master add support for "sandbox" feature to qemu	2021-07-23 08:36:57 -05:00
Liang Zhou	9518675e11	add support for "sandbox" feature to qemu Update the govmm code in order to support the "sandbox" feature of qemu, which can introduce another protection layer on the host, to make the secure container more secure. Fixes: #185 Signed-off-by: Liang Zhou <zhoul110@chinatelecom.cn>	2021-07-23 04:24:40 -07:00
Archana Shinde	0173713ea9	Merge pull request #187 from devimc/2021-07-21/nvdimmRO qemu: support read-only nvdimm	2021-07-22 04:53:11 -07:00
Julio Montes	7e200ea9d7	Merge pull request #188 from devimc/2021-07-21/gomods Support golang 1.16	2021-07-21 15:35:12 -05:00
Julio Montes	335fa81667	qemu: fix golangci-lint errors fix golangci-lint errors Signed-off-by: Julio Montes <julio.montes@intel.com>	2021-07-21 15:08:12 -05:00
Julio Montes	61b6378749	.github/workflows: reimplement github actions CI * Remove golang 1.13 and 1.14, add golang 1.16 * gometalinter has been deprecated, use golangci-lint instead Signed-off-by: Julio Montes <julio.montes@intel.com>	2021-07-21 15:08:07 -05:00
Julio Montes	9d6e7970b6	go: support go modules Add go.mod file to support Golang 1.16.x Signed-off-by: Julio Montes <julio.montes@intel.com>	2021-07-21 11:38:07 -05:00
Julio Montes	0d21263a9b	qemu: support read-only nvdimm Append `readonly=on` to a `memory-backend-file` object and `unarmed=on` to a `nvdimm` device when `ReadOnly` is set to `true` Signed-off-by: Julio Montes <julio.montes@intel.com>	2021-07-21 11:26:19 -05:00
James O. D. Hunt	f3533734ac	Merge pull request #184 from Jakob-Naucke/consistent-joins qemu: Consistent parameter building	2021-07-19 09:37:54 +01:00
Jakob Naucke	ff34d283db	qemu: Consistent parameter building Always join by ",", do not put commas in the parameter slices. Always use the variable name `deviceParams`. Fixes: #180 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-07-16 15:14:14 +02:00
Fabiano Fidêncio	263136e69a	Merge pull request #177 from marcel-apf/memdev-on-bridge qemu: Allow hot-plugging memory devices on PCI bridges	2021-06-22 09:55:16 +02:00
Marcel Apfelbaum	0e19ffb67e	qemu: Allow hot-plugging memory devices on PCI bridges Currently virtio-mem-pci devices can be hotplugged only on the root bus. This doesn't work for PCIe machines like q35. Extend the API to optionally support hotplugging on PCI bridges. Fixes: #176 Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>	2021-06-21 19:55:20 +03:00
Pradipta Banerjee	eb57f004d8	Merge pull request #175 from Amulyam24/pef qemu: Add support for PEF	2021-05-20 19:54:20 +05:30
Amulyam24	c135681d9a	qemu: Add support for PEF Add support for Protected Execution Facility (PEF), which is the confidential computing technology on ppc64le. Fixes: #174 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2021-05-20 13:50:25 +00:00
Julio Montes	6fd848e95e	Merge pull request #173 from Jakob-Naucke/sec-exec qemu: Add support for Secure Execution	2021-05-20 07:59:01 -05:00
Jakob Naucke	03b55ea51d	qemu: Add support for Secure Execution Secure Execution, also known as Protected Virtualization in QEMU, is a confidential computing technology for s390x (IBM Z & LinuxONE). Allow the respective object. Fixes: #172 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-05-20 10:45:39 +02:00
Jakob Naucke	7a367dc0a8	qemu: Simplify (Object).Valid() so that more object types can be added without going over cyclomatic complexity limits Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-05-20 10:45:37 +02:00
Sandeep Gupta	a6cec2d38c	qemu: add support for SevGuest object Signed-off-by: Jim Cadden <jcadden@ibm.com>	2021-05-20 10:08:02 +02:00
Fabiano Fidêncio	f0e9a35308	Merge pull request #171 from Jakob-Naucke/fix-virtiofs-s390x qemu: VhostUserDevice CCW device numbers	2021-04-28 18:36:04 +02:00
Jakob Naucke	abd3c7ea03	qemu: VhostUserDevice CCW device numbers Add CCW (s390x) device numbers to VhostUserDevices, as is with other device types. Add them to VhostUserFS devices (the only type currently supported on s390x) when building QEMU parameters. Fixes: #170 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-04-28 00:28:25 +02:00
Jakob Naucke	3eaeda7f6d	qemu: Refactor vhostuserDev.QemuParams by splitting out the respective functionality to QemuNetParams, QemuSCSIParams, QemuBlkParams, and QemuFSParams. This allows adding functionality to these functions without going beyond the cyclomatic complexity of 15 mandated by the lint checks. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-04-28 00:28:11 +02:00
Fabiano Fidêncio	7183b12b07	Merge pull request #166 from kata-containers/egernst-patch-1 qmp: remove chatty log	2021-04-26 23:36:31 +02:00
Chelsea Mafrica	092293f1d0	Merge pull request #169 from QiuMike/master Fix qemu commandline issue with empty romfile	2021-04-23 18:58:27 -07:00
Michael Qiu	511cf58b0c	Fix qemu commandline issue with empty romfile Currently, if the romfile field is empty, the command line looks like this: -device driver=virtio-net-pci,...,mq=on,vectors=4,romfile= This does not make sense, so just remove this field from the command line. Add unit test support. Signed-off-by: Michael Qiu <qiudayu@huayun.com>	2021-04-22 04:09:16 -04:00
Julio Montes	8ba62b02ca	Merge pull request #164 from devimc/2021-03-30/tdxSupport qemu: add support for tdx-guest object	2021-04-09 09:53:53 -05:00
Eric Ernst	b3eac95b28	qmp: remove frequent, chatty log In Kata, we are getting a lot of logs at runtime from QMP, in particular `read from QMP: xxxx` Ideally we'd set this to only be visible for trace, but I did not see this working when adding a V(7) check around these prints. To avoid filling journal with info that isn't useful, let's drop. Fixes: #165 Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>	2021-04-01 09:09:32 -07:00
Julio Montes	3141894033	qemu: add support for tdx-guest object support tdx-guest guest objects Signed-off-by: Julio Montes <julio.montes@intel.com>	2021-03-30 16:18:11 -06:00
Fabiano Fidêncio	7fbc685865	Merge pull request #161 from Jakob-Naucke/memory-backend qemu: Append memory backend for non-DIMM setups	2021-03-29 22:58:24 +02:00
GabyCT	4f6a403cde	Merge pull request #162 from devimc/2021-03-24/deviceLoader qemu: add support for device loaders	2021-03-29 10:22:21 -06:00
GabyCT	164d28a27b	Merge pull request #163 from devimc/2021-03-24/supportQEMU6 qemu: support QEMU 6	2021-03-29 10:21:55 -06:00
Jakob Naucke	4b136f3f1c	qemu: Append memory backend for non-DIMM setups Some architectures and setups do not support DIMM/NUMA. However, they can still use memory backends, provided a memory backend of the same ID is specified under -machine. This was introduced in QEMU 5.0. Enable this functionality in appendMemoryKnobs. Fixes: #160 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-03-29 15:53:39 +02:00
Julio Montes	6213dea42a	qemu: support QEMU 6 Use `on` and `off` to enable or disable features, `no` prefix is deprecated Signed-off-by: Julio Montes <julio.montes@intel.com>	2021-03-24 11:05:24 -06:00
Julio Montes	0d47025d05	qemu: add support for device loaders Device loaders can be used to load firmware. Signed-off-by: Julio Montes <julio.montes@intel.com>	2021-03-24 10:35:45 -06:00
Eric Ernst	7d320e8f5d	Merge pull request #158 from egernst/blk-ro qmp: Add ro argument for block-device hotplug funcs	2021-01-11 17:37:50 -08:00
Eric Ernst	e2eb549fcd	qmp: Add ro argument for block-device hotplug funcs We should allow users to specify if a block device should be hotplugged as read-only. Fixes: #157 Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>	2021-01-11 15:33:20 -08:00
Julio Montes	5b0331c0fa	Merge pull request #156 from jongwu/dimm qemu: add arm64 to support list of dimm	2020-11-19 07:48:48 -06:00
Jianyong Wu	0592c82536	qemu: add arm64 to support list of dimm dimm is supported on arm64, so add it to the check list. Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> Fixes: #155	2020-11-19 16:44:16 +08:00
Julio Montes	5e9aa08c4f	Merge pull request #154 from edmond-hk/pflash qemu: enable "-pflash"	2020-10-22 10:38:25 -05:00
Edmond AK Dantes	2079c15c26	qemu: enable "-pflash" flash image can store some critical data like firmware, enable it here. Fixes: #140 Signed-off-by: Edmond AK Dantes <edmond.dantes.ak47@outlook.com>	2020-10-22 21:26:23 +08:00
Peng Tao	99f43ec188	Merge pull request #153 from liubin/feature/152-add-pvpanic-and-dump-guest-memory-support qemu: add pvpanic and dump guest memory support	2020-10-20 13:20:39 +08:00
bin liu	b8cd705901	qmp: add dump-guest-memory support By adding the `dump-guest-memory` command, users can get a kernel memory dump when a guest panic occurs. Fixes: #152 Signed-off-by: bin liu <bin@hyper.sh>	2020-10-19 17:09:12 +08:00
bin liu	d7836877e9	qemu: add pvpanic device to get GUEST_PANICKED event By listening to the events channel from QEMU, we can receive the event issued when the guest panics and act on it. Fixes: #152 Signed-off-by: bin liu <bin@hyper.sh>	2020-10-19 16:59:37 +08:00
Julio Montes	11b6ac380d	Merge pull request #151 from mazzy89/blk-device-serial Add serial ID to blk device	2020-10-16 08:28:30 -05:00
Julio Montes	0bd15d6dbf	Merge pull request #150 from mazzy89/fix-fwcfg Make fw_cfg a slice	2020-10-15 09:13:39 -05:00
Salvatore Mazzarino	43d774d27b	Add serial to blk device Signed-off-by: Salvatore Mazzarino <dev@mazzarino.cz>	2020-10-12 17:35:06 +02:00
Salvatore Mazzarino	8cb8b24c05	Make fw_cfg a slice Signed-off-by: Salvatore Mazzarino <dev@mazzarino.cz>	2020-10-12 12:29:05 +02:00
James O. D. Hunt	546cc55ea4	Merge pull request #148 from devimc/2020-10-09/fixup contributors: remove CONTRIBUTORS.md file	2020-10-09 15:04:31 +01:00
Julio Montes	cb0d339141	contributors: remove CONTRIBUTORS.md file Remove the CONTRIBUTORS.md file since this repo is now part of the kata-containers organization, the other repos don't have this file, and we are not willing to maintain (update) it. Signed-off-by: Julio Montes <julio.montes@intel.com>	2020-10-09 08:03:25 -05:00
Julio Montes	2f6bb3dbec	Merge pull request #146 from jodh-intel/update-for-new-github-org misc: Update for new GitHub organisation name	2020-10-09 08:01:28 -05:00
James O. D. Hunt	69f9a50bb2	Merge pull request #144 from mazzy89/fw-cfg qemu: add fw_cfg flag to config	2020-10-09 09:23:50 +01:00
Salvatore Mazzarino	29ba5a9012	qemu: add fw_cfg flag to config Signed-off-by: Salvatore Mazzarino <dev@mazzarino.cz>	2020-10-09 10:17:58 +02:00
James O. D. Hunt	9f309c2aa1	misc: Update for new GitHub organisation name `govmm` is now part of the `kata-containers` GitHub organisation, so update to reflect this. Fixes: #145. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2020-10-09 09:10:10 +01:00
Julio Montes	6fa954a506	Merge pull request #139 from dgibson/main Add qom-get function	2020-09-03 07:36:07 -05:00
David Gibson	3d46d08a90	Add qom-get function Add a function to access the qom-get QMP command so we can query information from qemu. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-09-03 14:05:00 +10:00
James O. D. Hunt	6042f60331	Merge pull request #110 from heychenbin/master typo fix	2020-08-25 07:50:22 +01:00
Julio Montes	9901db52fd	Merge pull request #134 from Jakob-Naucke/vfio-ap-mdev Add support for hot-plugging IBM Adjunct Processor (AP) devices	2020-08-19 07:16:13 -05:00
Julio Montes	a0d27643ee	Merge pull request #138 from devimc/2020-08-17/enableGithubActions github: enable github actions	2020-08-19 07:15:53 -05:00
Jakob-Naucke	39c372a201	Add support for hot-plugging IBM VFIO-AP devices Add ExecuteAPVFIOMediatedDeviceAdd to qmp.go, which executes a hotplug for an IBM Adjunct processor (AP) VFIO device (see also https://www.kernel.org/doc/html/latest/s390/vfio-ap.html ) Also includes the respective unittest and adds the VfioAP DeviceDriver constant to qemu.go. Pushing again due to incidental CI failure Fixes: #133 Signed-off-by: Jakob-Naucke <jakob.naucke@ibm.com> Reviewed-by: alicefr <afrosi@redhat.com>	2020-08-18 17:35:23 +02:00
Julio Montes	4c33e5e823	Merge pull request #137 from devimc/2020-08-17/fixCoveralls travis: Run coveralls after success	2020-08-18 10:18:53 -05:00
Julio Montes	f5bdd53ce6	travis: disable amd64 jobs move amd64 CI jobs to github actions Signed-off-by: Julio Montes <julio.montes@intel.com>	2020-08-18 07:48:20 -05:00
Julio Montes	1af1c0d783	github: enable github actions Use github actions to run unit tests. Github actions service looks more stable and reliable than travis. fixes #136 Signed-off-by: Julio Montes <julio.montes@intel.com>	2020-08-17 12:59:21 -05:00
Julio Montes	4831c6e0a3	travis: Run coveralls after success Fix the following error: ``` Bad response status from coveralls: 422 {"message":"service_job_id (717167073) must be unique for Travis Jobs not supplying a Coveralls Repo Token","error":true} The command "$GOPATH/bin/goveralls -v -service=travis-ci" exited with 1. ``` fixes #135 Signed-off-by: Julio Montes <julio.montes@intel.com>	2020-08-17 11:00:07 -05:00
Julio Montes	547a851809	Merge pull request #132 from huoqifeng/iommu_platform qemu: add iommu_platform knob for qemuParams	2020-07-31 08:16:13 -05:00
Qi Feng Huo	cf0f05d2e9	qemu: add iommu_platform knob for qemuParams Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com> fix typo Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com> qemu: remove useless fmt.Sprintf for qemuParams Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com> fix test cases for s390x Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com>	2020-07-30 22:19:49 +08:00
Julio Montes	6c3315ba8a	Merge pull request #131 from merwick/master qemu: Add NoReboot config Knob for qemuParams	2020-07-28 08:52:09 -05:00
Liam Merwick	6645baf249	qemu: Add NoReboot config Knob for qemuParams The Kata architecture does not support rebooting VMs (the lifecycle being start/exec/kill) and if a VM is killed (e.g. using sysrq-trigger), the VM does not exit fully and other layers do not notice the state change. Kata needs a way to tell QEMU to run with the '--no-reboot' option so that the guest VM exits and does not attempt to reboot. Add a NoReboot boolean Knob so when Knobs.NoReboot is set, the '--no-reboot' command-line option will be passed to QEMU on startup. Signed-off-by: Liam Merwick <liam.merwick@oracle.com>	2020-07-27 15:04:54 +01:00
Julio Montes	af9e34b91a	Merge pull request #130 from devimc/2020-07-22/addMultidevs Add multidevs option to fsdev	2020-07-24 12:06:48 -05:00
Julio Montes	abca6f3ce9	Add multidevs option to fsdev multidevs specifies how to deal with multiple devices being shared with a 9p export. `multidevs=remap` fixes the following warning: ``` 9p: Multiple devices detected in same VirtFS export, which might lead to file ID collisions and severe misbehaviours on guest! You should either use a separate export for each device shared from host or use virtfs option 'multidevs=remap'! ``` Signed-off-by: Julio Montes <julio.montes@intel.com>	2020-07-23 10:56:22 -05:00
James O. D. Hunt	7cc469641b	Merge pull request #128 from devimc/2020-05-29/qmp/vhostBool qemu/qmp: use boolean type for the vhost	2020-06-02 15:54:48 +01:00
Julio Montes	cc53876661	qemu/qmp: use boolean type for the vhost vhost is a Netdev Tap Option used to configure a host TAP network interface backend, according to the QMP API documentation the type for such option must be a boolean. Use boolean type for vhost option to fix the following error on recent versions of QEMU: ``` Invalid parameter type for 'vhost', expected: boolean ``` Signed-off-by: Julio Montes <julio.montes@intel.com>	2020-05-29 20:52:44 +00:00
Julio Montes	7efaf0b1cd	Merge pull request #127 from amorenoz/iommu qemu: add IOMMU Device	2020-05-27 08:54:42 -05:00
Adrian Moreno	e57e86e2ea	qemu: add IOMMU Device The following options can be provided: Intremap: activates interrupt remapping; DeviceIotlb: enables device IOTLB support for the vIOMMU; CachingMode: enables Caching Mode. See: https://wiki.qemu.org/Features/VT-d Signed-off-by: Adrian Moreno <amorenoz@redhat.com>	2020-05-26 18:29:02 +02:00
Julio Montes	10b22acda6	Merge pull request #125 from bpradipt/master Enable Numa support for Power (ppc64le) architecture	2020-05-14 10:25:37 -05:00
Pradipta Kr. Banerjee	b2aa0225ac	Enable Numa support for Power (ppc64le) architecture Fixes #124 Signed-off-by: bpradipt@in.ibm.com	2020-05-13 01:21:00 +05:30
Julio Montes	ad66e4caf8	Merge pull request #122 from devimc/topic/qemu/maxPorts qemu: Add max_ports option to virtio-serial device	2020-05-08 13:47:10 -05:00
Julio Montes	621af7ebe8	Merge pull request #123 from LinShuicheng/master Add rt clock definition for rtc clock in qemu	2020-05-06 09:46:32 -05:00
Shuicheng Lin	29529a5d72	Add rt clock definition for rtc clock in qemu There are three different types for the RTC clock: host, rt and vm. Add `rt` to the list of RTC clocks. Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>	2020-05-06 08:43:40 +08:00
Julio Montes	0e98b613a8	qemu: Add max_ports option to virtio-serial device Allow API consumers to change the maximum number of ports in the virtio-serial devices, setting a lower number of ports can improve the boot time and reduce the attack surface. fixes #120 Signed-off-by: Julio Montes <julio.montes@intel.com>	2020-04-24 15:10:51 +00:00
Julio Montes	f6f627acef	Merge pull request #121 from merwick/microvm qemu: Add microvm machine type support	2020-04-24 09:33:11 -05:00
Liam Merwick	787c86b7e5	qemu: Add microvm machine type support Following on from #111 which added support for multiple virtio transports, add code to use virtio-mmio as the transport when booting a guest with the microvm machine type and add a microvm case when checking for NUMA support. Also add a test case for machine string parsing. Signed-off-by: Liam Merwick <liam.merwick@oracle.com>	2020-04-23 22:27:03 +01:00
Julio Montes	e969afbec5	Merge pull request #119 from devimc/topic/qemu/AddPmem qemu: add pmem flag to memory-backend-file	2020-03-04 08:25:14 -06:00
Julio Montes	5378725f11	qemu: add pmem flag to memory-backend-file According to QEMU's nvdimm documentation: When 'pmem' is 'on' and QEMU is built with libpmem support, QEMU will take necessary operations to guarantee the persistence of its own writes to the vNVDIMM backend. Signed-off-by: Julio Montes <julio.montes@intel.com>	2020-03-03 14:28:59 +00:00
Peng Tao	3700c55dd7	qemu: add block device readonly support So that we can attach it readonly. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2020-02-21 08:58:53 +01:00
Mark Ryan	37b0d9c12f	Merge pull request #111 from slp/multiple_transports Refactor code to support multiple virtio transports at runtime	2020-02-18 13:51:07 +01:00
Mark Ryan	20f3977bc7	Merge pull request #117 from fidencio/wip/dont_always_set_cache_size qemu: Don't set ".cache-size=" when CacheSize is 0	2020-02-08 10:49:46 +01:00
Sergio Lopez	88a25a2d68	Refactor code to support multiple virtio transports at runtime Currently, virtio transports for each device are determined with architecture dependent build time conditionals. This isn't the ideal solution, as virtio transports aren't exactly tied to the host's architecture. For example, aarch64 VMs do support both PCI and MMIO devices, and after the recent introduction of the microvm machine type, that's also the case for x86_64. This patch extends each device that supports multiple transports with a VirtioTransport field, so users of the library can manually specify a transport for each device. To avoid breaking the compatibility, if VirtioTransport is empty a behavior equivalent to the legacy one is achieved by checking runtime.GOARCH and Config.Machine.Type. Keeping support for isVirtioPCI/isVirtioCCW in qmp.go is a bit tricky. Eventually, the hot-plug API should be extended so callers must manually specify the transport for the device. Signed-off-by: Sergio Lopez <slp@redhat.com>	2020-02-07 18:17:12 +01:00
Fabiano Fidêncio	2ee53b00ca	qemu: Don't set ".cache-size=" when CacheSize is 0 As there's no guarantee that ".cache-size" is a supported QEMU property, let's not add it to the QEMU command line when the user explicitly set virtio_fs_cache_size to zero. By not always setting ".cache-size" property we avoid errors like: ``` $ sudo podman --runtime=/usr/bin/kata-runtime run --security-opt label=disable -it fedora:31 /bin/bash Error: failed to launch qemu: exit status 1, error messages from qemu log: qemu-kvm: -device vhost-user-fs-pci,chardev=char-88c350403e95d3db,tag=kataShared,cache-size=0M: Property '.cache-size' not found: OCI runtime error ``` Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>	2020-02-07 09:56:36 +01:00
Julio Montes	cab4709376	Merge pull request #116 from Jimmy-Xu/add-pcie-root-port qemu: Add pcie-root-port device support.	2020-01-31 08:07:07 -06:00
Jimmy Xu	f1252f6e17	qemu: Add pcie-root-port device support.	2020-01-26 21:44:11 +08:00
Julio Montes	ee21903287	Merge pull request #115 from teawater/virtio-mem qmp: Add ExecMemdevAdd and ExecQomSet API	2020-01-21 08:58:30 -06:00
Hui Zhu	6667f4e90b	qmp_test: Add TestExecMemdevAdd and TestExecQomSet Add TestExecMemdevAdd and TestExecQomSet to qmp_test.go. They can test ExecMemdevAdd and ExecQomSet. Signed-off-by: Hui Zhu <teawater@antfin.com>	2020-01-21 10:26:59 +08:00
Hui Zhu	201fd0ae82	qmp: Add ExecMemdevAdd and ExecQomSet API Add ExecMemdevAdd and ExecQomSet API to support virtio-mem. Signed-off-by: Hui Zhu <teawater@antfin.com>	2020-01-19 14:51:17 +08:00
Mark Ryan	94145ff380	Merge pull request #114 from dong-liuliu/xliu2/vhost-user-dev qmp: add ExecutePCIVhostUserDevAdd and ExecuteChardevDel to hotplug vhost-user device	2020-01-15 10:03:26 +01:00
Liu Xiaodong	e04be2cc38	qmp: add ExecutePCIVhostUserDevAdd API Caller can hotplug vhost-user device via qmp. The Qemu vhost-user device, like vhost-user-blk-pci and vhost-user-scsi-pci can be hotplugged by qmp API: ExecuteCharDevUnixSocketAdd() together with ExecutePCIVhostUserDevAdd() Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>	2020-01-14 00:41:53 -05:00
Liu Xiaodong	13aeba09d5	qmp: support command 'chardev-remove' So that the caller can hot-remove a chardev via qmp Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>	2020-01-14 00:12:04 -05:00
Mark Ryan	dfb6cf6041	Merge pull request #112 from alicefr/enable-travis-s390x s390x: add s390x travis support	2019-12-18 08:42:39 +01:00
Alice Frosi	6d6b2d8892	s390x: add s390x travis support Since we have travis support for s390x, let's enable it. Signed-off-by: Alice Frosi <afrosi@de.ibm.com>	2019-12-17 14:54:09 +01:00
Chenbin	175ac4993e	typo fix	2019-09-21 19:52:56 +08:00
Mark Ryan	8cba5a8e5f	Merge pull request #109 from jschintag/qemu-img-sharing virtio-blk: Add support for share-rw flag	2019-09-16 09:15:17 +02:00
Jan Schintag	cb9f640b4e	virtio-blk: Add support for share-rw flag This allows multiple instances of qemu to share the same file for virtio-blk device. Fixes: #108 Signed-off-by: Jan Schintag <jan.schintag@de.ibm.com>	2019-09-13 08:58:23 +02:00
Mark Ryan	ee460e3008	Merge pull request #107 from alicefr/no-numa-bck-mem s390x: dimm not supported	2019-09-02 10:22:18 +02:00
Alice Frosi	9463486d58	s390x: dimm not supported Dimm is not supported on s390x Fixes: #106 Signed-off-by: Alice Frosi <afrosi@de.ibm.com>	2019-08-27 08:37:21 +02:00
Julio Montes	e6644f4a25	Merge pull request #105 from bergwolf/interaction improve qemu interaction	2019-08-14 08:01:15 -05:00
Peng Tao	164bd8cd22	test/fmt: drop extra newlines They are unneeded. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2019-08-14 00:32:03 -07:00
Peng Tao	73555a409c	qmp: add query-status API So that caller can find out guest status via qmp. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2019-08-14 00:32:03 -07:00
Peng Tao	234e0edfd7	qemu: fix memory prealloc handling Memory preallocation is just a property of different memory backends. We should treat it similar to memory sharing property. Also rename FileBackedMemShared to MemShared as it is just another memory backend property that works with different memory backends not just file backed memory. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2019-08-14 00:32:00 -07:00
Peng Tao	30bfcaaa6d	qemu: add debug logfile When LogFile is specified, output debug log there. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2019-08-13 01:44:45 -07:00
Mark Ryan	aa341b005e	Merge pull request #104 from BetaXOi/query-schema qmp: support command 'query-qmp-schema'	2019-08-01 11:36:29 +02:00
Ning Bo	79e0d5333d	qmp: support command 'query-qmp-schema' The upper hypervisor manager application may need to wait for some QMP event to control the boot sequence, but the event we want may not exist in some older versions, so we need to query the whole QMP ABI and check whether the event is supported or not. related: kata-containers/runtime#1918 Signed-off-by: Ning Bo <ning.bo9@zte.com.cn>	2019-08-01 17:14:54 +08:00
Julio Montes	e0505242c0	Merge pull request #103 from alicefr/cpu_topology qmp: add checks for the CPU topology	2019-07-26 08:59:28 -05:00
Alice Frosi	68cdf64fe5	test: add cpu topology tests Add cpu driver types in TestQMPCPUDeviceAdd Signed-off-by: Alice Frosi <afrosi@de.ibm.com>	2019-07-26 14:27:25 +02:00
Alice Frosi	e0cf9d5c14	qmp: add checks for the CPU topology Add the functions isSocketIDSupported, isThreadIDSupported and isDieIDSupported. The functions check if the cpu driver and the qemu version support the id parameter. Fixes: #102 Signed-off-by: Alice Frosi <afrosi@de.ibm.com>	2019-07-26 14:27:25 +02:00
Mark Ryan	e894e7ad00	Merge pull request #101 from devimc/topic/supportQemu41 qemu: support x86 SMP die	2019-07-25 15:12:41 +02:00
Julio Montes	a5c119086a	qemu: support x86 SMP die In QEMU 4.1 the CPU topology for x86 will change to: `socket > die > core > thread`. Add `die-id` field to `CPUProperties` and include it in CPU hotplugging Signed-off-by: Julio Montes <julio.montes@intel.com>	2019-07-16 14:08:40 +00:00
Mark Ryan	52b2309a55	Merge pull request #100 from Ace-Tang/add-pci-param Support x-pci-vendor-id and x-pci-device-id pass to qemu	2019-07-09 16:39:06 +02:00
Ace-Tang	8fd28e23ac	Support x-pci-vendor-id and x-pci-device-id pass to qemu Some vendor IDs, such as 1ded, cannot be identified by the virtio-pci driver, so the upper level needs to pass a specified vendor ID to qemu. The upper level will change the unavailable ID and pass it to qemu. Signed-off-by: Ace-Tang <aceapril@126.com>	2019-07-09 12:19:51 +08:00
Mark Ryan	8d18f344c5	Merge pull request #99 from alicefr/devno-blk-ccw Support for virtio-blk-ccw	2019-07-05 09:09:09 +02:00
Alice Frosi	713d0d9406	s390x: add virtio-blk-ccw type In order to hotplug virtio-blk on s390x, the CCW device driver is used instead of PCI. Signed-off-by: Alice Frosi <afrosi@de.ibm.com>	2019-07-05 08:08:25 +02:00
Alice Frosi	65cc343f7b	test: add devno in the tests for s390x Add test with devno param Signed-off-by: Alice Frosi <afrosi@de.ibm.com>	2019-07-05 08:08:25 +02:00
Alice Frosi	9cf98da0be	s390x: add devno support DevNo is used to identify the ccw device for s390x systems Signed-off-by: Alice Frosi <afrosi@de.ibm.com>	2019-07-05 08:08:25 +02:00
Julio Montes	9f389cb319	Merge pull request #96 from ganeshmaharaj/mem-hotplug-share Allow sharing of memory backend file	2019-06-18 12:18:17 -05:00
Ganesh Maharaj Mahalingam	0c900f596e	Allow sharing of memory backend file Hotplugged memory could be backed by a file on the host with sharing turned on. This change allows qmp to pass that option to a govmm. Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>	2019-06-18 08:07:17 -07:00
Mark Ryan	516e0c5b7c	Merge pull request #95 from bergwolf/migration-incoming qemu: add migration incoming defer support	2019-06-14 14:05:33 +02:00
Peng Tao	f695ddf8f3	qemu: add migration incoming defer support qemu commandline supports -incoming defer and qmp supports migrate-incoming uri. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2019-06-14 00:24:26 -07:00
Sebastien Boeuf	27363b1aca	Merge pull request #94 from bergwolf/multiqueue qmp: add virtio-blk multiqueue	2019-05-28 08:00:15 -07:00
Peng Tao	f0f18dd0f2	qmp: add virtio-blk multiqueue Hotplug virtio-blk with multiqueue support. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2019-05-27 20:40:12 -07:00
Mark Ryan	a6e2655b90	Merge pull request #93 from lifupan/fixvirtioblkdriver qemu: fix the issue of wrong driver for VirtioBlock	2019-04-17 09:22:32 +02:00
lifupan	7d3deea4fc	qemu: Add a virtio-blk-pci device driver support Add a pci bus based virtio block device driver support. Fixes:#92 Signed-off-by: lifupan <lifupan@gmail.com>	2019-04-16 11:45:50 -04:00
Julio Montes	b3e7a9e784	Merge pull request #91 from stefanha/virtio-fs-cache-size-mb qemu: use MiB instead of Gib for virtio-fs cache size	2019-04-09 11:33:17 -05:00
Stefan Hajnoczi	058cda0603	qemu: use MiB instead of Gib for virtio-fs cache size QEMU supports finer-grained units than GiB. Change the cache size to MiB so users have more control over the cache size. Note that changing the semantics of the CacheSize field is fine because there are no users of this API yet. kata-runtime will be the first users and prefers MiB instead of GiB. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-04-09 10:21:43 +01:00
Mark Ryan	35a8fd3ca9	Merge pull request #90 from devimc/topic/FixQemu4qmp qemu/qmp: re-implement mainLoop	2019-04-08 09:38:17 +02:00
Julio Montes	694a7b1c61	qemu/qmp: re-implement mainLoop In newer versions of QEMU, like 4.0-rc2, QMP events can be thrown even before the QMP-version response, one example of this behaviour is when a virtio serial is closed and a VSERPORT_CHANGE event is thrown. Re-implement mainLoop to check the data received from the VM channel, since it's not a guarantee that the first data read from the VM channel is the QMP version. fixes https://github.com/kata-containers/runtime/issues/1474 Signed-off-by: Julio Montes <julio.montes@intel.com>	2019-04-05 13:25:22 -06:00
Julio Montes	4963fb587f	Merge pull request #89 from woshijpf/master qemu/qmp: fix readLoop() reuse scanner.Bytes() underlying array problem	2019-03-13 08:49:08 -06:00
jiangpengfei	5712b1198e	qemu/qmp: fix readLoop() reuse scanner.Bytes() underlying array problem A []byte channel transfers only the slice info (underlying array pointer, len, cap) between the channel sender and receiver. The underlying array of the slice returned by scanner.Bytes() may point to data that will be overwritten by a subsequent call to Scan (reference: https://golang.org/pkg/bufio/#Scanner.Bytes), so consecutive Scan() calls may write the read data into the same underlying array, which causes the receiver to read mixed data. To solve this problem, we need to copy each line to newly allocated space and then send it to the channel receiver. Fixes: #88 Signed-off-by: jiangpengfei <jiangpengfei9@huawei.com>	2019-03-13 19:45:05 -04:00
Mark Ryan	b48780f3d3	Merge pull request #86 from stefanha/virtio-fs govmm: add VhostUserFS vhost-user device type	2019-02-20 17:57:34 +01:00
Stefan Hajnoczi	3c84b1daa3	govmm: add VhostUserFS vhost-user device type The QEMU vhost-user-fs-pci device provides virtio-fs host<->guest file system sharing (https://virtio-fs.gitlab.io/). The device is instantiated like this: $ qemu -chardev socket,path=/tmp/vhost-fs.sock,id=chr0 -device vhost-user-fs-pci,tag=myfs,chardev=chr0,cache-size=4G,versiontable=/dev/shm/fuse_shared_versions This patch adds the VhostUserFS DeviceDriver and command-line generation for this QEMU device. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-02-19 13:03:18 +00:00
Julio Montes	78d079db6d	Merge pull request #84 from nitkon/master qmp: Conditionally pass threadID and socketID when CPU device add	2019-01-28 10:43:02 -06:00
Nitesh Konkar	4692f6b965	qmp: Conditionally pass threadID and socketID when CPU device add For vCPU hotplug to work on ppc64le, we need not pass threadID and socketID. So conditionally pass arguments when executing CPU device add. Fixes: #83 Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com	2019-01-28 21:44:41 +05:30
Sebastien Boeuf	b9c8f76ebe	Merge pull request #85 from markdryan/fix-travis Fix travis	2019-01-28 08:02:40 -08:00
Mark Ryan	1f51b4386b	Update the versions of Go used to build GoVMM The .travis file was building GoVMM with some out of date versions of Go that seem to be incompatible with the latest versions of gometalinter. This commit updates the .travis file so that we build against 1.10 and 1.11. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2019-01-28 16:36:15 +01:00
Mark Ryan	ad310f9fde	Fix staticcheck S1023 Static check was complaining about code that looked like _ = <-ch when it wants to see simply <-ch There was only one instance of this in govmm and this commit fixes that instance. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2019-01-28 16:20:23 +01:00
Mark Ryan	932fdc7f50	Fix staticcheck S1023 By removing a redundant return statement. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2019-01-28 16:19:20 +01:00
Mark Ryan	cb2ce9339c	Fix staticcheck S1008 static check was complaining about code that looked like if x == "" { return false } return true when what it wants to see is return x != "". This commit fixes the issue. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2019-01-28 16:17:24 +01:00
Mark Ryan	f0172cd2a6	Fix staticcheck (S1002) staticcheck was complaining about code that looked like if x == true { } rather than if x { } This commit fixes the issue. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2019-01-28 16:13:48 +01:00
Mark Ryan	5f2e630bda	Fix staticcheck (S1025) staticcheck was complaining as there were quite a lot of fmt.Sprintf("%s",d) in the code where d was either a string or had string as its underlying type. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2019-01-28 16:06:49 +01:00
Mark Ryan	4beea5133e	Fix staticcheck (ST1005) errors staticcheck was complaining as some of the error messages returned by govmm began with a capital letter. This commit fixes the issue. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2019-01-28 15:32:07 +01:00
Sebastien Boeuf	737f03de59	Merge pull request #76 from teawater/nvdimm qmp: Add nvdimm support	2018-12-06 19:43:30 +00:00
Hui Zhu	97fc3435cf	contributors: add my name Signed-off-by: Hui Zhu <teawater@hyper.sh>	2018-12-06 11:35:15 +08:00
Hui Zhu	c891f5f84b	qmp: Add nvdimm support ExecuteNVDIMMDeviceAdd can add an nvdimm disk to qemu. An NVDIMM device delete function is not implemented because qemu doesn't support it. Signed-off-by: Hui Zhu <teawater@hyper.sh>	2018-12-06 11:35:07 +08:00
Mark Ryan	32f64a0630	Merge pull request #81 from sboeuf/fix_qmp_disable_modern qemu: Allow disable-modern option from QMP	2018-12-05 21:12:01 +01:00
Sebastien Boeuf	f9b31c0f80	qemu: Allow disable-modern option from QMP For devices that actually support the option disable-modern, this current commit provides a proper flag to the caller. This will allow for better support when used in nested environment as virtio-pci devices should rely on virtio 0.9 instead of 1.0 due to a bug in KVM. Fixes #80 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2018-12-05 09:16:41 -08:00
Mark Ryan	908b6aab14	Merge pull request #69 from BetaXOi/output-qmp-err qmp: Output error detail when execute QMP command failed	2018-12-04 09:20:41 +01:00
Mark Ryan	d31bc8d300	Merge pull request #79 from markdryan/s390x-tests Run tests for the s390x build	2018-12-03 16:37:12 +01:00
Mark Ryan	d6173077f1	Run tests for the s390x build It turns out it is possible to run the unit tests for the s390x build on travis by renaming the s390x specific files, so that their inclusion in the build is determined only by tags and not by filename, and by introducing a new tag s390x_test that we can use to force their inclusion into a build by using this tag. The .travis file is then updated to include the line go test --tags s390x_test ./... This creates a build on travis that includes the s390x specific files and runs the unit tests. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2018-12-03 15:56:20 +01:00
Mark Ryan	09923e8ed7	Merge pull request #78 from clarecch/master Contributors: Add Clare Chen to CONTRIBUTORS.md	2018-12-03 12:34:54 +01:00
Clare Chen	b36b5a8f67	Contributors: Add Clare Chen to CONTRIBUTORS.md Signed-off-by: Clare Chen <clare.chenhui@huawei.com>	2018-12-03 06:22:11 -05:00
Mark Ryan	900f3a1f18	Merge pull request #74 from markdryan/s390-travis Verify govmm builds on s390x	2018-12-03 10:16:09 +01:00
Mark Ryan	2fbc7e5ed2	Merge pull request #77 from caoruidong/contri Contributors: Add my name	2018-12-03 09:21:01 +01:00
Ruidong Cao	b41939c6b4	Contributors: Add my name Signed-off-by: Ruidong Cao <caoruidong@huawei.com>	2018-12-03 20:48:35 +08:00
NingBo	dab4cf1d70	qmp: Add tests Test execute QMP command with error response. Signed-off-by: NingBo <ning.bo9@zte.com.cn>	2018-12-03 14:40:26 +08:00
Mark Ryan	5ea6da1448	Verify govmm builds on s390x This commit adds a single command to the travis script that checks that the s390x build works. We can't run the unit tests but at least we can check that everything builds on this architecture. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2018-11-30 15:20:26 +01:00
Mark Ryan	dddf0f08ea	Merge pull request #68 from alicefr/s390x qemu: Add s390x support	2018-11-30 11:12:17 +01:00
Alice Frosi	ee75813ad1	contributors: add my name Signed-off-by: Alice Frosi <afrosi@de.ibm.com>	2018-11-30 10:14:45 +01:00
Alice Frosi	c80fc3b12f	qemu: Add s390x support The PR adds the s390x support. It sets the CCW devices and sets to false all the devices in the mapping isVirtioPCI. It reimplements the functions QemuNetdevParam and QemuDeviceParam to print an error message if the vhost-user devices are used. It introduces a new function ExecuteNetCCWDeviceAdd for qmp for the CCW devices. Fixes: #37 Co-authored-by: Yash D Jain <ydjainopensource@gmail.com> Signed-off-by: Alice Frosi <afrosi@de.ibm.com>	2018-11-30 10:13:28 +01:00
Mark Ryan	c5440a8819	Merge pull request #73 from markdryan/contributing Update file headers , CONTRIBUTING.md and add CONTRIBUTORS.md	2018-11-30 10:04:14 +01:00
Mark Ryan	ca477a18b6	Update source file headers This commit updates the headers in the Go source files to adhere to the new guidelines in the CONTRIBUTING.md file. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2018-11-30 09:34:21 +01:00
Mark Ryan	e68e005697	Update the CONTRIBUTING.md The CONTRIBUTING.md file is updated to provide a template for new source files and to invite contributors to add themselves to the CONTRIBUTORS.md file. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2018-11-30 09:34:21 +01:00
Mark Ryan	2b7db5473f	Add the CONTRIBUTORS.md file This file is a partial list of contributors to the Virtual Machine Manager for Go project. To see the full list of contributors, see the revision history in source control. Contributors who wish to be recognized in this file should add themselves (or their employer, as appropriate). Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2018-11-30 09:34:20 +01:00
Mark Ryan	18948af4d4	Merge pull request #67 from BetaXOi/fix-mempath qmp: fix mem-path properties for hotplug memory.	2018-11-30 08:49:21 +01:00
Rob Bradford	7efe742ea8	Merge pull request #71 from alicefr/vsock_cid qemu: change Context ID for Vsock to uint64	2018-11-29 16:40:38 +00:00
Alice Frosi	b3b765cbe6	qemu: test Valid for Vsock for Context ID Add test for the validation when the Context ID is larger than 32 bits Signed-off-by: Alice Frosi <afrosi@de.ibm.com>	2018-11-29 12:29:46 +00:00
Alice Frosi	3becff5f4e	qemu: change of ContextID from uint32 to uint64 The correct type used by qemu and in kernel is uint64 and this leads to an endianness problem with ioctl system call. See the issue https://github.com/kata-containers/runtime/issues/947 Fixes: #70 Signed-off-by: Alice Frosi <afrosi@de.ibm.com>	2018-11-29 12:29:29 +00:00
NingBo	f30fd1354a	qmp: Output error detail when execute QMP command failed Currently only a 'QMP command failed' error message is returned when a QMP command executed by 'executeCommandWithResponse' fails. This patch outputs more error detail. Signed-off-by: NingBo <ning.bo9@zte.com.cn>	2018-11-29 16:32:14 +08:00
NingBo	7da6a4c7c6	qmp: fix mem-path properties for hotplug memory. The QMP command 'object-add' only has three arguments: 'qom-type' 'id' and 'props', thus 'mem-path' has to be saved in 'props'. https://github.com/qemu/qemu/blob/stable-2.0/qapi-schema.json#L2958 https://github.com/qemu/qemu/blob/stable-2.12/qapi/misc.json#L1846 Signed-off-by: NingBo <ning.bo9@zte.com.cn>	2018-11-29 09:56:26 +08:00
Sebastien Boeuf	60a5f7ca7f	Merge pull request #64 from alicefr/preparation qemu/qmp: preparation for s390x support	2018-11-27 19:23:58 +00:00
Sebastien Boeuf	c664d3dd94	Merge pull request #60 from teawater/cache qemu/qmp: add new function ExecuteBlockdevAddWithCache	2018-11-27 07:46:36 +00:00
Alice Frosi	e4892e3396	qemu/qmp: preparation for s390x support This PR prepares for the s390x support. It introduces: - a generalization of ccw and pci devices. The variables for the pci devices have been renamed by removing the Pci suffix. They have been moved to qemu_arch_base.go - the mapping isVirtioPCI has been moved to qemu_arch_base.go so that a different mapping can be added for other architectures (e.g. s390x) - the functions QemuNetdevParam and QemuDeviceParam have been moved to qemu_arch_base.go. In this way, they can be reimplemented for other architectures for the VHOSTUSER case - a function disableModern has been introduced to check if the device is a pci device and then return the right parameters. ccw devices don't have the disable-modern flag - a function mqParameter has been introduced to return the right parameters for the mq case. The virtio-net-ccw device doesn't have the vectors flag - qemu_arch_base_test.go contains the tests and strings that can be overwritten for other architectures (e.g. s390). The device names and the flags for the devices can be overwritten. - the string for the romfile has been replaced by a variable romfile that can be left empty if the device doesn't support a romfile, as is the case for the ccw devices for s390. - clean-up: the disable-modern=on/off options have been changed to disable-modern=true/false. In the code there was a mixture of on/true off/false Fixes: #61 Co-authored-by: Yash D Jain <ydjainopensource@gmail.com> Signed-off-by: Alice Frosi <afrosi@de.ibm.com>	2018-11-23 10:15:09 +00:00
Hui Zhu	110d2fa049	qemu/qmp: add new function ExecuteBlockdevAddWithCache ExecuteBlockdevAddWithCache has two more parameters direct and noFlush than ExecuteBlockdevAdd. They are cache-related options for block devices that are described in https://github.com/qemu/qemu/blob/master/qapi/block-core.json. direct denotes whether use of O_DIRECT (bypass the host page cache) is enabled. noFlush denotes whether flush requests for the device are ignored. Signed-off-by: Hui Zhu <teawater@hyper.sh>	2018-11-23 17:23:06 +08:00
Hui Zhu	a0b0c86e9c	qmp_test: Change QMP version from 2.6 to 2.9 Also change TestQMPXBlockdevDel to TestQMPBlockdevDel because QMP version 2.9 and older use blockdev-del but not x-blockdev-del. Signed-off-by: Hui Zhu <teawater@hyper.sh>	2018-11-23 09:33:21 +08:00
Mark Ryan	99e0358ba9	Merge pull request #63 from jingxiaolu/add_pidfile qemu: add support for pidfile option	2018-11-22 08:57:25 +01:00
l00397676	10c36a13da	qemu: add support for pidfile option Add input for -pidfile option of qemu, so that we can get pid of qemu main process, and apply resource limitations to it. Fixes #62 Signed-off-by: l00397676 <lujingxiao@huawei.com>	2018-11-21 19:51:49 +08:00
Sebastien Boeuf	e82e8498c5	Merge pull request #59 from sboeuf/fix_virtio-net-pci qemu: Fix virtio-net-pci QMP command	2018-10-16 14:14:23 -07:00
Sebastien Boeuf	9c819db5a3	qemu: Fix virtio-net-pci QMP command This patch fixes the wrong behavior of specifying a netdev, MAC address or PCI address entry when those were empty. Instead, it does not provide those entries if the content is empty. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2018-10-16 13:20:55 -07:00
Manohar Castelino	b1635d5dcb	Merge pull request #56 from sboeuf/fix_romfile qemu: Add support for romfile option	2018-10-12 10:26:31 -07:00
Sebastien Boeuf	7fdfc6a4c9	qemu: Add support for romfile option Any device inheriting from virtio-pci can specify a ROM file. This option is provisioned by default with "efi-virtio.rom", but most of the time, firmwares such as OVMF or seabios will already support what is provided by this ROM file. In order to reduce the "forced" dependency on such ROM file, govmm should provide an empty path if the consumer of the library does not provide one. This patch reorganizes the list of devices, so that it gets easier to list which devices inherit from virtio-pci, and then adds the romfile option to every single device that supports this option. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2018-10-10 17:17:36 -07:00
Rob Bradford	35b7308881	Merge pull request #57 from markdryan/contributing-security Update guidelines on security issue reporting	2018-10-10 15:09:21 +01:00
Mark Ryan	e74de3c7f1	Update guidelines on security issue reporting This commit clarifies the process to be used when reporting security issues. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2018-10-10 10:05:58 +02:00
Sebastien Boeuf	5770f40f4b	Merge pull request #55 from jcvenegas/virtio-balloon qemu: Add virtio-balloon device support.	2018-10-05 10:53:46 -07:00
Jose Carlos Venegas Munoz	ec83abe69e	qemu: Add virtio-balloon device support. Add support for virtio-balloon. - Add test - Support disable-modern - Support deflate-on-oom Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>	2018-10-05 11:18:31 -05:00
Rob Bradford	53c0c33bb2	Merge pull request #54 from jodh-intel/show-qemu-path-on-launch qemu: Show full path to qemu binary at launch time	2018-10-03 16:48:47 +01:00
James O. D. Hunt	46970781fa	qemu: Show full path to qemu binary at launch time Rather than show the generic "qemu", log the full path to the particular qemu binary being used. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2018-10-03 16:41:08 +01:00
Sebastien Boeuf	f03df80fc3	Merge pull request #53 from sboeuf/fix_pcie_bridge qemu: Fix the support of PCIe bridge	2018-10-02 15:47:02 -07:00
Sebastien Boeuf	ef7250508c	qemu: Fix the support of PCIe bridge In case the type of bridge is PCIEBridge, which we expect as ending up using pcie-pci-bridge device from Qemu, the properties chassis_nr and shpc don't exist. This commit simply fixes this use case by removing those parameters from the command line. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2018-10-02 15:35:49 -07:00
Mark Ryan	6ba3b3fad1	Merge pull request #51 from bergwolf/ExecuteQueryMigration qmp: add ExecuteQueryMigration	2018-09-28 15:18:36 +02:00
Peng Tao	56f645eac6	qmp: add ExecuteQueryMigration It sends query-migrate qmp command to check migration status. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-09-28 21:10:21 +08:00
Mark Ryan	c2d92fe208	Merge pull request #48 from bergwolf/memsize qemu: skip setting system memory if it is set via dimm device	2018-09-26 08:53:24 +02:00
Peng Tao	a429677a0b	govmm: fix memory prealloc The memory-backend-ram should also be set to a numa node instead of being inserted as a new device. Otherwise it becomes additional memory and requires explicit online to be available, instead of just being a backend of the memory specified by -m option. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-09-25 15:01:31 +08:00
Sebastien Boeuf	f3e45a09b7	Merge pull request #52 from WeiZhang555/qmp-query-cpus qmp: add "query-cpus" support	2018-09-24 22:10:31 -07:00
Wei Zhang	1130aab85e	qmp: add "query-cpus" support Add "query-cpus" and "query-cpus-fast" to query CPU information from qemu Signed-off-by: Wei Zhang <zhangwei555@huawei.com>	2018-09-21 10:14:25 +08:00
Mark Ryan	9905ae92c5	Merge pull request #47 from xindazhao/gpu-vfio-mdev qemu/qmp: add vfio mediated device support on root bus	2018-09-18 10:00:32 +02:00
Zhao Xinda	de5d278889	qemu/qmp: add vfio mediated device support on root bus In addition to supporting hotplug for VFIO mediated device on PCI bridge, this patch adds hotplug functionality on root bus. When parameter bus and addr are set to be empty, the system will pick up an empty slot on root bus. Signed-off-by: Zhao Xinda <xinda.zhao@intel.com>	2018-09-18 15:54:53 +08:00
Mark Ryan	66bfe83589	Merge pull request #50 from markdryan/fix-perms qemu/image: Reduce permissions of .iso creation dir	2018-09-13 11:59:19 +01:00
Mark Ryan	de00d7a681	qemu/image: Reduce permissions of .iso creation dir The contents of .iso used to bootstrap VMs with cloudinit are initialised using a precreated, short-lived directory. The permissions on this directory were too lenient. This commit restricts access to this directory to the user and his/her group. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2018-09-13 09:18:47 +02:00
Mark Ryan	032705ba6a	Merge pull request #49 from caoruidong/undefault-vhost qemu/qmp: nic can work without vhost	2018-09-11 11:36:32 +01:00
Ruidong Cao	1a1fee75e5	qemu/qmp: nic can work without vhost If host doesn't support vhost_net, we won't pass vhost="on" in QMP. Signed-off-by: Ruidong Cao <caoruidong@huawei.com>	2018-09-11 11:45:31 +08:00
Rob Bradford	e2c716433e	Merge pull request #45 from jcvenegas/rng-knob qemu: Add rng device.	2018-09-10 17:04:38 +01:00
Jose Carlos Venegas Munoz	6c3d84ea8c	qemu: Add virtio RNG device. Add support for virtio-rng device. Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>	2018-09-07 15:11:02 -05:00
Rob Bradford	25277d52ad	Merge pull request #44 from clarecch/master qemu/qmp: support query-memory-devices qmp command.	2018-08-29 14:07:13 +01:00
Clare Chen	b16291cfab	qemu/qmp: support query-memory-devices qmp command. Implement query qemu memory devices function and testcase. Signed-off-by: Clare Chen <clare.chenhui@huawei.com>	2018-08-28 23:19:52 -04:00
Julio Montes	1a16b5f98f	Merge pull request #42 from woshijpf/fix-qemu-2.8 govmm: modify govmm to be compatible with qemu 2.8	2018-08-24 11:52:20 -05:00
flyflypeng	ce070d11f7	govmm: modify govmm to be compatible with qemu 2.8 The ExecuteBlockdevAdd() and ExecuteBlockdevDel() functions in govmm are not compatible with qemu 2.8, because the usage of blockdev-add and x-blockdev-del differs between qemu 2.7 and qemu 2.8. Follow the qemu 2.7 and qemu 2.8 qmp-commands.txt documents and modify ExecuteBlockdevAdd() and ExecuteBlockdevDel() to be compatible with qemu 2.8. Signed-off-by: flyflypeng <jiangpengfei9@huawei.com>	2018-08-24 22:56:27 +08:00
Julio Montes	cb112dba2c	Merge pull request #41 from caoruidong/support-mq qemu/qmp: support hotplug a nic whose qdisc is mq	2018-08-23 12:01:44 -05:00
Ruidong Cao	0286ff9e6e	qemu/qmp: support hotplug a nic whose qdisc is mq If we hotplug a nic with args mq=on, its qdisc will be mq by default. This aligns with cold plug nics. Signed-off-by: Ruidong Cao <caoruidong@huawei.com>	2018-08-23 20:42:59 +08:00
Sebastien Boeuf	6aa35d33f2	Merge pull request #40 from rbradford/qmp-caps-comment qmp: Remind users that you must first call ExecuteQMPCapabilities()	2018-08-22 10:14:33 -07:00
Rob Bradford	8515ae4817	qmp: Remind users that you must first call ExecuteQMPCapabilities() Before calling any other command it is necessary to call ExecuteQMPCapabilities() otherwise QEMU will not process the subsequent QMP commands. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2018-08-22 17:07:21 +01:00
Rob Bradford	5a5e5b720f	Merge pull request #39 from sboeuf/vhost_hp qemu/qmp: Add netdev_add with chardev support	2018-08-22 09:21:24 +01:00
Sebastien Boeuf	21504d31ff	qemu/qmp: Add netdev_add with chardev support In order to be able to hotplug network devices such as vhost user net, we need to be able to define a previously declared chardev as a parameter of this new network device. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2018-08-21 15:59:43 -07:00
Julio Montes	cfdbc15148	Merge pull request #38 from markdryan/negative Add some negative test cases	2018-08-20 10:50:45 -05:00
Mark Ryan	ed34f61664	Add some negative test cases for qmp.go This commit adds a couple of negative test cases for qmp.go, one which checks that failed commands return errors and the other checks that QMPStart exits gracefully when passed an invalid socket path. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2018-08-20 15:40:37 +01:00
Mark Ryan	17cacc7238	Add negative test cases for qemu.go This commit adds some negative test cases for the append functions in qemu.go that build up the qemu command line. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2018-08-20 15:40:37 +01:00
Rob Bradford	d8f80cafe3	Merge pull request #36 from rbradford/use-context-for-launch qemu: Use the supplied context.Context for launching	2018-08-14 18:11:35 +01:00
Rob Bradford	2706a07be5	qemu: Use the supplied context.Context for launching This will kill the process when the context is cancelled. As using a nil context is not permitted it is necessary to substitute with a real context if it is not initialised in the Config struct. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2018-08-14 15:09:49 +01:00
Sebastien Boeuf	24ee4be532	Merge pull request #32 from amshinde/add-share-rw disk: Add --share-rw option for hotplugging disks	2018-08-13 14:44:28 -07:00
Mark Ryan	c202f5d0ba	Merge pull request #30 from xindazhao/gpu-vfio-mdev qemu/qmp: add vfio mediated device support	2018-08-13 22:07:21 +01:00
Mark Ryan	f3ab90f21b	Merge pull request #35 from rbradford/rtc-valid-tweak qemu: Do not try and generate invalid RTC parameters	2018-08-10 15:11:15 +01:00
Rob Bradford	e46092e03a	qemu: Do not try and generate invalid RTC parameters If no RTC is specified in the config then do not generate any RTC command line options. RTC command line options are optional for QEMU so make Valid() return false when presented with the empty version of the RTC struct containing empty strings. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2018-08-10 14:54:22 +01:00
Zhao Xinda	fcaf61dcb1	qemu/qmp: add vfio mediated device support In addition to normal VFIO device, this patch adds VFIO mediated device as a supplement to do hot plug on PCI(E) bridges. Signed-off-by: Zhao Xinda <xinda.zhao@intel.com>	2018-08-10 12:43:22 +08:00
Archana Shinde	4461c459a3	disk: Add --share-rw option for hotplugging disks With qemu 2.10, a write lock was added for qcow images that prevents the same image from being passed more than once. This can be overridden using the --share-rw option, which is desired for raw images. This solves an issue with running Kata with devicemapper using the privileged mode, as in this case all devices on the host are passed to the container using the block device associated with the rootfs, causing it to be passed twice to qemu. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-08-08 14:48:02 -07:00
Sebastien Boeuf	301ea5e989	Merge pull request #34 from devimc/topic/addrBusVsock qemu/qmp: add addr and bus to hotplug vsock devices	2018-08-08 08:44:09 -07:00
Julio Montes	685199980d	qemu/qmp: add addr and bus to hotplug vsock devices For machine types based on PCIe like q35, device addr and bus must be specified. For machine types based on PCI like pc, device addr must be specified and bus is optional since devices can be hot plugged directly on the root bus. Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-08-08 09:53:09 -05:00
Sebastien Boeuf	eda239928b	Merge pull request #33 from caoruidong/hotplug-by-fds qemu/qmp: add function for hotplug network by fds	2018-08-08 07:49:07 -07:00
Ruidong Cao	10efa84132	qemu/qmp: add function for hotplug network by fds Implement function to hotplug a network device to QEMU by fds. Macvtap can only be hotplugged this way. Signed-off-by: Ruidong Cao <caoruidong@huawei.com>	2018-08-08 11:12:47 +08:00
Mark Ryan	8d626afb0c	Merge pull request #31 from devimc/topic/virtserialportHotplug qemu/qmp: implement functions to hotplug chardevs and serial ports	2018-08-06 18:44:34 +01:00
Julio Montes	80ed88edb1	qemu/qmp: implement function to hotplug serial ports Implement function to hotplug virtio serial ports. The serial ports are visible in the guest at the directory /dev/virtio-ports. Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-08-03 13:50:25 -05:00
Julio Montes	ca46f21f3f	qemu/qmp: implement function to hotplug character devices Implement function to hotplug character devices using unix sockets as backend. Binding a character device with a serial port allows communication between processes running in the guest and processes running in the host. Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-08-03 08:01:05 -05:00
Sebastien Boeuf	1c5466db3d	Merge pull request #23 from devimc/topic/vsockHotplug qemu: add vhostfd and disable-modern to vsock hotplug	2018-08-03 01:19:55 -07:00
Sebastien Boeuf	a5cbc6122f	Merge pull request #19 from markdryan/static-checks Add two additional static analysis tools to the travis builds	2018-08-03 01:19:22 -07:00
Julio Montes	03f1a1c3a8	qemu/qmp: implement getfd `getfd` receives a file descriptor via SCM rights and assigns it a name. This command is useful to send file descriptors from the host, and then hot plug devices that need file descriptors like vhost-vsock-pci devices. Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-08-02 11:07:16 -05:00
Julio Montes	84b212f1b8	qemu: add vhostfd and disable-modern to vsock hotplug `vhostfd` is used to specify the vhost-vsock device fd, and it holds the context ID previously opened. `disable-modern` is to disable the use of "modern" devices, by using virtio 0.9 instead of virtio 1.0. Particularly, this is useful when running the VM in a nested environment. Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-08-02 11:07:16 -05:00
Sebastien Boeuf	131c8d0caa	Merge branch 'master' into static-checks	2018-08-02 08:27:54 -07:00
Sebastien Boeuf	79e74d936b	Merge pull request #24 from caoruidong/master qemu/qmp: implement function for hotplug network	2018-07-25 09:44:51 -07:00
Ruidong Cao	12dfa87293	qemu/qmp: implement function for hotplug network Implement function to hotplug and delete a network device to QEMU Signed-off-by: Ruidong Cao <caoruidong@huawei.com>	2018-07-25 17:39:23 +08:00
Sebastien Boeuf	6ff20ae2f4	Merge pull request #25 from devimc/topic/improveVSockColdplug qemu: add vhostfd and disable-modern to vhost-vsock-pci	2018-07-24 16:20:31 -07:00
Julio Montes	3830b4419f	qemu: add vhostfd and disable-modern to vhost-vsock-pci `vhostfd` is the vhost file descriptor that holds the socket context ID `disable-modern` prevents qemu from relying on fast MMIO Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-07-24 15:24:43 -05:00
Mark Ryan	db7e149611	Merge pull request #22 from devimc/topic/vsockHotplug qemu/qmp: implement function to hotplug vsock-pci	2018-07-17 09:07:44 +01:00
Julio Montes	f700a97bee	qemu/qmp: implement function to hotplug vsock-pci Implement function to hotplug vsocks. Vsocks are needed for communication between processes running inside the VM and processes running on the host. Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-07-16 14:19:43 -05:00
Mark Ryan	4ca232ecdf	qmp_test: Fix Warning and Error level logs This commit fixes an issue with the log handlers defined by qmp_test. The issue was picked up by the latest version of go vet on go tip. qemu/qmp_test.go:56::error: missing ... in args forwarded to printf-like function (vet) qemu/qmp_test.go:60::error: missing ... in args forwarded to printf-like function (vet) Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2018-06-28 16:05:00 +01:00
Mark Ryan	430e72c63b	qemu,qmp: Enable gas security checker This commit enables the gas security checker on govmm builds. The security checker has signalled 4 issues all of which I've checked and have determined to be non issues. These issues are disabled by this commit. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2018-06-28 15:56:27 +01:00
Mark Ryan	ffc06e6bc4	qemu,qmp: Add staticcheck to travis and fix errors This commit enables staticcheck in the travis builds and fixes the existing errors detected by staticcheck. There was one type of error repeated in qemu.go in which the type of some constants was not explicitly specified. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2018-06-28 15:11:13 +01:00
Sebastien Boeuf	ff2401825e	Merge pull request #18 from bergwolf/templating Add APIs to enable vm templating	2018-06-25 07:58:45 -07:00
Peng Tao	54caf7810b	qmp: add hotplug memory It adds memory of the given size, in MiB, to the guest. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-06-23 11:39:28 +08:00
Peng Tao	e66a9b481b	qemu: add appendMemoryKnobs helper To fix travis failure about cyclomatic complexity in appendKnobs(). Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-06-23 11:39:28 +08:00
Peng Tao	8aeca15388	qmp: add migrate set arguments It allows to set migration arguments so that callers can control how migration is done. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-06-23 11:39:28 +08:00
Peng Tao	a03d4968e1	qmp: add set migration capabilities It allows to set guest migration capabilities. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-06-23 11:39:28 +08:00
Peng Tao	0ace4176b4	qemu: allow to set migration incoming It is useful when we want to specify migration incoming source. Supported source are fd and exec right now. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-06-23 11:39:23 +08:00
Peng Tao	723bc5f3c6	qemu: allow to create a stopped guest When Knobs.Stopped is set, the guest CPU will not be started at startup. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-06-19 14:56:34 +08:00
Peng Tao	283d7df99e	qemu: add file backed memory device support It allows a caller to use a local file as the memory backend of the guest, and it also allows the file backed memory device to be set shared or not. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-06-19 14:55:47 +08:00
Sebastien Boeuf	9cf8ce6c6d	Merge pull request #15 from amshinde/pass-addr-bridge qemu: Add qemu parameter for PCI address for a bridge.	2018-04-03 12:21:35 -07:00
Archana Shinde	30aeacb89e	qemu: Add qemu parameter for PCI address for a bridge. We need to be able to specify the PCI slot for a bridge while adding it. Add test to verify bridge is correctly added. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-04-03 12:10:02 -07:00
Sebastien Boeuf	1509acf186	Merge pull request #14 from amshinde/scsi-iothreads Add ability to associate a SCSI controller device with an iothread	2018-03-29 10:35:47 -07:00
Archana Shinde	9130f37516	scsi: Allow scsi controller to associate with an IO thread. This enables data-plane for scsi. All drives attached to the scsi controller will have their IO processed in a single separate IO thread instead of qemu's main event loop. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-03-28 17:02:47 -07:00
Archana Shinde	a54de1835b	iothread: Add ability to configure iothreads IOthreads also known as x-data-plane allow IO to be processed in a separate thread rather than the main event loop. This produces much better IO throughput and latency. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-03-28 17:02:47 -07:00
Mark Ryan	82c67ab9b2	Merge pull request #12 from bergwolf/initrd qemu: add initrd support	2018-03-20 11:02:07 +00:00
Peng Tao	0c0ec8f3c9	qemu: add initrd support Append initrd image to qemu arguments if configured. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-03-20 16:42:39 +08:00
Mark Ryan	e87160f8ea	Merge pull request #11 from devimc/scsi/disable_modern qemu: add DisableModern to SCSIController	2018-03-06 18:39:29 +00:00
Julio Montes	68f3071806	qemu: add DisableModern to SCSIController DisableModern prevents qemu from relying on fast MMIO. Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-03-06 12:11:02 -06:00
Mark Ryan	d60256118f	Merge pull request #9 from devimc/qemu/extraOptions qemu: add extra options for the machine type	2018-02-12 15:33:44 +00:00
Julio Montes	693d9548dc	qemu: add options for the machine type Certain machine types need options to enable or disable features. For example, the machine type virt on certain hosts must have the gic version (gic-version=3 or gic-version=host) to start without problems. Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-02-12 09:27:30 -06:00
Mark Ryan	065d1d2517	Merge pull request #7 from amshinde/scsi-device-add scsi: Add function to send device_add qmp command for a scsi device	2018-01-12 11:09:25 +00:00
Archana Shinde	3273aafd53	scsi: Add function to send device_add qmp command for a scsi device device_add qmp command for scsi devices accepts additional parameters like scsi-id and lun. Implement function to add scsi devices. Devices with drivers "scsi-hd", "scsi-cd" and "scsi-disk" are accepted. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-01-11 18:19:10 -08:00
Julio Montes	22c99930c2	Merge pull request #8 from markdryan/coveralls Compute coverage statistics for unit tests in Travis builds	2018-01-04 14:00:30 -06:00
Mark Ryan	6d198b8a13	Compute coverage statistics for unit tests in Travis builds This commit enables unit test coverage computation in Travis CI builds. Going forward, builds that decrease the unit test coverage by more than 1.0% will fail. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2018-01-02 16:21:01 +00:00
Julio Montes	0ecfba63e5	Merge pull request #5 from amshinde/add-scsi-controller-device scsi: Add a scsi controller device	2017-12-21 18:33:03 -06:00
Archana Shinde	3a31da32af	scsi: Add a scsi controller device SCSI controller allows scsi disks to be attached on the SCSI bus created by the controller. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2017-12-21 16:11:17 -08:00
Julio Montes	9250e77eda	Merge pull request #6 from sameo/topic/vsock qemu: Add VSOCK support	2017-12-20 08:21:44 -06:00
Samuel Ortiz	5316779d35	qemu: Add VSOCK support VSOCK sockets are added through a vhost PCI device. It takes a device ID and a context ID, the latter being the endpoint value to be reached from the host. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2017-12-19 23:40:39 +01:00
Manohar Castelino	064ffdb2b2	Merge pull request #4 from egernst/vhost-user-add-blk Vhost-user: add block device support	2017-12-15 13:45:12 -08:00
Manohar Castelino	1bbe457172	Merge pull request #3 from devimc/hotplug/CPU qemu: Add maxcpus attribute to -smp	2017-12-15 13:44:54 -08:00
Eric Ernst	f565536673	vhost-user: add blk device support Introduce basic vhost-user-blk-pci support. In adding this, cleaned up the QemuParams function to use a more appropriate switch statement. Similarly, cleaned up the Valid() logic. We still need to look into parameterization of the block parameter fields as well as introducing multiqueue support for the vhost-user devices. Signed-off-by: Eric Ernst <eric.ernst@intel.com>	2017-12-13 07:19:28 -08:00
Eric Ernst	e9e27673fa	vhost-user: updating comments for accuracy, rename device field Some comments were network specific for vhost-user devices, which is incorrect. Fixed these. Renamed the HWAddress field to be Address, so that it could potentially be used more generically for non-network based vhost-user types. Signed-off-by: Eric Ernst <eric.ernst@intel.com>	2017-12-13 07:19:28 -08:00
Julio Montes	8fe572367a	qemu: Add maxcpus attribute to -smp maxcpus is used to specify how many cpus a VM can have. This attribute must be specified to enable the CPU hotplugging capability, otherwise the maximum number of CPUs will be defined by the number of CPUs in -smp. Signed-off-by: Julio Montes <julio.montes@intel.com>	2017-12-12 10:14:13 -06:00
Mark Ryan	425b3629c7	Merge pull request #2 from markdryan/badges Add badges to the README.md file	2017-12-12 14:50:14 +00:00
Mark Ryan	3baa776515	Add badges to the README.md file This commit adds three badges to the README.md file - Goreportcard - Godoc - Travis Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2017-12-12 12:06:13 +00:00
Mark Ryan	eacde4d37d	Merge pull request #1 from markdryan/add-travis Enable Travis builds	2017-12-12 12:05:08 +00:00
Mark Ryan	d74e3b6633	Fix errcheck failures in the unit tests There were some unchecked errors in some of the unit files relating to the closure and removal of temporary files. As the closure and removal of these files is not really important to whether the test passes or fails we ignore the errors. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2017-12-12 11:51:17 +00:00
Mark Ryan	db60e32f30	Enable Travis builds This commit adds a .travis file which enables Travis builds for govmm. The script builds the source and runs the unit tests and gometalinter enabling - misspell - vet - ineffassign - gofmt - gocyclo 15 - golint - errcheck - deadcode Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2017-12-12 11:46:40 +00:00
Mark Ryan	9cb47fc07d	Add .gitignore file. Currently it just ignores emacs backup files. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2017-12-11 11:05:31 +00:00
Mark Ryan	a8aaf534b6	Add project documentation This commit adds three documents: - CONTRIBUTING.md (a file describing how to contribute to the project) - COPYING (the Apache 2.0 license) - README.md (a brief description of the project) Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2017-12-11 11:05:31 +00:00
Mark Ryan	57aafb5638	Remove all references to and dependencies on ciao This commit removes all the references to the ciao project. It also removes some of the dependencies that the unit tests were pulling in. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2017-12-11 11:05:31 +00:00
Mark Ryan	27709fce43	Move files to the qemu folder This commit moves all of the source files to the qemu folder. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2017-12-11 11:05:31 +00:00
Mark Ryan	367ac50fe8	Merge pull request #1624 from egernst/vhostuser-enabling qemu: introduce vhost-user handling	2017-12-08 17:28:37 +00:00
Eric Ernst	48feb29fe5	qemu: introduce vhost-user handling Add ability to add a vhostuser device to the QEMU commandline. We expect two different types of devices to be connected through a vhostuser socket: SCSI and network. Signed-off-by: Eric Ernst <eric.ernst@intel.com>	2017-12-08 09:03:34 -08:00
Julio Montes	b8ddd24400	qemu: Add function to list hotpluggable CPUs ExecuteQueryHotpluggableCPUs returns the list of hotpluggable CPUs Signed-off-by: Julio Montes <julio.montes@intel.com>	2017-12-08 10:01:06 -06:00
Julio Montes	8c428ed722	qemu: Add function to hotplug CPUs ExecuteCPUDeviceAdd hot-adds a CPU to a running VM Signed-off-by: Julio Montes <julio.montes@intel.com>	2017-12-07 14:16:00 -06:00
Julio Montes	24b14059b3	qemu: Add functions to process QMP response Some QMP commands like ```query-hotpluggable-cpus``` return a response that needs to be processed and returned to the client as a struct. This patch adds the function ```executeCommandWithResponse``` that returns the response of a QMP command. Signed-off-by: Julio Montes <julio.montes@intel.com>	2017-12-07 12:33:12 -06:00
Julio Montes	e39da6ca47	qmp: Add support for hot plugging VFIO devices on PCI(E) bridges This patch adds a new function to hot plug VFIO devices on PCI(E) bridges. This change allows hot plugging N VFIO devices in Qemu PC and Q35. Signed-off-by: Julio Montes <julio.montes@intel.com>	2017-11-29 10:48:53 -06:00
Mark Ryan	bc030d13d1	qemu: Add a SysProcAttr parameter to CreateCloudInitISO This change adds an additional parameter to CreateCloudInitISO that allows users more control over the newly created xorriso process. They can for instance specify the user under which the new qemu process should run and which capabilities should be retained in the child xorriso process. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2017-11-20 17:27:02 +00:00
Mark Ryan	11977072ea	qemu: Add a SysProcAttr parameter to LaunchCustomQemu This change adds an additional parameter to LaunchCustomQemu that allows users more control over the newly created process. They can for instance specify the user under which the new qemu process should run and which capabilities should be retained in the child qemu process. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2017-11-20 17:27:02 +00:00
Archana Shinde	b639da45ed	qemu: Add function to hotplug vfio device Add ability to hotplug a pci device bound to vfio-pci driver. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2017-11-09 18:04:33 -08:00
Manohar Castelino	7e5614b8a7	Networking: Add vhost fd support Add vhost fd support. This is needed in the case of multi queue. Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>	2017-11-02 13:02:33 -07:00
Julio Montes	14316ce0b1	qemu/qmp: Implement function to hot plug PCI devices ExecutePCIDeviceAdd is a function that can be used to hot plug devices directly on pci(e).0 or pci(e) bridges. ExecutePCIDeviceAdd is PCI specific because unlike ExecuteDeviceAdd, it includes an extra parameter to specify the device address on its parent bus. Signed-off-by: Julio Montes <julio.montes@intel.com>	2017-10-24 09:01:12 -05:00
Julio Montes	83485dc9a4	qemu: Implement Bridge struct Bridge struct represent pci bridges(pci-bridge) or pcie bridges(pcie-pci-bridges), bridges can be used to hot plug devices in pc and q35 machines Signed-off-by: Julio Montes <julio.montes@intel.com>	2017-10-24 08:31:37 -05:00
Manohar Castelino	cfa8a995de	Networking: Add support for handling macvtap interfaces Add support for macvtap interfaces. This also brings in support for generic multiqueue support in virt containers. Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>	2017-10-12 09:59:03 -07:00
Julio Montes	83126d3e05	bios: add support for custom bios Add Bios field into qemu Config struct, this allows to start VM with custom bios Partially fixes https://github.com/clearcontainers/runtime/issues/686 Signed-off-by: Julio Montes <julio.montes@intel.com>	2017-10-06 14:28:12 -05:00
Manohar Castelino	3da2ef9dea	QEMU: Knobs: Huge Page Support: Add support for huge pages Add support to launch virtual machines where the RAM is allocated using huge pages. This is useful for running with a user mode networking stack, and for custom setups which require high performance and low latency. Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>	2017-09-26 11:29:45 -07:00
Archana Shinde	9bfa792795	vfio: Add ability to pass VFIO devices to qemu VFIO is meant for exposing exposing direct device access to the virtual machine. Add ability to append VFIO devices to qemu command line. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2017-09-22 16:02:27 -07:00
Mark Ryan	a70ffd1980	Build: Fix the build after repo move. Ciao has recently moved from github.com/01org/ciao to github.com/ciao-project/ciao. This moves requires us to update our import paths to build successfully. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2017-09-21 17:39:45 +01:00
Manohar Castelino	0c206170c4	Knobs: Modify the behaviour of the Mlock knob. The Mlock knob is unfortunately tied to realtime. Allow Mlock knob to implicitly enable realtime to get the desired swapping behavior when swapping is desired. Note: Realtime as implemented today can only be used to enable swap, and as such does not really control realtime behaviour. The knob is redundant but retained here just to ensure that when more capabilities are added in future QEMU iterations we can take advantage of the same. Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>	2017-09-15 10:36:13 -07:00
Manohar Castelino	ddee41d553	QEMU: Enable realtime options Enable realtime options in QEMU. Also add support to control memory locking. Turning realtime on with memory locking disabled allows memory to be swapped out, potentially increasing density of VMs. Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>	2017-09-14 08:54:35 -07:00
Manohar Castelino	4ecb9de5b3	qemu: Add support for memory pre-allocation Add support for pre-allocating all of the RAM. This increases the memory footprint of QEMU and should be used only when needed. Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>	2017-09-12 15:45:16 -07:00
Archana Shinde	1fbe6c5d1d	qmp: Update block device deletion for newer versions of qemu blockdev-del command has been added in qemu 2.9 to replace x-blockdev-del command used earlier for deleting block devices. Update ExecuteXBlockdevDel() to use this updated qmp command. Rename ExecuteXBlockdevDel to ExecuteBlockdevDel as this no longer executes x-block-del command for qemu>=2.9. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2017-08-30 18:39:51 -07:00
Sebastien Boeuf	e74aeef1ad	qemu: Add disable-modern option for virtio devices For some cases, we have to disable the fast MMIO support, by disabling virtio 1.0. The reason for this is that we want to be able to nest our qemu VM inside a VM run by an hypervisor with no support for fast MMIO. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2017-08-17 08:47:50 -07:00
Sebastien Boeuf	8d617ff5b9	qemu: Update virtio-net-pci command line In case of a network device, and specifically virtio-net-pci, we have to update to what is expected by qemu. In this case, the driver name should be prefixed with "driver=". Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2017-08-17 00:51:32 -07:00
Archana Shinde	25a2dc8f6e	qemu: Update blockdev-add qmp command to support newer qemu versions With qemu 2.9, the qmp block-dev command was updated from: { "execute": "blockdev-add", "arguments": { "options": { ... } } } to: { "execute": "blockdev-add", "arguments": { ... } } Also, instead of id, blockdev-add now requires a node-name for the root node(https://wiki.qemu.org/index.php/ChangeLog/2.9) Store the version information with QMPStart and use that to issue qmp command for adding block devices in the correct format. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2017-08-08 08:53:05 -07:00
Rob Bradford	d4f77103be	misc: Remove some of the code flagged by unused linter Unfortunately the ununused linter is overzealous with some of the fields that it things are unused as gophercloud relies on their values. So go ahead with the most straightforward removals but do not enable unused on travis builds. ciao-image/datastore/datastore_test.go:28:5:warning: var metaDsTables is unused (U1000) (unused) ciao-controller/api/api_test.go:39:6:warning: func myHostname is unused (U1000) (unused) ciao-cli/identity.go:58:3:warning: field Description is unused (U1000) (unused) ciao-cli/identity.go:59:3:warning: field DomainID is unused (U1000) (unused) ciao-cli/identity.go:60:3:warning: field Enabled is unused (U1000) (unused) ciao-cli/identity.go:62:3:warning: field ParentID is unused (U1000) (unused) ciao-cli/identity.go:63:3:warning: field Links is unused (U1000) (unused) ciao-cli/identity.go:70:3:warning: field Self is unused (U1000) (unused) ciao-cli/identity.go:71:3:warning: field Previous is unused (U1000) (unused) ciao-cli/identity.go:72:3:warning: field Next is unused (U1000) (unused) ciao-cli/identity.go:207:3:warning: field Next is unused (U1000) (unused) ciao-cli/identity.go:208:3:warning: field Previous is unused (U1000) (unused) ciao-cli/identity.go:209:3:warning: field Self is unused (U1000) (unused) ciao-cli/identity.go:213:3:warning: field Description is unused (U1000) (unused) ciao-cli/identity.go:214:3:warning: field DomainID is unused (U1000) (unused) ciao-cli/identity.go:215:3:warning: field Enabled is unused (U1000) (unused) ciao-cli/identity.go:217:3:warning: field Links is unused (U1000) (unused) ciao-cli/identity.go:221:3:warning: field ParentID is unused (U1000) (unused) ciao-cli/main.go:105:6:warning: type action is unused (U1000) (unused) ciao-cli/volume.go:37:6:warning: type customVolumeExt is unused (U1000) (unused) ciao-cli/volume.go:39:2:warning: field customVolumeExt is unused (U1000) (unused) networking/ciao-cnci-agent/network.go:98:8:warning: const maxKey is unused (U1000) (unused) networking/libsnnet/tests/parallel/parallel_test.go:371:6:warning: func dockerNetList is unused (U1000) (unused) networking/libsnnet/tests/parallel/parallel_test.go:379:6:warning: func dockerNetInfo is unused (U1000) (unused) openstack/compute/api.go:308:2:warning: const limit is unused (U1000) (unused) openstack/compute/api.go:309:2:warning: const marker is unused (U1000) (unused) openstack/compute/api.go:312:6:warning: type pager is unused (U1000) (unused) openstack/compute/api.go:313:2:warning: func pager.filter is unused (U1000) (unused) openstack/compute/api.go:314:2:warning: func pager.nextPage is unused (U1000) (unused) openstack/compute/api_test.go:34:6:warning: func myHostname is unused (U1000) (unused) ciao-controller/api.go:72:2:warning: const statusFilter is unused (U1000) (unused) ciao-controller/api.go:75:6:warning: type pager is unused (U1000) (unused) ciao-controller/api.go:76:2:warning: func pager.filter is unused (U1000) (unused) ciao-controller/api.go:77:2:warning: func pager.nextPage is unused (U1000) (unused) ciao-controller/api.go:136:25:warning: func (nodePager).filter is unused (U1000) (unused) ciao-controller/api.go:198:31:warning: func (nodeServerPager).filter is unused (U1000) (unused) ciao-controller/controller_test.go:107:6:warning: func addTestTenantNoCNCI is unused (U1000) (unused) ciao-controller/controller_test.go:1104:6:warning: func startTestWorkload is unused (U1000) (unused) ciao-controller/controller_test.go:1123:6:warning: func testStartWorkloadLaunchCNCI is unused (U1000) (unused) ciao-controller/openstack_compute.go:552:5:warning: field Links is unused (U1000) (unused) qemu/qmp_test.go:493:3:warning: const seconds is unused (U1000) (unused) qemu/qmp_test.go:494:3:warning: const microsecondsEv1 is unused (U1000) (unused) qemu/qmp_test.go:495:3:warning: const device is unused (U1000) (unused) qemu/qmp_test.go:496:3:warning: const path is unused (U1000) (unused) templateutils/example_test.go:53:3:warning: field hidden is unused (U1000) (unused) Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2017-07-19 10:23:46 +01:00
Rob Bradford	a1600dc15b	misc: Remove unused fields identified by structcheck Add structcheck to the list of linters used on travis runs. ciao-cli/event.go:109:2:warning: unused struct field github.com/01org/ciao/ciao-cli.eventDeleteCommand.all (structcheck) ciao-cli/event.go:110:2:warning: unused struct field github.com/01org/ciao/ciao-cli.eventDeleteCommand.tenant (structcheck) ciao-cli/external_ips.go:636:2:warning: unused struct field github.com/01org/ciao/ciao-cli.poolAddCommand.ips (structcheck) ciao-cli/node.go:43:2:warning: unused struct field github.com/01org/ciao/ciao-cli.nodeListCommand.nodeID (structcheck) ciao-controller/client_wrapper_test.go:29:2:warning: unused struct field github.com/01org/ciao/ciao-controller.ssntpClientWrapper.ctl (structcheck) qemu/qmp.go:111:2:warning: unused struct field github.com/01org/ciao/qemu.qmpResult.data (structcheck) ssntp/ssntp_test.go:193:2:warning: unused struct field github.com/01org/ciao/ssntp_test.ssntpClient.evtTracedChannel (structcheck) ssntp/ssntp_test.go:192:2:warning: unused struct field github.com/01org/ciao/ssntp_test.ssntpClient.staTracedChannel (structcheck) ssntp/ssntp_test.go:194:2:warning: unused struct field github.com/01org/ciao/ssntp_test.ssntpClient.errTracedChannel (structcheck) ssntp/server.go:75:2:warning: unused struct field github.com/01org/ciao/ssntp.Server.roleVerify (structcheck) networking/ciao-cnci-agent/client.go:97:2:warning: unused struct field github.com/01org/ciao/networking/ciao-cnci-agent.agentClient.netCh (structcheck) testutil/agent.go:37:2:warning: unused struct field github.com/01org/ciao/testutil.SsntpTestClient.ticker (structcheck) Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2017-07-13 14:52:16 +01:00
Rob Bradford	58a835e6a6	misc: Remove unused variables identified by varcheck And add varcheck to the list of linters used on travis runs (with an increased deadline.) ciao-launcher/qemu_test.go:31:5:warning: unused variable or constant imageInfoTestGood (varcheck) ciao-launcher/qemu_test.go:44:5:warning: unused variable or constant imageInfoTestMissingBytes (varcheck) ciao-launcher/qemu_test.go:57:5:warning: unused variable or constant imageInfoTestMissingLine (varcheck) ciao-launcher/qemu_test.go:69:5:warning: unused variable or constant imageInfoTooBig (varcheck) ciao-launcher/qemu_test.go:82:5:warning: unused variable or constant imageInfoBadBytes (varcheck) configuration/configuration_test.go:35:7:warning: unused variable or constant glanceURL (varcheck) ciao-controller/controller_test.go:1918:5:warning: unused variable or constant testClients (varcheck) qemu/qmp_test.go:44:2:warning: unused variable or constant qmpSuccess (varcheck) qemu/qmp_test.go:45:2:warning: unused variable or constant qmpFailure (varcheck) Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2017-07-13 14:52:16 +01:00
Sebastien Boeuf	d48b5b5f48	qemu: Add PCI option to the NetDevice The existing NetDevice relies on virtio-net driver, but there is a useful PCI variant which was not available: virtio-net-pci. This patch adds this new driver and adds two parameters specific to this: "bus" and "addr". Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2017-03-09 14:54:11 -08:00
Mark Ryan	a84228ae99	qemu: Document how cancelling works. The code that handles the serialization and cancelling of QMP commands is a little complex and it took me some time to remember how it actually works and why it works in this particular way. For this reason I've added some comments which will hopefully make the next bug fix in this area a little less painful. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2017-02-16 09:49:44 +00:00
Mark Ryan	1e7202a5a6	qemu: Fix spelling error in qmp_test.go Command only has two ms. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2017-02-16 09:49:44 +00:00
Mark Ryan	c6f334533a	qemu: Fix command cancelling. There was a bug with the cancelling of commands that meant that when an attempt was made to cancel a command and then to issue a second command, the first, cancelled command was re-issued. This commit fixes the issue and adds a new test case to check that cancelling of commands does indeed work. There was also an issue with the test harness which meant that tests that issued more than one command were not actually testing the second and third commands. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2017-02-16 09:49:44 +00:00
Mark Ryan	a8a798b0c0	qemu, ciao-launcher: Move ConfigDrive ISO creation code to qemu Launcher's ConfigDrive ISO creation function, createCloudInitISO has been moved to the qemu package so that it can be re-used by ciao-down. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2017-02-06 12:16:09 +00:00
Sebastien Boeuf	30cf11632c	Add missing bus parameter for a CharDevice When creating a CharDevice, we need to add a "bus" parameter so that it can match the serial pci device previously created. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2016-10-21 16:04:22 -07:00
Samuel Ortiz	2aa5f5a3c0	qemu: Add support for serial port addition We add a new device driver, and also a name to the CharDev structure this is needed for qemu to actually create the serial port on the guest. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-10-13 17:56:31 +02:00
Samuel Ortiz	6fe338d604	qemu: Support creating multiple QMP sockets The QMP socket implementation does not support multiple clients sending and receiving QMP commands. As a consequence we need to be able to create multiple QMP sockets from the qemu package, so that at least we can support a fixed number of QMP clients. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-10-11 14:32:41 +02:00
Samuel Ortiz	992b861ec5	qemu: Add the daemonize qemu option to the Knobs structure This way callers can choose if they want the qemu process to be a daemon or not. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-30 15:56:28 +02:00
Samuel Ortiz	997cb23399	qemu: Remove dead code appendCharDevice() got replaced by the CharDevice's QemuParams method but never got deleted. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-30 15:46:57 +02:00
Samuel Ortiz	e555f565f4	qemu: Add support for socket based consoles When we get no virtual console to plug into, we may want qemu to create a socket where we can asynchronously connect to. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-30 15:46:40 +02:00
Samuel Ortiz	eae8fae0e7	qemu: Fix security model typo The right qemu parameter is "security_model", not "security-model". Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-21 17:48:47 +02:00
Samuel Ortiz	db067857bd	qemu: Make Config's FDs field private All file descriptors will come from specific devices configurations, so this patch: 1) Make the Config FDs file private 2) Provide an appendFDs() method for Config, that takes a slice of os.File pointers and a) Adds them to the Config private fd slice b) Return a slice of ints that represent the file descriptors for these device specific files, as seen by the qemu process. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-19 12:43:03 +02:00
Samuel Ortiz	12f6ebe389	qemu: Embed the qemu parameters into the Config structure It is a private field now, and all append*() routines are now Config methods instead of private qemu functions. Since we will have to carry a kernelParams private field as well, this change will keep all built parameters internal and make things consistent. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-19 12:41:40 +02:00
Samuel Ortiz	e193a77b8d	qemu: Add support for block devices For now we only support QCOW2 backed block devices. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 01:08:31 +02:00
Samuel Ortiz	3908185ccd	qemu: Add MACVTAP support The networking device structure now supports MACVTAP. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:43:49 +02:00
Samuel Ortiz	6d7dfa04bf	qemu: Get rid of the Driver structure By adding QemuParams() to the Device interface, we can get rid of the driver structure and simplify further the appendDevices() routine. With that implementation we can generate the following qemu parameters: "-device virtio-9p-pci,fsdev=foo,mount_tag=rootfs -fsdev local,id=foo,path=/bar/foo,security-model=none" from these single structures: fsdev := FSDevice{ Driver: Virtio9P FSDriver: Local, ID: "foo", Path: "/bar/foo", MountTag: "rootfs", SecurityModel: None, } Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:43:41 +02:00
Samuel Ortiz	cc9cb33a5d	qemu: Add QMPSocket specific type Instead of open coding the QMP socket type, we now have a specific type for it. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	2d736d7173	qemu: Add RTC specific types Instead of open coding the RTC fields, we now have specific types for it. We also have a RTC unit test now. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	e543c3383d	qemu: Probe each qemu device with a driver Having separate structures for the qemu driver definitions and each possible device definitions is confusing and error prone as one needs to be very careful using matching IDs and names in both structures. As the driver parameter can be derived from the device ones, this patch changes the Device and Driver structures to be linked together, i.e. each driver needs to have its corresponding device. For example this allows us to build the following 9pfs qemu parameters: "-fsdev local,id=foo,path=/bar/foo,security-model=none -device virtio-9p-pci,fsdev=foo,mount_tag=rootfs" from these structures: fsdev := FSDevice{ Driver: Local, ID: "foo", Path: "/bar/foo", MountTag: "rootfs", SecurityModel: None, } driver := Driver{ Driver: Virtio9P, Device: fsdev, } Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	eda8607cc6	qemu: Add netdev options to the Device structure With the NetDev and MACAddress strings, we can now create networking device drivers. We also add a unit test for netdev Device creation. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	4780e2371f	qemu: Add multi-queue and vhost definitions to NetDevice We can now specify if we want vhost to be enabled and wich fds we should use for multiqueue support. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	137e7c7242	qemu: Add a NetDevice slice to the Config structure The NetDevice structure represents a network device to be emulated by qemu. We also add the corresponding unit test. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	c0e2aacad2	qemu: Add one unit test for the Config strings Here we test that name, UUID and the CPU model are properly built. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	5ba8ef79df	qemu: Add QMP socket unit tests We test that the QMP socket parameter is properly built. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	7b2f7eb5d8	qemu: Add Memory and SMP unit tests We test that the memory and SMP configuration parameters are properly built. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	2ea9b9a385	qemu: Add a Kernel unit test We test that the kernel path and the kernel parameters are properly built. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	8e495f6eff	qemu: Add a Knobs unit test We test that all true and all false knobs parameters are properly built. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	8aeb3d45aa	qemu: Add an Object unit test We test that memory-backend-file and empty objects parameters are properly built. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	38e041dc9d	qemu: Add Device unit tests We add a NVDIMM, a filesystem and an empty device. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	54d32c2414	qemu: Add parameters adding unit tests We only test the Machine parameters for now. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	ebfa382d2e	qemu: Add a Knobs field to the Config structure The Knobs structure groups all qemu isolated boolean settings. For now this is -no-user-config, -no-defaults and -nographic. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	fe1bdcd2f7	qemu: Remove the extra parameters field from the Config structure The extraParams is confusing and can conflict with the rest of the Config structure definitions. We remove it and will add new fields to that structure as needed. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	15bce61a90	qemu: Group all machine configurations into one structure Here we group the machine type and acceleration together as they are defined through the same qemu parameter (-machine). Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	d94b5af875	qemu: Add a VGA parameter field to the Config structure The VGA string represents the type of VGA card qemu should emulate. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	4892d041e7	qemu: Add a Global parameter field to the Config structure The Global string represents the set of default Device driver properties we want qemu to use. This is mostly useful for automatically created devices. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	612a5a9e5d	qemu: Add a RTC field to the Config structure The RTC structure represents the guest Real Time Clock configuration. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	c63ec0965a	qemu: Add a SMP field to the Config structure The SMP structure defines the amount of virtual CPUs, sockets, and threads per CPU that is made available to the guest. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	7cf386a81c	qemu: Add a Memory field to the Config structure The Memory field holds the guest memory configuration. It is used to define the current and maximum RAM is made available to the guest and how this amount of RAM is splitted into several slots. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	b198bc67e7	qemu: Add a UUID field to the Config structure The qemu UUID will be used to set the guest system UUID. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	6239e846b7	qemu: Add a Character Devices slice field to the Config structure Qemu character devices typically allow for sending traffic from the guest to the host by emulating a console, a tty, a serial device for example. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	73e2d53c9a	qemu: Add a Filesystem Devices slice field to the Config structure Each Filesystem device should have a corresponding "virtio-9p-pci" Device driver. They represent a filesystem to be exported through 9pfs. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	518ba627b1	qemu: Add a Kernel field to the Config structure The Kernel structure holds the guest kernel configuration: its path and its parameters. This is the kernel qemu will boot the VM from. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	b973bc59fb	qemu: Add an Object slice field to the Config structure The Object slice tells qemu which specific object to create. Qemu objects can represent memory backend files, random number generators, TLS credentials, etc... Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	8744dfe85e	qemu: Add a Device slice field to the Config structure We may need to support a large range of devices in the qemu created VM and the Device slice allows us to define which drivers are needed. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	5458de70ad	qemu: Add a QMP socket field to the Config structure QMP sockets are used to send qemu specific commands to the running qemu process. The QMPSocket structure allows us to define the socket type we want, along with its name. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	171182709d	qemu: Add qemu's name to the Config structure This allows us to set the qemu -name option. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Samuel Ortiz	37a1f5003d	qemu: Add configuration structure to simplify LaunchQemu LaunchQemu() now takes a Config structure that contains some more descriptive fields than raw qemu parameter strings. LaunchQemu is now simpler to call and more extensible as supporting more qemu parameters would mean expanding Config instead of changing the API. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-17 00:41:13 +02:00
Mark Ryan	5ccbaf2b59	ciao-launcher, qemu: Upgrade to new context package. Ciao will use the new standard library context package from now on. This will allow us to use some of the new standard library functions such as DialContext. Partial fix for issue #541 Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2016-09-12 11:51:00 +01:00
Samuel Ortiz	f57201989b	qemu: Use null QMP logger when the logger parameter is nil Or else LaunchQemu() ends up dereferencing a nil pointer and panic'ing. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-09-09 18:45:31 +02:00
Mark Ryan	7d4199a449	qemu: Fix ineffassign error Fix ciao/qemu/qmp.go:349:3: ineffectual assignment to ok. Strictly speaking this is a bug in ineffassign but it's easier to change the ciao code. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2016-09-01 18:46:37 +01:00
Mark Ryan	7f50a41525	qemu: Fix a silly bug in LaunchQemu There's no point in setting cmd.ExtraFiles if the fds array is an empty slice. This won't do any harm but is essentially a no-op. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2016-08-26 16:52:43 +01:00
Mark Ryan	fc6bf8cf80	qemu: Add package documentation This commit adds some package documentation to the qemu package, including an overview of the package and an example of its use. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2016-08-26 16:52:36 +01:00
Mark Ryan	306f54a907	ciao-launcher, qemu: Move launchQemu to qemu The launcher function launchQemu has been moved to the qemu package and is now called LaunchQemu. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2016-08-26 16:33:41 +01:00
Mark Ryan	344aa22bd2	qemu: Add the qemu package The qemu package is a self contained package used for launching, halting and managing qemu instances. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>	2016-08-26 16:33:34 +01:00

8069 changed files with 1734692 additions and 524644 deletions

									
.github/actionlint.yaml
@@ -0,0 +1,30 @@
# Copyright (c) 2024 Red Hat
#
# SPDX-License-Identifier: Apache-2.0
#
# Configuration file with rules for the actionlint tool.
#
self-hosted-runner:
  # Labels of self-hosted runner that linter should ignore
  labels:
    - amd64-nvidia-a100
    - amd64-nvidia-h100-snp
    - arm64-k8s
    - garm-ubuntu-2004
    - garm-ubuntu-2004-smaller
    - garm-ubuntu-2204
    - garm-ubuntu-2304
    - garm-ubuntu-2304-smaller
    - garm-ubuntu-2204-smaller
    - ppc64le
    - ppc64le-k8s
    - ppc64le-small
    - ubuntu-24.04-ppc64le
    - ubuntu-24.04-s390x
    - metrics
    - riscv-builder
    - sev-snp
    - s390x
    - s390x-large
    - tdx
    - ubuntu-24.04-arm

									
.github/cargo-deny-composite-action/cargo-deny-generator.sh
@@ -0,0 +1,40 @@
#!/bin/bash
#
# Copyright (c) 2022 Red Hat
#
# SPDX-License-Identifier: Apache-2.0
#
script_dir=$(dirname "$(readlink -f "$0")")
parent_dir=$(realpath "${script_dir}/../..")
cidir="${parent_dir}/ci"
source "${cidir}/../tests/common.bash"
cargo_deny_file="${script_dir}/action.yaml"
cat cargo-deny-skeleton.yaml.in > "${cargo_deny_file}"
changed_files_status=$(run_get_pr_changed_file_details)
changed_files_status=$(echo "$changed_files_status" | grep "Cargo\.toml$" || true)
changed_files=$(echo "$changed_files_status" | awk '{print $NF}' || true)
if [ -z "$changed_files" ]; then
  cat >> "${cargo_deny_file}" << EOF
    - run: echo "No Cargo.toml files to check"
      shell: bash
EOF
fi
for path in $changed_files
do
    cat >> "${cargo_deny_file}" << EOF
    - name: ${path}
      continue-on-error: true
      shell: bash
      run: |
        pushd $(dirname ${path})
        cargo deny check
        popd
EOF
done
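
The filtering and step generation in `cargo-deny-generator.sh` can be tried in isolation with a minimal sketch like the one below. It is an illustration under stated assumptions, not part of the repository: `sample_status`, `filter_cargo_manifests`, and `emit_step` are hypothetical names, and the input format (a status field followed by a path on each line) is only assumed to resemble what `run_get_pr_changed_file_details` prints.

```shell
#!/bin/bash
# Sketch only: sample_status is hypothetical input standing in for the output
# of run_get_pr_changed_file_details (assumed to be "<status> <path>" per line).
sample_status=$'M\tsrc/agent/Cargo.toml\nA\tdocs/README.md\nM\tsrc/libs/Cargo.toml'

# Same filter as the script: keep Cargo.toml entries, then take the last
# field of each line (the path).
filter_cargo_manifests() {
    echo "$1" | grep "Cargo\.toml$" | awk '{print $NF}' || true
}

# Emit one composite-action step per manifest. The heredoc delimiter is
# unquoted, so $(dirname ...) is expanded while generating the YAML,
# not later when the action runs.
emit_step() {
    local path="$1"
    cat << EOF
    - name: ${path}
      continue-on-error: true
      shell: bash
      run: |
        pushd $(dirname "${path}")
        cargo deny check
        popd
EOF
}

for path in $(filter_cargo_manifests "$sample_status"); do
    emit_step "$path"
done
```

With this sample input, only the two `Cargo.toml` paths survive the filter, and each generated step contains an already-resolved directory such as `pushd src/agent`.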

									
.github/cargo-deny-composite-action/cargo-deny-skeleton.yaml.in
@@ -0,0 +1,30 @@
#
# Copyright (c) 2022 Red Hat
#
# SPDX-License-Identifier: Apache-2.0
#
name: 'Cargo Crates Check'
description: 'Checks every Cargo.toml file using cargo-deny'
env:
  CARGO_TERM_COLOR: always
runs:
  using: "composite"
  steps:
    - name: Install Rust
      uses: actions-rs/toolchain@v1
      with:
        profile: minimal
        toolchain: nightly
        override: true
    - name: Cache
      uses: Swatinem/rust-cache@f0deed1e0edfc6a9be95417288c0e1099b1eeec3 # v2.7.7
    - name: Install Cargo deny
      shell: bash
      run: |
        which cargo
        cargo install --locked cargo-deny || true

									
										93

.github/dependabot.yml
									
										vendored
									
										Normal file
									
												View File
												
				@@ -0,0 +1,93 @@

				---

				version: 2

				updates:

				  - package-ecosystem: "cargo"

				    directories:

				      - "/src/agent"

				      - "/src/dragonball"

				      - "/src/libs"

				      - "/src/mem-agent"

				      - "/src/mem-agent/example"

				      - "/src/runtime-rs"

				      - "/src/tools/agent-ctl"

				      - "/src/tools/genpolicy"

				      - "/src/tools/kata-ctl"

				      - "/src/tools/runk"

				      - "/src/tools/trace-forwarder"

				    schedule:

				      interval: "daily"

				    ignore:

				    # rust-vmm repos might cause incompatibilities on patch versions, so

				    # let's handle them manually for now.

				      - dependency-name: "event-manager"

				      - dependency-name: "kvm-bindings"

				      - dependency-name: "kvm-ioctls"

				      - dependency-name: "linux-loader"

				      - dependency-name: "seccompiler"

				      - dependency-name: "vfio-bindings"

				      - dependency-name: "vfio-ioctls"

				      - dependency-name: "virtio-bindings"

				      - dependency-name: "virtio-queue"

				      - dependency-name: "vm-fdt"

				      - dependency-name: "vm-memory"

				      - dependency-name: "vm-superio"

				      - dependency-name: "vmm-sys-util"

				    # As we often have up to 8/9 components that need the same version bumps,

				    # create groups for common dependencies, so they can all go in a single PR

				    # We can extend this as we see more frequent groups

				    groups:

				      bit-vec:

				        patterns:

				          - bit-vec

				      bumpalo:

				        patterns:

				          - bumpalo

				      clap:

				        patterns:

				          - clap

				      crossbeam:

				        patterns:

				          - crossbeam

				      h2:

				        patterns:

				          - h2

				      idna:

				        patterns:

				          - idna

				      openssl:

				        patterns:

				          - openssl

				      protobuf:

				        patterns:

				          - protobuf

				      rsa:

				        patterns:

				          - rsa

				      rustix:

				        patterns:

				          - rustix

				      slab:

				        patterns:

				          - slab

				      time:

				        patterns:

				          - time

				      tokio:

				        patterns:

				          - tokio

				      tracing:

				        patterns:

				          - tracing

				  - package-ecosystem: "gomod"

				    directories:

				      - "src/runtime"

				      - "tools/testing/kata-webhook"

				      - "src/tools/csi-kata-directvolume"

				    schedule:

				      interval: "daily"

				  - package-ecosystem: "github-actions"

				    directory: "/"

				    schedule:

				      interval: "monthly"

									

.github/workflows/PR-wip-checks.yaml
				@@ -9,13 +9,19 @@ on:

				      - labeled

				      - unlabeled

				permissions: {}

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  pr_wip_check:

				    runs-on: ubuntu-latest

				    runs-on: ubuntu-22.04

				    name: WIP Check

				    steps:

				    - name: WIP Check

				      uses: tim-actions/wip-check@1c2a1ca6c110026b3e2297bb2ef39e1747b5a755

				      uses: tim-actions/wip-check@1c2a1ca6c110026b3e2297bb2ef39e1747b5a755 # master (2021-06-10)

				      with:

				        labels: '["do-not-merge", "wip", "rfc"]'

				        keywords: '["WIP", "wip", "RFC", "rfc", "dnm", "DNM", "do-not-merge"]'

									

.github/workflows/actionlint.yaml
				@@ -0,0 +1,30 @@

				name: Lint GHA workflows

				on:

				  workflow_dispatch:

				  pull_request:

				permissions: {}

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  run-actionlint:

				    name: run-actionlint

				    env:

				      GH_TOKEN: ${{ github.token }}

				    runs-on: ubuntu-24.04

				    steps:

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Install actionlint gh extension

				        run: gh extension install https://github.com/cschleiden/gh-actionlint

				      - name: Run actionlint

				        run:  gh actionlint
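The `concurrency.group` expression used in this workflow resolves to one key per pull request, or per ref when no pull request is involved. The following is a rough shell model of that resolution, assuming the usual GitHub expression behaviour where `a || b` yields `b` when `a` is empty. The function name `concurrency_group` and the sample values are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical model of:
#   ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
concurrency_group() {
    local workflow="$1" pr_number="$2" ref="$3"
    # ${pr_number:-$ref} falls back to the ref when the PR number is empty,
    # which is how the "||" fallback is modelled here.
    printf '%s-%s\n' "$workflow" "${pr_number:-$ref}"
}

# Triggered by a pull request: runs for the same PR share one group.
concurrency_group "Lint GHA workflows" "123" "refs/pull/123/merge"
# Triggered manually (workflow_dispatch): no PR number, so the ref is used.
concurrency_group "Lint GHA workflows" "" "refs/heads/main"
```

With `cancel-in-progress: true`, a newer run whose key matches an older in-flight run cancels the older one, so pushing twice to the same pull request leaves only the latest lint run.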

									

.github/workflows/add-issues-to-project.yaml
				@@ -1,55 +0,0 @@

				# Copyright (c) 2020 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				name: Add newly created issues to the backlog project

				on:

				  issues:

				    types:

				      - opened

				      - reopened

				jobs:

				  add-new-issues-to-backlog:

				    runs-on: ubuntu-latest

				    steps:

				      - name: Install hub

				        run: |

				          HUB_ARCH="amd64"

				          HUB_VER=$(curl -sL "https://api.github.com/repos/github/hub/releases/latest" |\

				            jq -r .tag_name | sed 's/^v//')

				          curl -sL \

				            "https://github.com/github/hub/releases/download/v${HUB_VER}/hub-linux-${HUB_ARCH}-${HUB_VER}.tgz" |\

				          tar xz --strip-components=2 --wildcards '*/bin/hub' && \

				          sudo install hub /usr/local/bin

				      - name: Install hub extension script

				        run: |

				          # Clone into a temporary directory to avoid overwriting

				          # any existing github directory.

				          pushd $(mktemp -d) &>/dev/null

				          git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts

				          sudo install hub-util.sh /usr/local/bin

				          popd &>/dev/null

				      - name: Checkout code to allow hub to communicate with the project

				        uses: actions/checkout@v2

				      - name: Add issue to issue backlog

				        env:

				          GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_TOKEN }}

				        run: |

				          issue=${{ github.event.issue.number }}

				          project_name="Issue backlog"

				          project_type="org"

				          project_column="To do"

				          hub-util.sh \

				            add-issue \

				            "$issue" \

				            "$project_name" \

				            "$project_type" \

				            "$project_column"

									

.github/workflows/basic-ci-amd64.yaml
				@@ -0,0 +1,391 @@

				name: CI | Basic amd64 tests

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions: {}

				jobs:

				  run-containerd-sandboxapi:

				    name: run-containerd-sandboxapi

				    strategy:

				      # We can set this to true whenever we're 100% sure that

				      # all the tests are not flaky, otherwise we'll fail

				      # all the tests due to a single flaky instance.

				      fail-fast: false

				      matrix:

				        containerd_version: ['active']

				        vmm: ['dragonball', 'cloud-hypervisor', 'qemu-runtime-rs']

				    # TODO: enable me when https://github.com/containerd/containerd/issues/11640 is fixed

				    if: false

				    runs-on: ubuntu-22.04

				    env:

				      CONTAINERD_VERSION: ${{ matrix.containerd_version }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      SANDBOXER: "shim"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/cri-containerd/gha-run.sh install-kata kata-artifacts

				      - name: Run containerd-sandboxapi tests

				        timeout-minutes: 10

				        run: bash tests/integration/cri-containerd/gha-run.sh run

				  run-containerd-stability:

				    name: run-containerd-stability

				    strategy:

				      fail-fast: false

				      matrix:

				        containerd_version: ['lts', 'active']

				        vmm: ['clh', 'cloud-hypervisor', 'dragonball', 'qemu', 'qemu-runtime-rs']

				    runs-on: ubuntu-22.04

				    env:

				      CONTAINERD_VERSION: ${{ matrix.containerd_version }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      SANDBOXER: "podsandbox"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/stability/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/stability/gha-run.sh install-kata kata-artifacts

				      - name: Run containerd-stability tests

				        timeout-minutes: 15

				        run: bash tests/stability/gha-run.sh run

				  run-nydus:

				    name: run-nydus

				    strategy:

				      # We can set this to true whenever we're 100% sure that

				      # all the tests are not flaky, otherwise we'll fail

				      # all the tests due to a single flaky instance.

				      fail-fast: false

				      matrix:

				        containerd_version: ['lts', 'active']

				        vmm: ['clh', 'qemu', 'dragonball', 'qemu-runtime-rs']

				    runs-on: ubuntu-22.04

				    env:

				      CONTAINERD_VERSION: ${{ matrix.containerd_version }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/integration/nydus/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: get-kata-tools-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-tools-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-tools-artifacts

				      - name: Install kata

				        run: bash tests/integration/nydus/gha-run.sh install-kata kata-artifacts

				      - name: Install kata-tools

				        run: bash tests/integration/nydus/gha-run.sh install-kata-tools kata-tools-artifacts

				      - name: Run nydus tests

				        timeout-minutes: 10

				        run: bash tests/integration/nydus/gha-run.sh run

				  run-runk:

				    name: run-runk

				    # Skip runk tests as we have no maintainers. TODO: Decide when to remove altogether

				    if: false

				    runs-on: ubuntu-22.04

				    env:

				      CONTAINERD_VERSION: lts

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/integration/runk/gha-run.sh install-dependencies

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/runk/gha-run.sh install-kata kata-artifacts

				      - name: Run runk tests

				        timeout-minutes: 10

				        run: bash tests/integration/runk/gha-run.sh run

				  run-tracing:

				    name: run-tracing

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - clh # cloud-hypervisor

				          - qemu

				    # TODO: enable me when https://github.com/kata-containers/kata-containers/issues/9763 is fixed

				    # TODO: Transition to free runner (see #9940).

				    if: false

				    runs-on: garm-ubuntu-2204-smaller

				    env:

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/functional/tracing/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/functional/tracing/gha-run.sh install-kata kata-artifacts

				      - name: Run tracing tests

				        timeout-minutes: 15

				        run: bash tests/functional/tracing/gha-run.sh run

				  run-vfio:

				    name: run-vfio

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - clh

				          - qemu

				    # TODO: enable with clh when https://github.com/kata-containers/kata-containers/issues/9764 is fixed

				    # TODO: enable with qemu when https://github.com/kata-containers/kata-containers/issues/9851 is fixed

				    # TODO: Transition to free runner (see #9940).

				    if: false

				    runs-on: garm-ubuntu-2304

				    env:

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/functional/vfio/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Run vfio tests

				        timeout-minutes: 15

				        run: bash tests/functional/vfio/gha-run.sh run

				  run-nerdctl-tests:

				    name: run-nerdctl-tests

				    strategy:

				      # We can set this to true whenever we're 100% sure that

				      # all the tests are not flaky, otherwise we'll fail them

				      # all due to a single flaky instance.

				      fail-fast: false

				      matrix:

				        vmm:

				          - clh

				          - dragonball

				          - qemu

				          - cloud-hypervisor

				          - qemu-runtime-rs

				    runs-on: ubuntu-22.04

				    env:

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        env:

				          GITHUB_API_TOKEN: ${{ github.token }}

				          GH_TOKEN: ${{ github.token }}

				        run: bash tests/integration/nerdctl/gha-run.sh install-dependencies

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/nerdctl/gha-run.sh install-kata kata-artifacts

				      - name: Run nerdctl smoke test

				        timeout-minutes: 5

				        run: bash tests/integration/nerdctl/gha-run.sh run

				      - name: Collect artifacts ${{ matrix.vmm }}

				        if: always()

				        run: bash tests/integration/nerdctl/gha-run.sh collect-artifacts

				        continue-on-error: true

				      - name: Archive artifacts ${{ matrix.vmm }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: nerdctl-tests-garm-${{ matrix.vmm }}

				          path: /tmp/artifacts

				          retention-days: 1

				  run-kata-agent-apis:

				    name: run-kata-agent-apis

				    runs-on: ubuntu-22.04

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/functional/kata-agent-apis/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: get-kata-tools-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-tools-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-tools-artifacts

				      - name: Install kata & kata-tools

				        run: |

				          bash tests/functional/kata-agent-apis/gha-run.sh install-kata kata-artifacts

				          bash tests/functional/kata-agent-apis/gha-run.sh install-kata-tools kata-tools-artifacts

				      - name: Run kata agent api tests with agent-ctl

				        run: bash tests/functional/kata-agent-apis/gha-run.sh run
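To get a feel for how many jobs a matrix produces, the `run-containerd-stability` matrix above (two containerd versions, five VMMs) can be expanded by hand. This is only a sketch of the cross product; the function name `expand_matrix` is hypothetical, and the values are copied from the job definition.

```shell
#!/usr/bin/env bash
# Hypothetical expansion of the run-containerd-stability matrix:
# every containerd_version is paired with every vmm.
expand_matrix() {
    local containerd_versions=(lts active)
    local vmms=(clh cloud-hypervisor dragonball qemu qemu-runtime-rs)
    local c v
    for c in "${containerd_versions[@]}"; do
        for v in "${vmms[@]}"; do
            # Each pair becomes one job with these two env values.
            printf 'CONTAINERD_VERSION=%s KATA_HYPERVISOR=%s\n' "$c" "$v"
        done
    done
}

expand_matrix
echo "jobs: $(expand_matrix | wc -l | tr -d ' ')"   # jobs: 10
```

Since `fail-fast: false` is set, a failure in one of these combinations does not cancel the others.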

									

.github/workflows/basic-ci-s390x.yaml
				@@ -0,0 +1,108 @@

				name: CI | Basic s390x tests

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions: {}

				jobs:

				  run-containerd-sandboxapi:

				    name: run-containerd-sandboxapi

				    strategy:

				      # We can set this to true whenever we're 100% sure that

				      # all the tests are not flaky, otherwise we'll fail

				      # all the tests due to a single flaky instance.

				      fail-fast: false

				      matrix:

				        containerd_version: ['active']

				        vmm: ['qemu-runtime-rs']

				    # TODO: enable me when https://github.com/containerd/containerd/issues/11640 is fixed

				    if: false

				    runs-on: s390x-large

				    env:

				      CONTAINERD_VERSION: ${{ matrix.containerd_version }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      SANDBOXER: "shim"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-s390x${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/cri-containerd/gha-run.sh install-kata kata-artifacts

				      - name: Run containerd-sandboxapi tests

				        timeout-minutes: 10

				        run: bash tests/integration/cri-containerd/gha-run.sh run

				  run-containerd-stability:

				    name: run-containerd-stability

				    strategy:

				      fail-fast: false

				      matrix:

				        containerd_version: ['lts', 'active']

				        vmm: ['qemu']

				    runs-on: s390x-large

				    env:

				      CONTAINERD_VERSION: ${{ matrix.containerd_version }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      SANDBOXER: "podsandbox"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/stability/gha-run.sh install-dependencies

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-s390x${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/stability/gha-run.sh install-kata kata-artifacts

				      - name: Run containerd-stability tests

				        timeout-minutes: 15

				        run: bash tests/stability/gha-run.sh run

									

.github/workflows/build-checks-preview-riscv64.yaml
				@@ -0,0 +1,134 @@

				# This yaml is designed to be used until all components listed in

				# `build-checks.yaml` are supported

				on:

				  workflow_dispatch:

				    inputs:

				      instance:

				        default: "riscv-builder"

				        description: "Default instance when manually triggering"

				  workflow_call:

				    inputs:

				      instance:

				        required: true

				        type: string

				permissions: {}

				name: Build checks preview riscv64

				jobs:

				  check:

				    name: check

				    runs-on: ${{ inputs.instance }}

				    strategy:

				      fail-fast: false

				      matrix:

				        command:

				          - "make vendor"

				          - "make check"

				          - "make test"

				          - "sudo -E PATH=\"$PATH\" make test"

				        component:

				          - name: agent

				            path: src/agent

				            needs:

				              - rust

				              - libdevmapper

				              - libseccomp

				              - protobuf-compiler

				              - clang

				          - name: agent-ctl

				            path: src/tools/agent-ctl

				            needs:

				              - rust

				              - musl-tools

				              - protobuf-compiler

				              - clang

				          - name: trace-forwarder

				            path: src/tools/trace-forwarder

				            needs:

				              - rust

				              - musl-tools

				          - name: genpolicy

				            path: src/tools/genpolicy

				            needs:

				              - rust

				              - musl-tools

				              - protobuf-compiler

				          - name: runtime

				            path: src/runtime

				            needs:

				              - golang

				              - XDG_RUNTIME_DIR

				          - name: runtime-rs

				            path: src/runtime-rs

				            needs:

				              - rust

				    steps:

				      - name: Adjust a permission for repo

				        run: |

				          sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE" "$HOME"

				          sudo rm -rf "$GITHUB_WORKSPACE"/* || { sleep 10 && sudo rm -rf "$GITHUB_WORKSPACE"/*; }

				          sudo rm -f /tmp/kata_hybrid*  # Sometimes leftovers remain from test_setup_hvsock_failed()

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Install yq

				        run: |

				          ./ci/install_yq.sh

				        env:

				          INSTALL_IN_GOPATH: false

				      - name: Install golang

				        if: contains(matrix.component.needs, 'golang')

				        run: |

				          ./tests/install_go.sh -f -p

				          echo "/usr/local/go/bin" >> "$GITHUB_PATH"

				      - name: Setup rust

				        if: contains(matrix.component.needs, 'rust')

				        run: |

				          ./tests/install_rust.sh

				          echo "${HOME}/.cargo/bin" >> "$GITHUB_PATH"

				          if [ "$(uname -m)" == "x86_64" ] || [ "$(uname -m)" == "aarch64" ]; then

				            sudo apt-get update && sudo apt-get -y install musl-tools

				          fi

				      - name: Install devicemapper

				        if: contains(matrix.component.needs, 'libdevmapper') && matrix.command == 'make check'

				        run: sudo apt-get update && sudo apt-get -y install libdevmapper-dev

				      - name: Install libseccomp

				        if: contains(matrix.component.needs, 'libseccomp') && matrix.command != 'make vendor' && matrix.command != 'make check'

				        run: |

				          libseccomp_install_dir=$(mktemp -d -t libseccomp.XXXXXXXXXX)

				          gperf_install_dir=$(mktemp -d -t gperf.XXXXXXXXXX)

				          ./ci/install_libseccomp.sh "${libseccomp_install_dir}" "${gperf_install_dir}"

				          echo "Set environment variables for the libseccomp crate to link the libseccomp library statically"

				          echo "LIBSECCOMP_LINK_TYPE=static" >> "$GITHUB_ENV"

				          echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> "$GITHUB_ENV"

				      - name: Install protobuf-compiler

				        if: contains(matrix.component.needs, 'protobuf-compiler') && matrix.command != 'make vendor'

				        run: sudo apt-get update && sudo apt-get -y install protobuf-compiler

				      - name: Install clang

				        if: contains(matrix.component.needs, 'clang') && matrix.command == 'make check'

				        run: sudo apt-get update && sudo apt-get -y install clang

				      - name: Setup XDG_RUNTIME_DIR

				        if: contains(matrix.component.needs, 'XDG_RUNTIME_DIR') && matrix.command != 'make check'

				        run: |

				          XDG_RUNTIME_DIR=$(mktemp -d "/tmp/kata-tests-$USER.XXX" | tee >(xargs chmod 0700))

				          echo "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR}" >> "$GITHUB_ENV"

				      - name: Skip tests that depend on virtualization capable runners when needed

				        if: inputs.instance == 'riscv-builder'

				        run: |

				          echo "GITHUB_RUNNER_CI_NON_VIRT=true" >> "$GITHUB_ENV"

				      - name: Running `${{ matrix.command }}` for ${{ matrix.component.name }}

				        run: |

				          cd "${COMPONENT_PATH}"

				          ${COMMAND}

				        env:

				          COMMAND: ${{ matrix.command }}

				          COMPONENT_PATH: ${{ matrix.component.path }}

				          RUST_BACKTRACE: "1"

				          RUST_LIB_BACKTRACE: "0"

				          SKIP_GO_VERSION_CHECK: "1"
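The `if:` conditions in this workflow gate each install step on two things: whether the component lists the dependency under `needs`, and which `make` command the matrix entry runs. A minimal shell model of the libseccomp condition is shown below. The helper names `contains_item` and `should_install_libseccomp` are hypothetical, and the `needs` list is passed as a space-separated string for simplicity.

```shell
#!/usr/bin/env bash
# contains_item LIST ITEM: succeed when ITEM is one of the
# space-separated words in LIST (stand-in for contains()).
contains_item() {
    local item
    for item in $1; do
        [ "$item" = "$2" ] && return 0
    done
    return 1
}

# Models: contains(matrix.component.needs, 'libseccomp')
#         && matrix.command != 'make vendor' && matrix.command != 'make check'
should_install_libseccomp() {
    local needs="$1" command="$2"
    contains_item "$needs" libseccomp \
        && [ "$command" != "make vendor" ] \
        && [ "$command" != "make check" ]
}

agent_needs="rust libdevmapper libseccomp protobuf-compiler clang"
for cmd in "make vendor" "make check" "make test"; do
    if should_install_libseccomp "$agent_needs" "$cmd"; then
        echo "agent / $cmd: install libseccomp"
    else
        echo "agent / $cmd: skip libseccomp"
    fi
done
```

In this model, only the `make test` entries for the agent install libseccomp, which matches the intent of skipping the build for `make vendor` and `make check`.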

									

.github/workflows/build-checks.yaml
				@@ -0,0 +1,146 @@

				on:

				  workflow_call:

				    inputs:

				      instance:

				        required: true

				        type: string

				permissions: {}

				name: Build checks

				jobs:

				  check:

				    name: check

				    runs-on: >-

				      ${{

				        ( contains(inputs.instance, 's390x') && matrix.component.name == 'runtime' ) && 's390x' ||

				        ( contains(inputs.instance, 'ppc64le') && (matrix.component.name == 'runtime' || matrix.component.name == 'agent') ) && 'ppc64le' ||

				        inputs.instance

				      }}

				    strategy:

				      fail-fast: false

				      matrix:

				        command:

				          - "make vendor"

				          - "make check"

				          - "make test"

				          - "sudo -E PATH=\"$PATH\" make test"

				        component:

				          - name: agent

				            path: src/agent

				            needs:

				              - rust

				              - libdevmapper

				              - libseccomp

				              - protobuf-compiler

				              - clang

				          - name: dragonball

				            path: src/dragonball

				            needs:

				              - rust

				          - name: runtime

				            path: src/runtime

				            needs:

				              - golang

				              - XDG_RUNTIME_DIR

				          - name: runtime-rs

				            path: src/runtime-rs

				            needs:

				              - rust

				          - name: libs

				            path: src/libs

				            needs:

				              - rust

				              - protobuf-compiler

				          - name: agent-ctl

				            path: src/tools/agent-ctl

				            needs:

				              - rust

				              - protobuf-compiler

				              - clang

				          - name: kata-ctl

				            path: src/tools/kata-ctl

				            needs:

				              - rust

				              - protobuf-compiler

				          - name: trace-forwarder

				            path: src/tools/trace-forwarder

				            needs:

				              - rust

				          - name: genpolicy

				            path: src/tools/genpolicy

				            needs:

				              - rust

				              - protobuf-compiler

				        instance:

				          - ${{ inputs.instance }}

				    steps:

				      - name: Adjust permissions for the repo

				        run: |

				          sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE" "$HOME"

				          sudo rm -rf "$GITHUB_WORKSPACE"/* || { sleep 10 && sudo rm -rf "$GITHUB_WORKSPACE"/*; }

				          sudo rm -f /tmp/kata_hybrid*  # test_setup_hvsock_failed() sometimes leaves files behind

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Install yq

				        run: |

				          ./ci/install_yq.sh

				        env:

				          INSTALL_IN_GOPATH: false

				      - name: Install golang

				        if: contains(matrix.component.needs, 'golang')

				        run: |

				          ./tests/install_go.sh -f -p

				          echo "/usr/local/go/bin" >> "$GITHUB_PATH"

				      - name: Setup rust

				        if: contains(matrix.component.needs, 'rust')

				        run: |

				          ./tests/install_rust.sh

				          echo "${HOME}/.cargo/bin" >> "$GITHUB_PATH"

				          if [ "$(uname -m)" == "x86_64" ] || [ "$(uname -m)" == "aarch64" ]; then

				            sudo apt-get update && sudo apt-get -y install musl-tools

				          fi

				      - name: Install devicemapper

				        if: contains(matrix.component.needs, 'libdevmapper') && matrix.command == 'make check'

				        run: sudo apt-get update && sudo apt-get -y install libdevmapper-dev

				      - name: Install libseccomp

				        if: contains(matrix.component.needs, 'libseccomp') && matrix.command != 'make vendor' && matrix.command != 'make check'

				        run: |

				          libseccomp_install_dir=$(mktemp -d -t libseccomp.XXXXXXXXXX)

				          gperf_install_dir=$(mktemp -d -t gperf.XXXXXXXXXX)

				          ./ci/install_libseccomp.sh "${libseccomp_install_dir}" "${gperf_install_dir}"

				          echo "Set environment variables for the libseccomp crate to link the libseccomp library statically"

				          echo "LIBSECCOMP_LINK_TYPE=static" >> "$GITHUB_ENV"

				          echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> "$GITHUB_ENV"

				      - name: Install protobuf-compiler

				        if: contains(matrix.component.needs, 'protobuf-compiler') && matrix.command != 'make vendor'

				        run: sudo apt-get update && sudo apt-get -y install protobuf-compiler

				      - name: Install clang

				        if: contains(matrix.component.needs, 'clang') && matrix.command == 'make check'

				        run: sudo apt-get update && sudo apt-get -y install clang

				      - name: Setup XDG_RUNTIME_DIR

				        if: contains(matrix.component.needs, 'XDG_RUNTIME_DIR') && matrix.command != 'make check'

				        run: |

				          XDG_RUNTIME_DIR=$(mktemp -d "/tmp/kata-tests-$USER.XXX" | tee >(xargs chmod 0700))

				          echo "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR}" >> "$GITHUB_ENV"

				      - name: Skip tests that depend on virtualization capable runners when needed

				        if: ${{ endsWith(inputs.instance, '-arm') }}

				        run: |

				          echo "GITHUB_RUNNER_CI_NON_VIRT=true" >> "$GITHUB_ENV"

				      - name: Running `${{ matrix.command }}` for ${{ matrix.component.name }}

				        run: |

				          cd "${COMPONENT_PATH}"

				          eval "${COMMAND}"

				        env:

				          COMMAND: ${{ matrix.command }}

				          COMPONENT_PATH: ${{ matrix.component.path }}

				          RUST_BACKTRACE: "1"

				          RUST_LIB_BACKTRACE: "0"

				          SKIP_GO_VERSION_CHECK: "1"
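
A minimal sketch of why the `Running` step above may use `eval "${COMMAND}"` rather than a bare `${COMMAND}`: one matrix entry, `sudo -E PATH="$PATH" make test`, carries embedded quotes and a variable reference that only a second parse resolves. The `count_args` helper and the sample command are hypothetical and not part of the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report how many arguments it received.
count_args() { echo "$#"; }

# Stand-in for a matrix command that contains embedded double quotes.
COMMAND='count_args "a b"'

# Plain expansion splits on whitespace only; the quote characters stay literal,
# so the helper sees two arguments: `"a` and `b"`.
plain=$(${COMMAND})

# eval re-parses the string as shell source, so the quotes group one argument.
evald=$(eval "${COMMAND}")

echo "plain=${plain} eval=${evald}"
```

Because `eval` executes whatever string it is given, this pattern is only reasonable when the command comes from a fixed list, as the `matrix.command` values do here.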

									

.github/workflows/build-kata-static-tarball-amd64.yaml
									
												
				@@ -0,0 +1,462 @@

				name: CI | Build kata-static tarball for amd64

				on:

				  workflow_call:

				    inputs:

				      stage:

				        required: false

				        type: string

				        default: test

				      tarball-suffix:

				        required: false

				        type: string

				      push-to-registry:

				        required: false

				        type: string

				        default: no

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: false

				      KBUILD_SIGN_PIN:

				        required: true

				permissions: {}

				jobs:

				  build-asset:

				    name: build-asset

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    strategy:

				      matrix:

				        asset:

				          - agent

				          - busybox

				          - cloud-hypervisor

				          - cloud-hypervisor-glibc

				          - coco-guest-components

				          - firecracker

				          - kernel

				          - kernel-confidential

				          - kernel-dragonball-experimental

				          - kernel-nvidia-gpu

				          - kernel-nvidia-gpu-confidential

				          - nydus

				          - ovmf

				          - ovmf-sev

				          - ovmf-tdx

				          - pause-image

				          - qemu

				          - qemu-snp-experimental

				          - qemu-tdx-experimental

				          - virtiofsd

				        stage:

				          - ${{ inputs.stage }}

				        exclude:

				          - asset: cloud-hypervisor-glibc

				            stage: release

				    env:

				      PERFORM_ATTESTATION: ${{ matrix.asset == 'agent' && inputs.push-to-registry == 'yes' && 'yes' || 'no' }}

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Build ${{ matrix.asset }}

				        id: build

				        run: |

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlinks, so copy from the resolved build directory

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				          KBUILD_SIGN_PIN: ${{ contains(matrix.asset, 'nvidia') && secrets.KBUILD_SIGN_PIN || '' }}

				      - name: Parse OCI image name and digest

				        id: parse-oci-segments

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				        run: |

				          oci_image="$(<"build/${KATA_ASSET}-oci-image")"

				          echo "oci-name=${oci_image%@*}" >> "$GITHUB_OUTPUT"

				          echo "oci-digest=${oci_image#*@}" >> "$GITHUB_OUTPUT"

				      - uses: oras-project/setup-oras@22ce207df3b08e061f537244349aac6ae1d214f6 # v1.2.4

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          version: "1.2.0"

				      # for pushing attestations to the registry

				      - uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - uses: actions/attest-build-provenance@ef244123eb79f2f7a7e75d99086184180e6d0018 # v1.4.4

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          subject-name: ${{ steps.parse-oci-segments.outputs.oci-name }}

				          subject-digest: ${{ steps.parse-oci-segments.outputs.oci-digest }}

				          push-to-registry: true

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-amd64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.zst

				          retention-days: 15

				          if-no-files-found: error

				      - name: store-extratarballs-artifact ${{ matrix.asset }}

				        if: ${{ startsWith(matrix.asset, 'kernel-nvidia-gpu') }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-amd64-${{ matrix.asset }}-modules${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}-modules.tar.zst

				          retention-days: 15

				          if-no-files-found: error

				  build-asset-rootfs:

				    name: build-asset-rootfs

				    runs-on: ubuntu-22.04

				    needs: build-asset

				    permissions:

				      contents: read

				      packages: write

				    strategy:

				      matrix:

				        asset:

				          - rootfs-image

				          - rootfs-image-confidential

				          - rootfs-image-mariner

				          - rootfs-image-nvidia-gpu

				          - rootfs-image-nvidia-gpu-confidential

				          - rootfs-initrd

				          - rootfs-initrd-confidential

				          - rootfs-initrd-nvidia-gpu

				          - rootfs-initrd-nvidia-gpu-confidential

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-amd64-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build ${{ matrix.asset }}

				        id: build

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlinks, so copy from the resolved build directory

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				          KBUILD_SIGN_PIN: ${{ contains(matrix.asset, 'nvidia') && secrets.KBUILD_SIGN_PIN || '' }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-amd64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.zst

				          retention-days: 15

				          if-no-files-found: error

				  # The binaries installed into the rootfs are not needed in the release tarball, so delete them now that the rootfs is built

				  remove-rootfs-binary-artifacts:

				    name: remove-rootfs-binary-artifacts

				    runs-on: ubuntu-22.04

				    needs: build-asset-rootfs

				    strategy:

				      matrix:

				        asset:

				          - busybox

				          - coco-guest-components

				          - kernel-nvidia-gpu-modules

				          - kernel-nvidia-gpu-confidential-modules

				          - pause-image

				    steps:

				      - uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0

				        with:

				          name: kata-artifacts-amd64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				  # The binaries installed into the rootfs are not needed in the release tarball, so delete them now that the rootfs is built

				  remove-rootfs-binary-artifacts-for-release:

				    name: remove-rootfs-binary-artifacts-for-release

				    runs-on: ubuntu-22.04

				    needs: build-asset-rootfs

				    strategy:

				      matrix:

				        asset:

				          - agent

				    steps:

				      - uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0

				        if: ${{ inputs.stage == 'release' }}

				        with:

				          name: kata-artifacts-amd64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				  build-asset-shim-v2:

				    name: build-asset-shim-v2

				    runs-on: ubuntu-22.04

				    needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts, remove-rootfs-binary-artifacts-for-release]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-amd64-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build shim-v2

				        id: build

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlinks, so copy from the resolved build directory

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: shim-v2

				          TAR_OUTPUT: shim-v2.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				          MEASURED_ROOTFS: yes

				      - name: store-artifact shim-v2

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-amd64-shim-v2${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-shim-v2.tar.zst

				          retention-days: 15

				          if-no-files-found: error

				  create-kata-tarball:

				    name: create-kata-tarball

				    runs-on: ubuntu-22.04

				    needs: [build-asset, build-asset-rootfs, build-asset-shim-v2]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          fetch-tags: true

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-amd64-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: merge-artifacts

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml

				        env:

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifacts

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-static.tar.zst

				          retention-days: 15

				          if-no-files-found: error

				  build-tools-asset:

				    name: build-tools-asset

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: read

				      packages: write

				    strategy:

				      matrix:

				        asset:

				          - agent-ctl

				          - csi-kata-directvolume

				          - genpolicy

				          - kata-ctl

				          - kata-manager

				          - trace-forwarder

				        stage:

				          - ${{ inputs.stage }}

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Build ${{ matrix.asset }}

				        id: build

				        run: |

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlinks, so copy from the resolved build directory

				          mkdir -p kata-tools-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-tools-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-tools-artifacts-amd64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-tools-build/kata-static-${{ matrix.asset }}.tar.zst

				          retention-days: 15

				          if-no-files-found: error

				  create-kata-tools-tarball:

				    name: create-kata-tools-tarball

				    runs-on: ubuntu-22.04

				    needs: [build-tools-asset]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          fetch-tags: true

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-tools-artifacts-amd64-*${{ inputs.tarball-suffix }}

				          path: kata-tools-artifacts

				          merge-multiple: true

				      - name: merge-artifacts

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-tools-artifacts versions.yaml kata-tools-static.tar.zst

				        env:

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifacts

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-tools-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-tools-static.tar.zst

				          retention-days: 15

				          if-no-files-found: error
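
A minimal sketch of the parameter expansions used by the `Parse OCI image name and digest` step in the workflow above, assuming the `build/<asset>-oci-image` file holds a single reference of the form `name@digest`. The sample reference below is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical reference; the registry path and digest are placeholders.
oci_image='ghcr.io/example/agent@sha256:0123abcd'

# ${var%@*} removes the shortest suffix matching '@*', leaving the name.
oci_name="${oci_image%@*}"

# ${var#*@} removes the shortest prefix matching '*@', leaving the digest.
oci_digest="${oci_image#*@}"

echo "oci-name=${oci_name}"
echo "oci-digest=${oci_digest}"
```

In the workflow the two `echo` lines are appended to `$GITHUB_OUTPUT`, which is how the later attestation step reads them as `steps.parse-oci-segments.outputs.*`.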

									

.github/workflows/build-kata-static-tarball-arm64.yaml
									
												
				@@ -0,0 +1,336 @@

				name: CI | Build kata-static tarball for arm64

				on:

				  workflow_call:

				    inputs:

				      stage:

				        required: false

				        type: string

				        default: test

				      tarball-suffix:

				        required: false

				        type: string

				      push-to-registry:

				        required: false

				        type: string

				        default: no

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: false

				      KBUILD_SIGN_PIN:

				        required: true

				permissions: {}

				jobs:

				  build-asset:

				    name: build-asset

				    runs-on: ubuntu-24.04-arm

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    strategy:

				      matrix:

				        asset:

				          - agent

				          - busybox

				          - cloud-hypervisor

				          - firecracker

				          - kernel

				          - kernel-dragonball-experimental

				          - kernel-nvidia-gpu

				          - kernel-cca-confidential

				          - nydus

				          - ovmf

				          - qemu

				          - virtiofsd

				    env:

				      PERFORM_ATTESTATION: ${{ matrix.asset == 'agent' && inputs.push-to-registry == 'yes' && 'yes' || 'no' }}

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Build ${{ matrix.asset }}

				        run: |

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlinks, so copy from the resolved build directory

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				          KBUILD_SIGN_PIN: ${{ contains(matrix.asset, 'nvidia') && secrets.KBUILD_SIGN_PIN || '' }}

				      - name: Parse OCI image name and digest

				        id: parse-oci-segments

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				        run: |

				          oci_image="$(<"build/${KATA_ASSET}-oci-image")"

				          echo "oci-name=${oci_image%@*}" >> "$GITHUB_OUTPUT"

				          echo "oci-digest=${oci_image#*@}" >> "$GITHUB_OUTPUT"

				      - uses: oras-project/setup-oras@22ce207df3b08e061f537244349aac6ae1d214f6 # v1.2.4

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          version: "1.2.0"

				      # for pushing attestations to the registry

				      - uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - uses: actions/attest-build-provenance@ef244123eb79f2f7a7e75d99086184180e6d0018 # v1.4.4

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          subject-name: ${{ steps.parse-oci-segments.outputs.oci-name }}

				          subject-digest: ${{ steps.parse-oci-segments.outputs.oci-digest }}

				          push-to-registry: true

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-arm64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.zst

				          retention-days: 15

				          if-no-files-found: error

				      - name: store-extratarballs-artifact ${{ matrix.asset }}

				        if: ${{ startsWith(matrix.asset, 'kernel-nvidia-gpu') }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-arm64-${{ matrix.asset }}-modules${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}-modules.tar.zst

				          retention-days: 15

				          if-no-files-found: error

				  build-asset-rootfs:

				    name: build-asset-rootfs

				    runs-on: ubuntu-24.04-arm

				    needs: build-asset

				    permissions:

				      contents: read

				      packages: write

				    strategy:

				      matrix:

				        asset:

				          - rootfs-image

				          - rootfs-image-nvidia-gpu

				          - rootfs-initrd

				          - rootfs-initrd-nvidia-gpu

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-arm64-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build ${{ matrix.asset }}

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # The store-artifact step does not work with symlinks, so copy the tarballs out of the resolved build directory

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				          KBUILD_SIGN_PIN: ${{ contains(matrix.asset, 'nvidia') && secrets.KBUILD_SIGN_PIN || '' }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-arm64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.zst

				          retention-days: 15

				          if-no-files-found: error

				  # The binaries installed in the rootfs are not needed in the release tarball, so delete their artifacts once the rootfs has been built

				  remove-rootfs-binary-artifacts:

				    name: remove-rootfs-binary-artifacts

				    runs-on: ubuntu-24.04-arm

				    needs: build-asset-rootfs

				    strategy:

				      matrix:

				        asset:

				          - busybox

				          - kernel-nvidia-gpu-modules

				    steps:

				      - uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0

				        with:

				          name: kata-artifacts-arm64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				  # On release builds, also delete the agent artifact: it is already installed in the rootfs and is not needed in the release tarball

				  remove-rootfs-binary-artifacts-for-release:

				    name: remove-rootfs-binary-artifacts-for-release

				    runs-on: ubuntu-24.04-arm

				    needs: build-asset-rootfs

				    strategy:

				      matrix:

				        asset:

				          - agent

				    steps:

				      - uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0

				        if: ${{ inputs.stage == 'release' }}

				        with:

				          name: kata-artifacts-arm64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				  build-asset-shim-v2:

				    name: build-asset-shim-v2

				    runs-on: ubuntu-24.04-arm

				    needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts, remove-rootfs-binary-artifacts-for-release]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-arm64-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build shim-v2

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # The store-artifact step does not work with symlinks, so copy the tarballs out of the resolved build directory

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: shim-v2

				          TAR_OUTPUT: shim-v2.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact shim-v2

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-arm64-shim-v2${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-shim-v2.tar.zst

				          retention-days: 15

				          if-no-files-found: error

				  create-kata-tarball:

				    name: create-kata-tarball

				    runs-on: ubuntu-24.04-arm

				    needs: [build-asset, build-asset-rootfs, build-asset-shim-v2]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          fetch-tags: true

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-arm64-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: merge-artifacts

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml

				        env:

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifacts

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-static-tarball-arm64${{ inputs.tarball-suffix }}

				          path: kata-static.tar.zst

				          retention-days: 15

				          if-no-files-found: error

									
.github/workflows/build-kata-static-tarball-ppc64le.yaml
												
				@@ -0,0 +1,271 @@

				name: CI | Build kata-static tarball for ppc64le

				on:

				  workflow_call:

				    inputs:

				      stage:

				        required: false

				        type: string

				        default: test

				      tarball-suffix:

				        required: false

				        type: string

				      push-to-registry:

				        required: false

				        type: string

				        default: no

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions: {}

				jobs:

				  build-asset:

				    name: build-asset

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ubuntu-24.04-ppc64le

				    strategy:

				      matrix:

				        asset:

				          - agent

				          - kernel

				          - qemu

				          - virtiofsd

				        stage:

				          - ${{ inputs.stage }}

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Build ${{ matrix.asset }}

				        run: |

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # The store-artifact step does not work with symlinks, so copy the tarballs out of the resolved build directory

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-ppc64le-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.zst

				          retention-days: 1

				          if-no-files-found: error

				  build-asset-rootfs:

				    name: build-asset-rootfs

				    runs-on: ubuntu-24.04-ppc64le

				    needs: build-asset

				    permissions:

				      contents: read

				      packages: write

				    strategy:

				      matrix:

				        asset:

				          - rootfs-initrd

				        stage:

				          - ${{ inputs.stage }}

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-ppc64le-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build ${{ matrix.asset }}

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # The store-artifact step does not work with symlinks, so copy the tarballs out of the resolved build directory

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-ppc64le-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.zst

				          retention-days: 1

				          if-no-files-found: error

				  # The binaries installed in the rootfs are not needed in the release tarball, so delete their artifacts once the rootfs has been built

				  remove-rootfs-binary-artifacts:

				    name: remove-rootfs-binary-artifacts

				    runs-on: ubuntu-22.04

				    needs: build-asset-rootfs

				    strategy:

				      matrix:

				        asset:

				          - agent

				    steps:

				      - uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0

				        if: ${{ inputs.stage == 'release' }}

				        with:

				          name: kata-artifacts-ppc64le-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				  build-asset-shim-v2:

				    name: build-asset-shim-v2

				    runs-on: ubuntu-24.04-ppc64le

				    needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-ppc64le-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build shim-v2

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # The store-artifact step does not work with symlinks, so copy the tarballs out of the resolved build directory

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: shim-v2

				          TAR_OUTPUT: shim-v2.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact shim-v2

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-ppc64le-shim-v2${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-shim-v2.tar.zst

				          retention-days: 1

				          if-no-files-found: error

				  create-kata-tarball:

				    name: create-kata-tarball

				    runs-on: ubuntu-24.04-ppc64le

				    needs: [build-asset, build-asset-rootfs, build-asset-shim-v2]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - name: Adjust ownership of the repo workspace

				        run: |

				          sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE"

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          fetch-tags: true

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-ppc64le-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: merge-artifacts

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml

				        env:

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifacts

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-static-tarball-ppc64le${{ inputs.tarball-suffix }}

				          path: kata-static.tar.zst

				          retention-days: 1

				          if-no-files-found: error

									
.github/workflows/build-kata-static-tarball-riscv64.yaml
												
				@@ -0,0 +1,75 @@

				name: CI | Build kata-static tarball for riscv64

				on:

				  workflow_call:

				    inputs:

				      stage:

				        required: false

				        type: string

				        default: test

				      tarball-suffix:

				        required: false

				        type: string

				      push-to-registry:

				        required: false

				        type: string

				        default: no

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions: {}

				jobs:

				  build-asset:

				    name: build-asset

				    runs-on: riscv-builder

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    strategy:

				      matrix:

				        asset:

				          - kernel

				          - virtiofsd

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Build ${{ matrix.asset }}

				        run: |

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # The store-artifact step does not work with symlinks, so copy the tarballs out of the resolved build directory

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-riscv64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.zst

				          retention-days: 3

				          if-no-files-found: error

									
.github/workflows/build-kata-static-tarball-s390x.yaml
												
				@@ -0,0 +1,360 @@

				name: CI | Build kata-static tarball for s390x

				on:

				  workflow_call:

				    inputs:

				      stage:

				        required: false

				        type: string

				        default: test

				      tarball-suffix:

				        required: false

				        type: string

				      push-to-registry:

				        required: false

				        type: string

				        default: no

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      CI_HKD_PATH:

				        required: true

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions: {}

				jobs:

				  build-asset:

				    name: build-asset

				    runs-on: ubuntu-24.04-s390x

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    strategy:

				      matrix:

				        asset:

				          - agent

				          - coco-guest-components

				          - kernel

				          - kernel-confidential

				          - pause-image

				          - qemu

				          - virtiofsd

				    env:

				      PERFORM_ATTESTATION: ${{ matrix.asset == 'agent' && inputs.push-to-registry == 'yes' && 'yes' || 'no' }}

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Build ${{ matrix.asset }}

				        id: build

				        run: |

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # The store-artifact step does not work with symlinks, so copy the tarballs out of the resolved build directory

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: Parse OCI image name and digest

				        id: parse-oci-segments

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        env:

				          ASSET: ${{ matrix.asset }}

				        run: |

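				          # The file is expected to hold "<name>@<digest>"; read it and split at the "@"
				          oci_image="$(<"build/${ASSET}-oci-image")"

				          # ${var%@*} strips the shortest suffix starting at "@"; ${var#*@} strips the shortest prefix ending at "@"
				          echo "oci-name=${oci_image%@*}" >> "$GITHUB_OUTPUT"

				          echo "oci-digest=${oci_image#*@}" >> "$GITHUB_OUTPUT"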

				      # for pushing attestations to the registry

				      - uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - uses: actions/attest-build-provenance@ef244123eb79f2f7a7e75d99086184180e6d0018 # v1.4.4

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          subject-name: ${{ steps.parse-oci-segments.outputs.oci-name }}

				          subject-digest: ${{ steps.parse-oci-segments.outputs.oci-digest }}

				          push-to-registry: true

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-s390x-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.zst

				          retention-days: 15

				          if-no-files-found: error

				  build-asset-rootfs:

				    name: build-asset-rootfs

				    runs-on: s390x

				    needs: build-asset

				    permissions:

				      contents: read

				      packages: write

				    strategy:

				      matrix:

				        asset:

				          - rootfs-image

				          - rootfs-image-confidential

				          - rootfs-initrd

				          - rootfs-initrd-confidential

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-s390x-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build ${{ matrix.asset }}

				        id: build

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # The store-artifact step does not work with symlinks, so copy the tarballs out of the resolved build directory

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-s390x-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.zst

				          retention-days: 15

				          if-no-files-found: error

				  build-asset-boot-image-se:

				    name: build-asset-boot-image-se

				    runs-on: s390x

				    needs: [build-asset, build-asset-rootfs]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-s390x-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Place a host key document

				        run: |

				          mkdir -p "host-key-document"

				          cp "${CI_HKD_PATH}" "host-key-document"

				        env:

				          CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      - name: Build boot-image-se

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "boot-image-se"

				          make boot-image-se-tarball

				          build_dir=$(readlink -f build)

				          sudo cp -r "${build_dir}" "kata-build"

				          sudo chown -R "$(id -u)":"$(id -g)" "kata-build"

				        env:

				          HKD_PATH: "host-key-document"

				      - name: store-artifact boot-image-se

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-s390x${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-boot-image-se.tar.zst

				          retention-days: 1

				          if-no-files-found: error

				  # The binaries installed in the rootfs are not needed in the release tarball, so delete their artifacts once the rootfs has been built

				  remove-rootfs-binary-artifacts:

				    name: remove-rootfs-binary-artifacts

				    runs-on: ubuntu-22.04

				    needs: [build-asset-rootfs, build-asset-boot-image-se]

				    strategy:

				      matrix:

				        asset:

				          - agent

				          - coco-guest-components

				          - pause-image

				    steps:

				      - uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0

				        if: ${{ inputs.stage == 'release' }}

				        with:

				          name: kata-artifacts-s390x-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				  build-asset-shim-v2:

				    name: build-asset-shim-v2

				    runs-on: ubuntu-24.04-s390x

				    needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-s390x-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build shim-v2

				        id: build

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlinks

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: shim-v2

				          TAR_OUTPUT: shim-v2.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				          MEASURED_ROOTFS: no

				      - name: store-artifact shim-v2

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-s390x-shim-v2${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-shim-v2.tar.zst

				          retention-days: 15

				          if-no-files-found: error

				  create-kata-tarball:

				    name: create-kata-tarball

				    runs-on: ubuntu-24.04-s390x

				    needs:

				      - build-asset

				      - build-asset-rootfs

				      - build-asset-boot-image-se

				      - build-asset-shim-v2

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          fetch-tags: true

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-s390x-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: merge-artifacts

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml

				        env:

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifacts

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-static-tarball-s390x${{ inputs.tarball-suffix }}

				          path: kata-static.tar.zst

				          retention-days: 15

				          if-no-files-found: error

									
.github/workflows/build-kubectl-image.yaml
				@@ -0,0 +1,75 @@

				name: Build kubectl multi-arch image

				on:

				  schedule:

				    # Run every Sunday at 00:00 UTC

				    - cron: '0 0 * * 0'

				  workflow_dispatch:

				    # Allow manual triggering

				  push:

				    branches:

				      - main

				    paths:

				      - 'tools/packaging/kubectl/Dockerfile'

				      - '.github/workflows/build-kubectl-image.yaml'

				permissions: {}

				env:

				  REGISTRY: quay.io

				  IMAGE_NAME: kata-containers/kubectl

				jobs:

				  build-and-push:

				    name: Build and push multi-arch image

				    runs-on: ubuntu-24.04

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Set up QEMU

				        uses: docker/setup-qemu-action@29109295f81e9208d7d86ff1c6c12d2833863392 # v3.6.0

				      - name: Set up Docker Buildx

				        uses: docker/setup-buildx-action@b5ca514318bd6ebac0fb2aedd5d36ec1b5c232a2 # v3.10.0

				      - name: Login to Quay.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ${{ env.REGISTRY }}

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - name: Get kubectl version

				        id: kubectl-version

				        run: |

				          KUBECTL_VERSION=$(curl -L -s https://dl.k8s.io/release/stable.txt)

				          echo "version=${KUBECTL_VERSION}" >> "$GITHUB_OUTPUT"

				      - name: Generate image metadata

				        id: meta

				        uses: docker/metadata-action@902fa8ec7d6ecbf8d84d538b9b233a880e428804 # v5.7.0

				        with:

				          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}

				          tags: |

				            type=raw,value=latest

				            type=raw,value={{date 'YYYYMMDD'}}

				            type=raw,value=${{ steps.kubectl-version.outputs.version }}

				            type=sha,prefix=

				      - name: Build and push multi-arch image

				        uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0

				        with:

				          context: tools/packaging/kubectl/

				          file: tools/packaging/kubectl/Dockerfile

				          platforms: linux/amd64,linux/arm64,linux/s390x,linux/ppc64le

				          push: true

				          tags: ${{ steps.meta.outputs.tags }}

				          labels: ${{ steps.meta.outputs.labels }}

				          cache-from: type=gha

				          cache-to: type=gha,mode=max
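As a rough illustration of the tagging rules in the metadata step above, the sketch below prints the four image references that a single run would be expected to push. The function name `kubectl_image_tags` and its arguments are hypothetical and not part of the repository. It also assumes that `type=sha,prefix=` resolves to the 7-character short commit SHA, which is believed to be the default of docker/metadata-action.

```shell
# Hypothetical helper (not in the repository): print the image references
# that the four `tags:` rules would be expected to produce.
# $1: image name, $2: kubectl version, $3: full commit SHA, $4: date as YYYYMMDD
kubectl_image_tags() {
    printf '%s:latest\n' "$1"       # type=raw,value=latest
    printf '%s:%s\n' "$1" "$4"      # type=raw,value={{date 'YYYYMMDD'}}
    printf '%s:%s\n' "$1" "$2"      # type=raw,value=<version from stable.txt>
    printf '%s:%.7s\n' "$1" "$3"    # type=sha,prefix= (assumed 7-char short SHA)
}

# Example call with placeholder values for version, SHA, and date.
kubectl_image_tags quay.io/kata-containers/kubectl v1.0.0 0123456789abcdef 20260114
```

Because `latest` is always among the tags, each scheduled or push-triggered run moves `latest` to the newest build, while the date, version, and SHA tags keep older builds addressable.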

									
.github/workflows/cargo-deny-runner.yaml
				@@ -0,0 +1,32 @@

				name: Cargo Crates Check Runner

				on:

				  pull_request:

				    types:

				      - opened

				      - edited

				      - reopened

				      - synchronize

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				permissions: {}

				jobs:

				  cargo-deny-runner:

				    name: cargo-deny-runner

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Checkout Code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Generate Action

				        run: bash cargo-deny-generator.sh

				        working-directory: ./.github/cargo-deny-composite-action/

				        env:

				          GOPATH: ${{ github.workspace }}/kata-containers

				      - name: Run Action

				        uses: ./.github/cargo-deny-composite-action

									
.github/workflows/ci-coco-stability.yaml
				@@ -0,0 +1,33 @@

				name: Kata Containers CoCo Stability Tests Weekly

				on:

				  # Note: This workload is not currently maintained, so skipping its scheduled runs

				  # schedule:

				  #   - cron: '0 0 * * 0'

				  workflow_dispatch:

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				permissions: {}

				jobs:

				  kata-containers-ci-on-push:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/ci-weekly.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      pr-number: "weekly"

				      tag: ${{ github.sha }}-weekly

				      target-branch: ${{ github.ref_name }}

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

									
.github/workflows/ci-devel.yaml
				@@ -0,0 +1,35 @@

				name: Kata Containers CI (manually triggered)

				on:

				  workflow_dispatch:

				permissions: {}

				jobs:

				  kata-containers-ci-on-push:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/ci.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      pr-number: "dev"

				      tag: ${{ github.sha }}-dev

				      target-branch: ${{ github.ref_name }}

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      ITA_KEY: ${{ secrets.ITA_KEY }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      NGC_API_KEY: ${{ secrets.NGC_API_KEY }}

				      KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

				  build-checks:

				    uses: ./.github/workflows/build-checks.yaml

				    with:

				      instance: ubuntu-22.04

									
.github/workflows/ci-nightly-riscv.yaml
				@@ -0,0 +1,34 @@

				on:

				  schedule:

				    - cron: '0 5 * * *'

				name: Nightly CI for RISC-V

				concurrency:

				  group: ${{ github.workflow }}-${{ github.ref }}

				  cancel-in-progress: true

				permissions: {}

				jobs:

				  build-kata-static-tarball-riscv:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-riscv64.yaml

				    with:

				      tarball-suffix: -${{ github.sha }}

				      commit-hash: ${{ github.sha }}

				      target-branch: ${{ github.ref_name }}

				  build-checks-preview:

				    strategy:

				      fail-fast: false

				      matrix:

				        instance:

				          - "riscv-builder"

				    uses: ./.github/workflows/build-checks-preview-riscv64.yaml

				    with:

				      instance: ${{ matrix.instance }}

									
.github/workflows/ci-nightly-rust.yaml
				@@ -0,0 +1,36 @@

				name: Kata Containers Nightly CI (Rust)

				on:

				  schedule:

				    - cron: '0 1 * * *' # Run at 1 AM UTC (1 hour after script-based nightly)

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				permissions: {}

				jobs:

				  kata-containers-ci-on-push-rust:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/ci.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      pr-number: "nightly-rust"

				      tag: ${{ github.sha }}-nightly-rust

				      target-branch: ${{ github.ref_name }}

				      build-type: "rust" # Use Rust-based build

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      ITA_KEY: ${{ secrets.ITA_KEY }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      NGC_API_KEY: ${{ secrets.NGC_API_KEY }}

				      KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

									
.github/workflows/ci-nightly-s390x.yaml
				@@ -0,0 +1,27 @@

				on:

				  schedule:

				    - cron: '0 5 * * *'

				name: Nightly CI for s390x

				permissions: {}

				jobs:

				  check-internal-test-result:

				    name: check-internal-test-result

				    runs-on: s390x

				    strategy:

				      fail-fast: false

				      matrix:

				        test_title:

				          - kata-vfio-ap-e2e-tests

				          - cc-vfio-ap-e2e-tests

				          - cc-se-e2e-tests-go

				          - cc-se-e2e-tests-rs

				    steps:

				    - name: Fetch a test result for ${{ matrix.test_title }}

				      run: |

				        file_name="${TEST_TITLE}-$(date +%Y-%m-%d).log"

				        "/home/${USER}/script/handle_test_log.sh" download "$file_name"

				      env:

				        TEST_TITLE: ${{ matrix.test_title }}
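The log name built in the step above follows a `<test title>-<YYYY-MM-DD>.log` pattern. A minimal sketch of that naming, with a hypothetical function `test_log_name` (not part of the repository) and an optional date argument added only so the result can be checked without depending on the current day:

```shell
# Hypothetical helper (not in the repository): build the per-day log name.
# $1: test title, e.g. cc-se-e2e-tests-go
# $2: optional date in YYYY-MM-DD form; defaults to today, as in the workflow
test_log_name() {
    printf '%s-%s.log\n' "$1" "${2:-$(date +%Y-%m-%d)}"
}

test_log_name cc-se-e2e-tests-go 2026-01-14
```

Since the date is taken on the runner at execution time, the job only finds a log if the internal test run produced one on the same calendar day in the runner's time zone.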

									
.github/workflows/ci-nightly.yaml
				@@ -0,0 +1,34 @@

				name: Kata Containers Nightly CI

				on:

				  schedule:

				    - cron: '0 0 * * *'

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				permissions: {}

				jobs:

				  kata-containers-ci-on-push:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/ci.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      pr-number: "nightly"

				      tag: ${{ github.sha }}-nightly

				      target-branch: ${{ github.ref_name }}

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      ITA_KEY: ${{ secrets.ITA_KEY }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      NGC_API_KEY: ${{ secrets.NGC_API_KEY }}

				      KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

									
.github/workflows/ci-on-push.yaml
				@@ -0,0 +1,54 @@

				name: Kata Containers CI

				on:

				  pull_request_target: # zizmor: ignore[dangerous-triggers] See #11332.

				    branches:

				      - 'main'

				    types:

				      # Adding 'labeled' to the list of activity types that trigger this event

				      # (default: opened, synchronize, reopened) so that we can run this

				      # workflow when the 'ok-to-test' label is added.

				      # Reference: https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request_target

				      - opened

				      - synchronize

				      - reopened

				      - labeled

				permissions: {}

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  skipper:

				    if: ${{ contains(github.event.pull_request.labels.*.name, 'ok-to-test') }}

				    uses: ./.github/workflows/gatekeeper-skipper.yaml

				    with:

				      commit-hash: ${{ github.event.pull_request.head.sha }}

				      target-branch: ${{ github.event.pull_request.base.ref }}

				  kata-containers-ci-on-push:

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_build != 'yes' }}

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/ci.yaml

				    with:

				      commit-hash: ${{ github.event.pull_request.head.sha }}

				      pr-number: ${{ github.event.pull_request.number }}

				      tag: ${{ github.event.pull_request.number }}-${{ github.event.pull_request.head.sha }}

				      target-branch: ${{ github.event.pull_request.base.ref }}

				      skip-test: ${{ needs.skipper.outputs.skip_test }}

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      ITA_KEY: ${{ secrets.ITA_KEY }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      NGC_API_KEY: ${{ secrets.NGC_API_KEY }}

				      KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}
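The gating between the `skipper` and `kata-containers-ci-on-push` jobs above can be sketched as a small shell function. The name `should_run_ci` and its arguments are hypothetical and not part of the repository. The sketch assumes that a skipped `skipper` job also causes the dependent job to be skipped, which is the default GitHub Actions behaviour for `needs` when no status function is used.

```shell
# Hypothetical helper (not in the repository): decide whether the CI job runs.
# $1: space-separated list of PR labels
# $2: value of the skipper job's skip_build output
should_run_ci() {
    # The skipper job only runs when the 'ok-to-test' label is present.
    case " $1 " in
        *" ok-to-test "*) ;;
        *) return 1 ;;
    esac
    # The CI job runs unless the skipper reported skip_build == 'yes'.
    [ "$2" != "yes" ]
}

if should_run_ci "bug ok-to-test" "no"; then
    echo "CI would run"
fi
```

The label check is what keeps a `pull_request_target` run, which has access to secrets, from starting on an unreviewed pull request.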

									
.github/workflows/ci-weekly.yaml
				@@ -0,0 +1,128 @@

				name: Run the CoCo Kata Containers Stability CI

				on:

				  workflow_call:

				    inputs:

				      commit-hash:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD:

				        required: true

				      AZ_APPID:

				        required: true

				      AZ_TENANT_ID:

				        required: true

				      AZ_SUBSCRIPTION_ID:

				        required: true

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				      KBUILD_SIGN_PIN:

				        required: true

				permissions: {}

				jobs:

				  build-kata-static-tarball-amd64:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-amd64.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

				  publish-kata-deploy-payload-amd64:

				    needs: build-kata-static-tarball-amd64

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: ubuntu-22.04

				      arch: amd64

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-and-publish-tee-confidential-unencrypted-image:

				    name: build-and-publish-tee-confidential-unencrypted-image

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Checkout code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Set up QEMU

				        uses: docker/setup-qemu-action@29109295f81e9208d7d86ff1c6c12d2833863392 # v3.6.0

				      - name: Set up Docker Buildx

				        uses: docker/setup-buildx-action@b5ca514318bd6ebac0fb2aedd5d36ec1b5c232a2 # v3.10.0

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Docker build and push

				        uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0

				        with:

				          tags: ghcr.io/kata-containers/test-images:unencrypted-${{ inputs.pr-number }}

				          push: true

				          context: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/

				          platforms: linux/amd64

				          file: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/Dockerfile

				  run-kata-coco-stability-tests:

				    needs: [publish-kata-deploy-payload-amd64, build-and-publish-tee-confidential-unencrypted-image]

				    uses: ./.github/workflows/run-kata-coco-stability-tests.yaml

				    with:

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				      tarball-suffix: -${{ inputs.tag }}

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				    permissions:

				      contents: read

				      id-token: write

									
.github/workflows/ci.yaml
				@@ -0,0 +1,502 @@

				name: Run the Kata Containers CI

				on:

				  workflow_call:

				    inputs:

				      commit-hash:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				      skip-test:

				        required: false

				        type: string

				        default: no

				      build-type:

				        description: The build type for kata-deploy. Use 'rust' for Rust-based build, empty or omit for script-based (default).

				        required: false

				        type: string

				        default: ""

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD:

				        required: true

				      AZ_APPID:

				        required: true

				      AZ_TENANT_ID:

				        required: true

				      AZ_SUBSCRIPTION_ID:

				        required: true

				      CI_HKD_PATH:

				        required: true

				      ITA_KEY:

				        required: true

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				      NGC_API_KEY:

				        required: true

				      KBUILD_SIGN_PIN:

				        required: true

				permissions: {}

				jobs:

				  build-kata-static-tarball-amd64:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-amd64.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

				  publish-kata-deploy-payload-amd64:

				    needs: build-kata-static-tarball-amd64

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: ubuntu-22.04

				      arch: amd64

				      build-type: ${{ inputs.build-type }}

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-kata-static-tarball-arm64:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-arm64.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

				  publish-kata-deploy-payload-arm64:

				    needs: build-kata-static-tarball-arm64

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-arm64

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: ubuntu-24.04-arm

				      arch: arm64

				      build-type: ${{ inputs.build-type }}

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-kata-static-tarball-s390x:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-s390x.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-kata-static-tarball-ppc64le:

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/build-kata-static-tarball-ppc64le.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-kata-deploy-payload-s390x:

				    needs: build-kata-static-tarball-s390x

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-s390x

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: ubuntu-24.04-s390x

				      arch: s390x

				      build-type: ${{ inputs.build-type }}

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-kata-deploy-payload-ppc64le:

				    needs: build-kata-static-tarball-ppc64le

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-ppc64le

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: ubuntu-24.04-ppc64le

				      arch: ppc64le

				      build-type: ${{ inputs.build-type }}

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-and-publish-tee-confidential-unencrypted-image:

				    name: build-and-publish-tee-confidential-unencrypted-image

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Checkout code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Set up QEMU

				        uses: docker/setup-qemu-action@29109295f81e9208d7d86ff1c6c12d2833863392 # v3.6.0

				      - name: Set up Docker Buildx

				        uses: docker/setup-buildx-action@b5ca514318bd6ebac0fb2aedd5d36ec1b5c232a2 # v3.10.0

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Docker build and push

				        uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0

				        with:

				          tags: ghcr.io/kata-containers/test-images:unencrypted-${{ inputs.pr-number }}

				          push: true

				          context: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/

				          platforms: linux/amd64, linux/s390x

				          file: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/Dockerfile

				  publish-csi-driver-amd64:

				    name: publish-csi-driver-amd64

				    needs: build-kata-static-tarball-amd64

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Checkout code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-kata-tools-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-tools-static-tarball-amd64-${{ inputs.tag }}

				          path: kata-tools-artifacts

				      - name: Install kata-tools

				        run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-tools-artifacts

				      - name: Copy binary into Docker context

				        run: |

				          # Copy to the location where the Dockerfile expects the binary.

				          mkdir -p src/tools/csi-kata-directvolume/bin/

				          cp /opt/kata/bin/csi-kata-directvolume src/tools/csi-kata-directvolume/bin/directvolplugin

				      - name: Set up Docker Buildx

				        uses: docker/setup-buildx-action@b5ca514318bd6ebac0fb2aedd5d36ec1b5c232a2 # v3.10.0

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Docker build and push

				        uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0

				        with:

				          tags: ghcr.io/kata-containers/csi-kata-directvolume:${{ inputs.pr-number }}

				          push: true

				          context: src/tools/csi-kata-directvolume/

				          platforms: linux/amd64

				          file: src/tools/csi-kata-directvolume/Dockerfile

				  run-kata-monitor-tests:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: build-kata-static-tarball-amd64

				    uses: ./.github/workflows/run-kata-monitor-tests.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				  run-k8s-tests-on-aks:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: publish-kata-deploy-payload-amd64

				    uses: ./.github/workflows/run-k8s-tests-on-aks.yaml

				    permissions:

				      contents: read

				      id-token: write # Used for OIDC access to log into Azure

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64${{ inputs.build-type == 'rust' && '-rust' || '' }}

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				  run-k8s-tests-on-arm64:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: publish-kata-deploy-payload-arm64

				    uses: ./.github/workflows/run-k8s-tests-on-arm64.yaml

				    with:

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-arm64${{ inputs.build-type == 'rust' && '-rust' || '' }}

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				  run-k8s-tests-on-nvidia-gpu:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: publish-kata-deploy-payload-amd64

				    uses: ./.github/workflows/run-k8s-tests-on-nvidia-gpu.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64${{ inputs.build-type == 'rust' && '-rust' || '' }}

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      NGC_API_KEY: ${{ secrets.NGC_API_KEY }}

				  run-kata-coco-tests:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs:

				     - publish-kata-deploy-payload-amd64

				     - build-and-publish-tee-confidential-unencrypted-image

				     - publish-csi-driver-amd64

				    uses: ./.github/workflows/run-kata-coco-tests.yaml

				    permissions:

				      contents: read

				      id-token: write # Used for OIDC access to log into Azure

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64${{ inputs.build-type == 'rust' && '-rust' || '' }}

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      ITA_KEY: ${{ secrets.ITA_KEY }}

				  run-k8s-tests-on-zvsi:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: [publish-kata-deploy-payload-s390x, build-and-publish-tee-confidential-unencrypted-image]

				    uses: ./.github/workflows/run-k8s-tests-on-zvsi.yaml

				    with:

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-s390x${{ inputs.build-type == 'rust' && '-rust' || '' }}

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				  run-k8s-tests-on-ppc64le:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: publish-kata-deploy-payload-ppc64le

				    uses: ./.github/workflows/run-k8s-tests-on-ppc64le.yaml

				    with:

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-ppc64le${{ inputs.build-type == 'rust' && '-rust' || '' }}

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				  run-kata-deploy-tests:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: [publish-kata-deploy-payload-amd64]

				    uses: ./.github/workflows/run-kata-deploy-tests.yaml

				    with:

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64${{ inputs.build-type == 'rust' && '-rust' || '' }}

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				  run-basic-amd64-tests:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: build-kata-static-tarball-amd64

				    uses: ./.github/workflows/basic-ci-amd64.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				  run-basic-s390x-tests:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: build-kata-static-tarball-s390x

				    uses: ./.github/workflows/basic-ci-s390x.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				  run-cri-containerd-amd64:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: build-kata-static-tarball-amd64

				    strategy:

				      fail-fast: false

				      matrix:

				        params: [

				          { containerd_version: lts,    vmm: clh              },

				          { containerd_version: lts,    vmm: dragonball       },

				          { containerd_version: lts,    vmm: qemu             },

				          { containerd_version: lts,    vmm: cloud-hypervisor },

				          { containerd_version: lts,    vmm: qemu-runtime-rs  },

				          { containerd_version: active, vmm: clh              },

				          { containerd_version: active, vmm: dragonball       },

				          { containerd_version: active, vmm: qemu             },

				          { containerd_version: active, vmm: cloud-hypervisor },

				          { containerd_version: active, vmm: qemu-runtime-rs  },

				         ]

				    uses: ./.github/workflows/run-cri-containerd-tests.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: ubuntu-22.04

				      arch: amd64

				      containerd_version: ${{ matrix.params.containerd_version }}

				      vmm: ${{ matrix.params.vmm }}

				  run-cri-containerd-s390x:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: build-kata-static-tarball-s390x

				    strategy:

				      fail-fast: false

				      matrix:

				        params: [

				          { containerd_version: active, vmm: qemu            },

				          { containerd_version: active, vmm: qemu-runtime-rs },

				         ]

				    uses: ./.github/workflows/run-cri-containerd-tests.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: s390x-large

				      arch: s390x

				      containerd_version: ${{ matrix.params.containerd_version }}

				      vmm: ${{ matrix.params.vmm }}

				  run-cri-containerd-tests-ppc64le:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: build-kata-static-tarball-ppc64le

				    strategy:

				      fail-fast: false

				      matrix:

				        params: [

				          { containerd_version: active, vmm: qemu },

				         ]

				    uses: ./.github/workflows/run-cri-containerd-tests.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: ppc64le-small

				      arch: ppc64le

				      containerd_version: ${{ matrix.params.containerd_version }}

				      vmm: ${{ matrix.params.vmm }}

				  run-cri-containerd-tests-arm64:

				    if: false

				    needs: build-kata-static-tarball-arm64

				    strategy:

				      fail-fast: false

				      matrix:

				        params: [

				          { containerd_version: active, vmm: qemu },

				         ]

				    uses: ./.github/workflows/run-cri-containerd-tests.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: arm64-non-k8s

				      arch: arm64

				      containerd_version: ${{ matrix.params.containerd_version }}

				      vmm: ${{ matrix.params.vmm }}
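The `tag:` inputs passed to the test workflows above all follow one pattern: the input tag, an architecture suffix, and an optional `-rust` suffix selected by the `inputs.build-type == 'rust' && '-rust' || ''` expression. A minimal Python sketch of that composition, for illustration only; the function name `deploy_tag` and the sample values are hypothetical and not part of the repository:

```python
def deploy_tag(tag: str, arch: str, build_type: str = "") -> str:
    """Compose a payload tag the way the workflow expression does.

    Mirrors: ${{ inputs.tag }}-<arch>${{ inputs.build-type == 'rust' && '-rust' || '' }}
    """
    # The expression yields '-rust' only when build-type is exactly 'rust'.
    suffix = "-rust" if build_type == "rust" else ""
    return f"{tag}-{arch}{suffix}"


if __name__ == "__main__":
    # Illustrative values; real tags come from the calling workflow.
    for arch in ("amd64", "arm64", "s390x", "ppc64le"):
        print(deploy_tag("pr-1234", arch), deploy_tag("pr-1234", arch, "rust"))
```

Any `build-type` value other than `rust`, including an empty one, leaves the tag without the extra suffix.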

									
.github/workflows/cleanup-resources.yaml
												
				@@ -0,0 +1,38 @@

				name: Cleanup dangling Azure resources

				on:

				  schedule:

				    - cron: "0 0 * * *"

				  workflow_dispatch:

				permissions: {}

				jobs:

				  cleanup-resources:

				    name: cleanup-resources

				    runs-on: ubuntu-22.04

				    permissions:

				      id-token: write # Used for OIDC access to log into Azure

				    environment: ci

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Log into Azure

				        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0

				        with:

				          client-id: ${{ secrets.AZ_APPID }}

				          tenant-id: ${{ secrets.AZ_TENANT_ID }}

				          subscription-id: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      - name: Install Python dependencies

				        run: |

				          pip3 install --user --upgrade \

				            azure-identity==1.16.0 \

				            azure-mgmt-resource==23.0.1

				      - name: Cleanup resources

				        env:

				          AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				          CLEANUP_AFTER_HOURS: 24 # Clean up resources created more than this many hours ago.

				        run: python3 tests/cleanup_resources.py
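The workflow above only shows that `CLEANUP_AFTER_HOURS` is passed to `tests/cleanup_resources.py`; the script itself is not part of this diff. As a rough sketch of what an age cutoff of that kind could look like, assuming the script compares each resource's creation time against the current time (the helper `is_stale` is hypothetical):

```python
from datetime import datetime, timedelta, timezone


def is_stale(created_at: datetime, now: datetime, cleanup_after_hours: int) -> bool:
    """Return True if the resource was created more than the given number of hours ago."""
    return now - created_at > timedelta(hours=cleanup_after_hours)


if __name__ == "__main__":
    now = datetime(2026, 1, 14, 12, 0, tzinfo=timezone.utc)
    fresh = now - timedelta(hours=3)
    old = now - timedelta(hours=30)
    # With the workflow's value of 24, only the 30-hour-old resource qualifies.
    print(is_stale(fresh, now, 24), is_stale(old, now, 24))
```

Keeping the comparison in a pure function that takes `now` as a parameter makes the cutoff easy to test without touching Azure.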

									
.github/workflows/codeql.yml
												
				@@ -0,0 +1,100 @@

				# For most projects, this workflow file will not need changing; you simply need

				# to commit it to your repository.

				#

				# You may wish to alter this file to override the set of languages analyzed,

				# or to provide custom queries or build logic.

				#

				# ******** NOTE ********

				# We have attempted to detect the languages in your repository. Please check

				# the `language` matrix defined below to confirm you have the correct set of

				# supported CodeQL languages.

				#

				name: "CodeQL Advanced"

				on:

				  push:

				    branches: [ "main" ]

				  pull_request:

				    branches: [ "main" ]

				  schedule:

				    - cron: '45 0 * * 1'

				permissions: {}

				jobs:

				  analyze:

				    name: Analyze (${{ matrix.language }})

				    # Runner size impacts CodeQL analysis time. To learn more, please see:

				    #   - https://gh.io/recommended-hardware-resources-for-running-codeql

				    #   - https://gh.io/supported-runners-and-hardware-resources

				    #   - https://gh.io/using-larger-runners (GitHub.com only)

				    # Consider using larger runners or machines with greater resources for possible analysis time improvements.

				    runs-on: ubuntu-24.04

				    permissions:

				      # required for all workflows

				      security-events: write

				      # required to fetch internal or private CodeQL packs

				      packages: read

				      # only required for workflows in private repositories

				      actions: read

				      contents: read

				    strategy:

				      fail-fast: false

				      matrix:

				        include:

				        - language: go

				          build-mode: manual

				        - language: python

				          build-mode: none

				        # CodeQL supports the following keyword values for 'language': 'c-cpp', 'csharp', 'go', 'java-kotlin', 'javascript-typescript', 'python', 'ruby', 'swift'

				        # Use `c-cpp` to analyze code written in C, C++ or both

				        # Use 'java-kotlin' to analyze code written in Java, Kotlin or both

				        # Use 'javascript-typescript' to analyze code written in JavaScript, TypeScript or both

				        # To learn more about changing the languages that are analyzed or customizing the build mode for your analysis,

				        # see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/customizing-your-advanced-setup-for-code-scanning.

				        # If you are analyzing a compiled language, you can modify the 'build-mode' for that language to customize how

				        # your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages

				    steps:

				    - name: Checkout repository

				      uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      with:

				        persist-credentials: false

				    # Add any setup steps before running the `github/codeql-action/init` action.

				    # This includes steps like installing compilers or runtimes (`actions/setup-node`

				    # or others). This is typically only required for manual builds.

				    # - name: Setup runtime (example)

				    #   uses: actions/setup-example@v1

				    # Initializes the CodeQL tools for scanning.

				    - name: Initialize CodeQL

				      uses: github/codeql-action/init@v3

				      with:

				        languages: ${{ matrix.language }}

				        build-mode: ${{ matrix.build-mode }}

				        # If you wish to specify custom queries, you can do so here or in a config file.

				        # By default, queries listed here will override any specified in a config file.

				        # Prefix the list here with "+" to use these queries and those in the config file.

				        # For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs

				        # queries: security-extended,security-and-quality

				    # If the analyze step fails for one of the languages you are analyzing with

				    # "We were unable to automatically build your code", modify the matrix above

				    # to set the build mode to "manual" for that language. Then modify this step

				    # to build your code.

				    # ℹ️ Command-line programs to run using the OS shell.

				    # 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun

				    - if: matrix.build-mode == 'manual' && matrix.language == 'go'

				      shell: bash

				      run: |

				        make -C src/runtime

				    - name: Perform CodeQL Analysis

				      uses: github/codeql-action/analyze@v3

				      with:

				        category: "/language:${{matrix.language}}"

									
.github/workflows/commit-message-check.yaml
												
				@@ -6,37 +6,54 @@ on:

				      - reopened

				      - synchronize

				permissions: {}

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				env:

				  error_msg: |+

				    See the document below for help on formatting commits for the project.

				    https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md#patch-format

				    https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md#patch-format

				jobs:

				  commit-message-check:

				    runs-on: ubuntu-latest

				    runs-on: ubuntu-22.04

				    env:

				      PR_AUTHOR: ${{ github.event.pull_request.user.login }}

				    name: Commit Message Check

				    steps:

				    - name: Get PR Commits

				      id: 'get-pr-commits'

				      uses: tim-actions/get-pr-commits@v1.0.0

				      uses: tim-actions/get-pr-commits@c64db31d359214d244884dd68f971a110b29ab83 # v1.2.0

				      with:

				        token: ${{ secrets.GITHUB_TOKEN }}

				        # Filter out revert commits

				        # The format of a revert commit is as follows:

				        #

				        # Revert "<original-subject-line>"

				        #

				        # The format of a re-re-vert commit is as follows:

				        #

				        # Reapply "<original-subject-line>"

				        filter_out_pattern: '^Revert "|^Reapply "'

				    - name: DCO Check

				      uses: tim-actions/dco@2fd0504dc0d27b33f542867c300c60840c6dcb20

				      uses: tim-actions/dco@f2279e6e62d5a7d9115b0cb8e837b777b1b02e21 # v1.1.0

				      with:

				        commits: ${{ steps.get-pr-commits.outputs.commits }}

				    - name: Commit Body Missing Check

				      if: ${{ success() || failure() }}

				      uses: tim-actions/commit-body-check@v1.0.2

				      uses: tim-actions/commit-body-check@d2e0e8e1f0332b3281c98867c42a2fbe25ad3f15 # v1.0.2

				      with:

				        commits: ${{ steps.get-pr-commits.outputs.commits }}

				    - name: Check Subject Line Length

				      if: ${{ success() || failure() }}

				      uses: tim-actions/commit-message-checker-with-regex@v0.3.1

				      if: ${{ (env.PR_AUTHOR != 'dependabot[bot]') && ( success() || failure() ) }}

				      uses: tim-actions/commit-message-checker-with-regex@d6d9770051dd6460679d1cab1dcaa8cffc5c2bbd # v0.3.1

				      with:

				        commits: ${{ steps.get-pr-commits.outputs.commits }}

				        pattern: '^.{0,75}(\n.*)*$'

				@@ -44,8 +61,8 @@ jobs:

				        post_error: ${{ env.error_msg }}

				    - name: Check Body Line Length

				      if: ${{ success() || failure() }}

				      uses: tim-actions/commit-message-checker-with-regex@v0.3.1

				      if: ${{ (env.PR_AUTHOR != 'dependabot[bot]') && ( success() || failure() ) }}

				      uses: tim-actions/commit-message-checker-with-regex@d6d9770051dd6460679d1cab1dcaa8cffc5c2bbd # v0.3.1

				      with:

				        commits: ${{ steps.get-pr-commits.outputs.commits }}

				        # Notes:

				@@ -54,8 +71,12 @@ jobs:

				        #   to be specified at the start of the regex as the action is passed

				        #   the entire commit message.

				        #

				        # - This check will pass if the commit message only contains a subject

				        #   line, as other body message properties are enforced elsewhere.

				        #

				        # - Body lines *can* be longer than the maximum if they start

				        #   with a non-alphabetic character.

				        #   with a non-alphabetic character or if there is no whitespace in

				        #   the line.

				        #

				        #   This allows stack traces, log file snippets, emails, long URLs,

				        #   etc to be specified. Some of these naturally "work" as they start

				@@ -66,24 +87,13 @@ jobs:

				        #

				        # - A SoB comment can be any length (as it is unreasonable to penalise

				        #   people with long names/email addresses :)

				        pattern: '^.+(\n([a-zA-Z].{0,149}|[^a-zA-Z\n].*|Signed-off-by:.*|))+$'

				        error: 'Body line too long (max 72)'

				        pattern: '(^[^\n]+$|^.+(\n([a-zA-Z].{0,150}|[^a-zA-Z\n].*|[^\s\n]*|Signed-off-by:.*|))+$)'

				        error: 'Body line too long (max 150)'

				        post_error: ${{ env.error_msg }}

				    - name: Check Fixes

				      if: ${{ success() || failure() }}

				      uses: tim-actions/commit-message-checker-with-regex@v0.3.1

				      with:

				        commits: ${{ steps.get-pr-commits.outputs.commits }}

				        pattern: '\s*Fixes\s*:?\s*(#\d+|github\.com\/kata-containers\/[a-z-.]*#\d+)|^\s*release\s*:'

				        flags: 'i'

				        error: 'No "Fixes" found'

				        post_error: ${{ env.error_msg }}

				        one_pass_all_pass: 'true'

				    - name: Check Subsystem

				      if: ${{ success() || failure() }}

				      uses: tim-actions/commit-message-checker-with-regex@v0.3.1

				      if: ${{ (env.PR_AUTHOR != 'dependabot[bot]') && ( success() || failure() ) }}

				      uses: tim-actions/commit-message-checker-with-regex@d6d9770051dd6460679d1cab1dcaa8cffc5c2bbd # v0.3.1

				      with:

				        commits: ${{ steps.get-pr-commits.outputs.commits }}

				        pattern: '^[\s\t]*[^:\s\t]+[\s\t]*:'
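The regex checks in this workflow can be tried locally before pushing. The sketch below copies the subject-length pattern, the updated body-length pattern, and the subsystem pattern verbatim from the diff and runs them with Python's `re` module. This is an approximation: the action evaluates the patterns in JavaScript, and Python's `$` also matches just before a trailing newline, so results may differ for messages that end in a newline. The `check` helper and the sample messages are hypothetical.

```python
import re

# Patterns copied from the workflow; only the Python wrapping is new.
SUBJECT = re.compile(r'^.{0,75}(\n.*)*$')
BODY = re.compile(
    r'(^[^\n]+$|^.+(\n([a-zA-Z].{0,150}|[^a-zA-Z\n].*|[^\s\n]*|Signed-off-by:.*|))+$)'
)
SUBSYSTEM = re.compile(r'^[\s\t]*[^:\s\t]+[\s\t]*:')


def check(message: str) -> dict:
    """Report which of the three regex checks a commit message would pass."""
    return {
        "subject_length": bool(SUBJECT.search(message)),
        "body_length": bool(BODY.search(message)),
        "subsystem": bool(SUBSYSTEM.search(message)),
    }


if __name__ == "__main__":
    good = (
        "runtime: fix sandbox cleanup\n\n"
        "Short body line.\n\n"
        "Signed-off-by: A Dev <a@example.com>"
    )
    # A 200-character body line that starts with a letter and contains spaces.
    long_body = "runtime: fix\n\n" + "word " * 40
    # A long line without whitespace, such as a URL, is tolerated.
    long_url = "runtime: fix\n\n" + "https://example.com/" + "a" * 200
    for msg in (good, long_body, long_url, "fix sandbox cleanup"):
        print(check(msg))
```

The subject-only message at the end passes the body check through the first alternative of the pattern, which matches the comment that a message containing only a subject line is accepted by this step.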

									
.github/workflows/darwin-tests.yaml
												
				@@ -0,0 +1,43 @@

				on:

				  pull_request:

				    types:

				      - opened

				      - edited

				      - reopened

				      - synchronize

				permissions: {}

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				name: Darwin tests

				jobs:

				  test:

				    name: test

				    runs-on: macos-latest

				    steps:

				    - name: Install Protoc

				      run: |

				        f=$(mktemp)

				        curl -sSLo "$f" https://github.com/protocolbuffers/protobuf/releases/download/v28.2/protoc-28.2-osx-aarch_64.zip

				        mkdir -p "$HOME/.local"

				        unzip -d "$HOME/.local" "$f"

				        echo "$HOME/.local/bin" >> "${GITHUB_PATH}"

				    - name: Checkout code

				      uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      with:

				        persist-credentials: false

				    - name: Install golang

				      run: |

				        ./tests/install_go.sh -f -p

				        echo "/usr/local/go/bin" >> "${GITHUB_PATH}"

				    - name: Install Rust

				      run: ./tests/install_rust.sh

				    - name: Build utils

				      run: ./ci/darwin-test.sh

									
.github/workflows/docs-url-alive-check.yaml
												
				@@ -0,0 +1,34 @@

				on:

				  schedule:

				    - cron:  '0 23 * * 0'

				  workflow_dispatch:

				permissions: {}

				name: Docs URL Alive Check

				jobs:

				  test:

				    name: test

				    runs-on: ubuntu-22.04

				    # don't run this action on forks

				    if: github.repository_owner == 'kata-containers'

				    env:

				      target_branch: ${{ github.base_ref }}

				    steps:

				    - name: Set env

				      run: |

				        echo "GOPATH=${GITHUB_WORKSPACE}" >> "$GITHUB_ENV"

				    - name: Checkout code

				      uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      with:

				        fetch-depth: 0

				        persist-credentials: false

				    - name: Install golang

				      run: |

				        ./tests/install_go.sh -f -p

				        echo "/usr/local/go/bin" >> "${GITHUB_PATH}"

				    - name: Docs URL Alive Check

				      run: |

				        make docs-url-alive-check

									
.github/workflows/docs.yaml
												
				@@ -0,0 +1,32 @@

				name: Documentation

				on:

				  push:

				    branches:

				      - main

				permissions: {}

				jobs:

				  deploy-docs:

				    name: deploy-docs

				    permissions:

				      contents: read

				      pages: write

				      id-token: write

				    environment:

				      name: github-pages

				      url: ${{ steps.deployment.outputs.page_url }}

				    runs-on: ubuntu-latest

				    steps:

				      - uses: actions/configure-pages@v5

				      - uses: actions/checkout@v5

				        with:

				          persist-credentials: false

				      - uses: actions/setup-python@v5

				        with:

				          python-version: 3.x

				      - run: pip install zensical

				      - run: zensical build --clean

				      - uses: actions/upload-pages-artifact@v4

				        with:

				          path: site

				      - uses: actions/deploy-pages@v4

				        id: deployment

									
.github/workflows/gatekeeper-skipper.yaml
												
				@@ -0,0 +1,55 @@

				name: Skipper

				# This workflow sets various "skip_*" output values that can be used to

				# determine what workflows/jobs are expected to be executed. Sample usage:

				#

				#   skipper:

				#     uses: ./.github/workflows/gatekeeper-skipper.yaml

				#     with:

				#       commit-hash: ${{ github.event.pull_request.head.sha }}

				#       target-branch: ${{ github.event.pull_request.base.ref }}

				#

				#   your-workflow:

				#     needs: skipper

				#     if: ${{ needs.skipper.outputs.skip_build != 'yes' }}

				on:

				  workflow_call:

				    inputs:

				      commit-hash:

				        required: true

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    outputs:

				      skip_build:

				        value: ${{ jobs.skipper.outputs.skip_build }}

				      skip_test:

				        value: ${{ jobs.skipper.outputs.skip_test }}

				      skip_static:

				        value: ${{ jobs.skipper.outputs.skip_static }}

				permissions: {}

				jobs:

				  skipper:

				    name: skipper

				    runs-on: ubuntu-22.04

				    outputs:

				      skip_build: ${{ steps.skipper.outputs.skip_build }}

				      skip_test: ${{ steps.skipper.outputs.skip_test }}

				      skip_static: ${{ steps.skipper.outputs.skip_static }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - id: skipper

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				        run: |

				          python3 tools/testing/gatekeeper/skips.py | tee -a "$GITHUB_OUTPUT"

				        shell: /usr/bin/bash -x {0}
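Because the `skipper` step pipes the script's stdout into `$GITHUB_OUTPUT` with `tee -a`, whatever the script prints has to be in the `name=value` form that GitHub Actions expects for step outputs. The contents of `tools/testing/gatekeeper/skips.py` are not shown in this diff, so the following is only a sketch of that output contract; `format_outputs` is a hypothetical helper, and the value `no` is an assumption (the workflows only compare against `'yes'`).

```python
from typing import Dict


def format_outputs(skips: Dict[str, bool]) -> str:
    """Render skip decisions as name=value lines suitable for $GITHUB_OUTPUT."""
    return "\n".join(
        f"skip_{name}={'yes' if skip else 'no'}" for name, skip in skips.items()
    )


if __name__ == "__main__":
    # Printing to stdout is enough: the workflow step appends stdout to
    # "$GITHUB_OUTPUT" via `tee -a`.
    print(format_outputs({"build": False, "test": True, "static": False}))
```

A caller would then read the values as `needs.skipper.outputs.skip_build`, `skip_test`, and `skip_static`, as in the sample usage in the workflow's header comment.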

									
.github/workflows/gatekeeper.yaml
												
				@@ -0,0 +1,55 @@

				name: Gatekeeper

				# Gatekeeper uses "skips.py" to determine which job names/regexps are

				# required for a given PR, waits for them to either complete or fail,

				# and reports the status.

				on:

				  pull_request_target: # zizmor: ignore[dangerous-triggers] See #11332.

				    types:

				      - opened

				      - synchronize

				      - reopened

				      - edited

				      - labeled

				      - unlabeled

				permissions: {}

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  gatekeeper:

				    name: gatekeeper

				    runs-on: ubuntu-22.04

				    permissions:

				      actions: read

				      contents: read

				      issues: read

				      pull-requests: read

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ github.event.pull_request.head.sha }}

				          fetch-depth: 0

				          persist-credentials: false

				      - id: gatekeeper

				        env:

				          TARGET_BRANCH: ${{ github.event.pull_request.base.ref }}

				          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

				          COMMIT_HASH: ${{ github.event.pull_request.head.sha }}

				          GH_PR_NUMBER: ${{ github.event.pull_request.number }}

				        run: |

				          #!/usr/bin/env bash -x

				          mapfile -t lines < <(python3 tools/testing/gatekeeper/skips.py -t)

				          export REQUIRED_JOBS="${lines[0]}"

				          export REQUIRED_REGEXPS="${lines[1]}"

				          export REQUIRED_LABELS="${lines[2]}"

				          echo "REQUIRED_JOBS: $REQUIRED_JOBS"

				          echo "REQUIRED_REGEXPS: $REQUIRED_REGEXPS"

				          echo "REQUIRED_LABELS: $REQUIRED_LABELS"

				          python3 tools/testing/gatekeeper/jobs.py

				          exit $?

				        shell: /usr/bin/bash -x {0}
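The `gatekeeper` step reads the first three lines printed by `skips.py -t` and exports them, in order, as `REQUIRED_JOBS`, `REQUIRED_REGEXPS`, and `REQUIRED_LABELS`. The same positional mapping can be sketched in Python, treating each line as an opaque string since its internal format is not shown here; `parse_required` and the sample input are hypothetical:

```python
from typing import Dict

NAMES = ("REQUIRED_JOBS", "REQUIRED_REGEXPS", "REQUIRED_LABELS")


def parse_required(output: str) -> Dict[str, str]:
    """Map the first three output lines onto the three environment variable names."""
    lines = output.splitlines()
    # Like an unset bash array element, a missing line becomes an empty string.
    lines += [""] * (len(NAMES) - len(lines))
    return dict(zip(NAMES, lines))


if __name__ == "__main__":
    sample = "job-a;job-b\nregex-.*\nlabel-x"
    for name, value in parse_required(sample).items():
        print(f"{name}: {value}")
```

The mapping depends purely on line order, so any change to what `skips.py -t` prints has to keep that order for the step to export the right values.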

									
.github/workflows/govulncheck.yaml
												
				@@ -0,0 +1,53 @@

				on:

				  workflow_call:

				name: Govulncheck

				permissions: {}

				jobs:

				  govulncheck:

				    name: govulncheck

				    runs-on: ubuntu-22.04

				    strategy:

				      matrix:

				        include:

				          - binary: "kata-runtime"

				            make_target: "runtime"

				          - binary: "containerd-shim-kata-v2"

				            make_target: "containerd-shim-v2"

				          - binary: "kata-monitor"

				            make_target: "monitor"

				      fail-fast: false

				    steps:

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Install golang

				        run: |

				          ./tests/install_go.sh -f -p

				          echo "/usr/local/go/bin" >> "${GITHUB_PATH}"

				      - name: Install govulncheck

				        run: |

				          go install golang.org/x/vuln/cmd/govulncheck@latest

				          echo "${HOME}/go/bin" >> "${GITHUB_PATH}"

				      - name: Build runtime binaries

				        run: |

				          cd src/runtime

				          make "${MAKE_TARGET}"

				        env:

				          MAKE_TARGET: ${{ matrix.make_target }}

				          SKIP_GO_VERSION_CHECK: "1"

				      - name: Run govulncheck on ${{ matrix.binary }}

				        env:

				          BINARY: ${{ matrix.binary }}

				        run: |

				          cd src/runtime

				          bash ../../tests/govulncheck-runner.sh "./${BINARY}"

									
										68

.github/workflows/kata-deploy-push.yaml
									
										vendored
									
											
				@@ -1,68 +0,0 @@

				name: kata deploy build

				on: [push, pull_request]

				jobs:

				  build-asset:

				    runs-on: ubuntu-latest

				    strategy:

				      matrix:

				        asset:

				          - kernel

				          - kernel-experimental

				          - shim-v2

				          - qemu

				          - cloud-hypervisor

				          - firecracker

				          - rootfs-image

				          - rootfs-initrd

				    steps:

				      - uses: actions/checkout@v2

				      - name: Install docker

				        run: |

				          curl -fsSL https://test.docker.com -o test-docker.sh

				          sh test-docker.sh

				      - name: Build ${{ matrix.asset }}

				        run: |

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

				          sudo cp -r --preserve=all "${build_dir}" "kata-build"

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@v2

				        with:

				          name: kata-artifacts

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.xz

				          if-no-files-found: error

				  create-kata-tarball:

				    runs-on: ubuntu-latest

				    needs: build-asset

				    steps:

				      - uses: actions/checkout@v2

				      - name: get-artifacts

				        uses: actions/download-artifact@v2

				        with:

				          name: kata-artifacts

				          path: build

				      - name: merge-artifacts

				        run: |

				          make merge-builds

				      - name: store-artifacts

				        uses: actions/upload-artifact@v2

				        with:

				          name: kata-static-tarball

				          path: kata-static.tar.xz

				  make-kata-tarball:

				    runs-on: ubuntu-latest

				    steps:

				      - uses: actions/checkout@v2

				      - name: make kata-tarball

				        run: |

				          make kata-tarball

				          sudo make install-tarball

									
										51

.github/workflows/kata-deploy-test.yaml
									
										vendored
									
											
				@@ -1,51 +0,0 @@

				on:

				  issue_comment:

				    types: [created, edited]

				name: test-kata-deploy

				jobs:

				  create-and-test-container:

				    if: |

				      github.event.issue.pull_request

				      && github.event_name == 'issue_comment'

				      && github.event.action == 'created'

				      && startsWith(github.event.comment.body, '/test_kata_deploy')

				    runs-on: ubuntu-latest

				    steps:

				      - name: get-PR-ref

				        id: get-PR-ref

				        run: |

				            ref=$(cat $GITHUB_EVENT_PATH | jq -r '.issue.pull_request.url' | sed  's#^.*\/pulls#refs\/pull#' | sed 's#$#\/merge#')

				            echo "reference for PR: " ${ref}

				            echo "##[set-output name=pr-ref;]${ref}"

				      - name: check out

				        uses: actions/checkout@v2

				        with:

				           ref: ${{ steps.get-PR-ref.outputs.pr-ref }}

				      - name: build-container-image

				        id: build-container-image

				        run: |

				            PR_SHA=$(git log --format=format:%H -n1)

				            VERSION="2.0.0"

				            ARTIFACT_URL="https://github.com/kata-containers/kata-containers/releases/download/${VERSION}/kata-static-${VERSION}-x86_64.tar.xz"

				            wget "${ARTIFACT_URL}" -O tools/packaging/kata-deploy/kata-static.tar.xz

				            docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:${PR_SHA} -t quay.io/kata-containers/kata-deploy-ci:${PR_SHA} ./tools/packaging/kata-deploy

				            docker login -u ${{ secrets.DOCKER_USERNAME }} -p ${{ secrets.DOCKER_PASSWORD }}

				            docker push katadocker/kata-deploy-ci:$PR_SHA

				            docker login -u ${{ secrets.QUAY_DEPLOYER_USERNAME }} -p ${{ secrets.QUAY_DEPLOYER_PASSWORD }} quay.io

				            docker push quay.io/kata-containers/kata-deploy-ci:$PR_SHA

				            echo "##[set-output name=pr-sha;]${PR_SHA}"

				      - name: test-kata-deploy-ci-in-aks

				        uses: ./tools/packaging/kata-deploy/action

				        with:

				          packaging-sha: ${{ steps.build-container-image.outputs.pr-sha }}

				        env:

				          PKG_SHA: ${{ steps.build-container-image.outputs.pr-sha }}

				          AZ_APPID: ${{ secrets.AZ_APPID }}

				          AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}

				          AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				          AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
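
The `get-PR-ref` step in the removed workflow above turns the pull request API URL into a merge ref with two `sed` substitutions. The following is a minimal sketch of that transformation only; the helper name `pr_url_to_merge_ref` and the PR number are made up for illustration and are not part of the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the sed pipeline from the get-PR-ref step.
pr_url_to_merge_ref() {
    # Replace everything up to and including ".../pulls" with "refs/pull",
    # then append "/merge" to the end of the line.
    printf '%s\n' "$1" | sed 's#^.*/pulls#refs/pull#' | sed 's#$#/merge#'
}

# Illustrative URL with a made-up PR number (an assumption, not from the source).
pr_url_to_merge_ref "https://api.github.com/repos/kata-containers/kata-containers/pulls/123"
```

Because `#` is used as the `sed` delimiter, the slashes in the pattern need no escaping; the `\/` sequences in the original step are equivalent to plain `/`.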

									
										295

.github/workflows/main.yaml
									
										vendored
									
											
				@@ -1,295 +0,0 @@

				name: Publish release tarball

				on: 

				  push: 

				    tags:

				     - '1.*'

				jobs:

				  get-artifact-list:

				    runs-on: ubuntu-latest

				    steps:

				      - name: get the list

				        run: |

				         pushd $GITHUB_WORKSPACE

				         tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				         git checkout $tag

				         popd

				         $GITHUB_WORKSPACE/tools/packaging/artifact-list.sh > artifact-list.txt

				      - name: save-artifact-list

				        uses: actions/upload-artifact@master

				        with:

				          name: artifact-list

				          path: artifact-list.txt

				  build-kernel:

				    runs-on: ubuntu-16.04

				    needs: get-artifact-list

				    env:

				      buildstr: "install_kernel"

				    steps:

				      - uses: actions/checkout@v1

				      - name: get-artifact-list

				        uses: actions/download-artifact@master

				        with:

				          name: artifact-list

				      - run: |

				         sudo apt-get update && sudo apt install -y flex bison libelf-dev bc iptables

				      - name: build-kernel

				        run: |

				         if grep -q $buildstr ./artifact-list/artifact-list.txt; then

				           $GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr

				           echo "artifact-built=true" >> $GITHUB_ENV

				         else

				           echo "artifact-built=false" >> $GITHUB_ENV

				         fi

				      - name: store-artifacts

				        if: ${{ env.artifact-built }} == 'true'

				        uses: actions/upload-artifact@master

				        with:

				          name: kata-artifacts

				          path: kata-static-kernel.tar.gz

				  build-experimental-kernel:

				    runs-on: ubuntu-16.04

				    needs: get-artifact-list

				    env:

				      buildstr: "install_experimental_kernel"

				    steps:

				      - uses: actions/checkout@v1

				      - name: get-artifact-list

				        uses: actions/download-artifact@master

				        with:

				          name: artifact-list

				      - run: |

				         sudo apt-get update && sudo apt install -y flex bison libelf-dev bc iptables

				      - name: build-experimental-kernel

				        run: |

				         if grep -q $buildstr ./artifact-list/artifact-list.txt; then

				           $GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr

				           echo "artifact-built=true" >> $GITHUB_ENV

				         else

				           echo "artifact-built=false" >> $GITHUB_ENV

				         fi

				      - name: store-artifacts

				        if: ${{ env.artifact-built }} == 'true'

				        uses: actions/upload-artifact@master

				        with:

				          name: kata-artifacts

				          path: kata-static-experimental-kernel.tar.gz

				  build-qemu:

				    runs-on: ubuntu-16.04

				    needs: get-artifact-list

				    env:

				      buildstr: "install_qemu"

				    steps:

				      - uses: actions/checkout@v1

				      - name: get-artifact-list

				        uses: actions/download-artifact@master

				        with:

				          name: artifact-list

				      - name: build-qemu

				        run: |

				         if grep -q $buildstr ./artifact-list/artifact-list.txt; then

				           $GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr

				           echo "artifact-built=true" >> $GITHUB_ENV

				         else

				           echo "artifact-built=false" >> $GITHUB_ENV

				         fi

				      - name: store-artifacts

				        if: ${{ env.artifact-built }} == 'true'

				        uses: actions/upload-artifact@master

				        with:

				          name: kata-artifacts

				          path: kata-static-qemu.tar.gz

				  # Job for building the image

				  build-image:

				    runs-on: ubuntu-16.04

				    needs: get-artifact-list

				    env:

				      buildstr: "install_image"

				    steps:

				      - uses: actions/checkout@v1

				      - name: get-artifact-list

				        uses: actions/download-artifact@master

				        with:

				          name: artifact-list

				      - name: build-image

				        run: |

				         if grep -q $buildstr ./artifact-list/artifact-list.txt; then

				           $GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr

				           echo "artifact-built=true" >> $GITHUB_ENV

				         else

				           echo "artifact-built=false" >> $GITHUB_ENV

				         fi

				      - name: store-artifacts

				        if: ${{ env.artifact-built }} == 'true'

				        uses: actions/upload-artifact@master

				        with:

				          name: kata-artifacts

				          path: kata-static-image.tar.gz

				  # Job for building firecracker hypervisor

				  build-firecracker:

				    runs-on: ubuntu-16.04

				    needs: get-artifact-list

				    env:

				      buildstr: "install_firecracker"

				    steps:

				      - uses: actions/checkout@v1

				      - name: get-artifact-list

				        uses: actions/download-artifact@master

				        with:

				          name: artifact-list

				      - name: build-firecracker

				        run: |

				         if grep -q $buildstr ./artifact-list/artifact-list.txt; then

				           $GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr

				           echo "artifact-built=true" >> $GITHUB_ENV

				         else

				           echo "artifact-built=false" >> $GITHUB_ENV

				         fi

				      - name: store-artifacts

				        if: ${{ env.artifact-built }} == 'true'

				        uses: actions/upload-artifact@master

				        with:

				          name: kata-artifacts

				          path: kata-static-firecracker.tar.gz

				  # Job for building cloud-hypervisor

				  build-clh:

				    runs-on: ubuntu-16.04

				    needs: get-artifact-list

				    env:

				      buildstr: "install_clh"

				    steps:

				      - uses: actions/checkout@v1

				      - name: get-artifact-list

				        uses: actions/download-artifact@master

				        with:

				          name: artifact-list

				      - name: build-clh

				        run: |

				         if grep -q $buildstr ./artifact-list/artifact-list.txt; then

				           $GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr

				           echo "artifact-built=true" >> $GITHUB_ENV

				         else

				           echo "artifact-built=false" >> $GITHUB_ENV

				         fi

				      - name: store-artifacts

				        if: ${{ env.artifact-built }} == 'true'

				        uses: actions/upload-artifact@master

				        with:

				          name: kata-artifacts

				          path: kata-static-clh.tar.gz

				  # Job for building kata components

				  build-kata-components:

				    runs-on: ubuntu-16.04

				    needs: get-artifact-list

				    env:

				      buildstr: "install_kata_components"

				    steps:

				      - uses: actions/checkout@v1

				      - name: get-artifact-list

				        uses: actions/download-artifact@master

				        with:

				          name: artifact-list

				      - name: build-kata-components

				        run: |

				         if grep -q $buildstr ./artifact-list/artifact-list.txt; then

				           $GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr

				           echo "artifact-built=true" >> $GITHUB_ENV

				         else

				           echo "artifact-built=false" >> $GITHUB_ENV

				         fi

				      - name: store-artifacts

				        if: ${{ env.artifact-built }} == 'true'

				        uses: actions/upload-artifact@master

				        with:

				          name: kata-artifacts

				          path: kata-static-kata-components.tar.gz

				  gather-artifacts:

				    runs-on: ubuntu-16.04

				    needs: [build-experimental-kernel, build-kernel, build-qemu, build-image, build-firecracker, build-kata-components, build-clh]

				    steps:

				      - uses: actions/checkout@v1

				      - name: get-artifacts

				        uses: actions/download-artifact@master

				        with:

				          name: kata-artifacts

				      - name: colate-artifacts

				        run: |

				          $GITHUB_WORKSPACE/.github/workflows/gather-artifacts.sh

				      - name: store-artifacts

				        uses: actions/upload-artifact@master

				        with:

				          name: release-candidate

				          path: kata-static.tar.xz

				  kata-deploy:

				    needs: gather-artifacts

				    runs-on: ubuntu-latest

				    steps:

				      - name: get-artifacts

				        uses: actions/download-artifact@master

				        with:

				          name: release-candidate

				      - name: build-and-push-kata-deploy-ci

				        id: build-and-push-kata-deploy-ci

				        run: |

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          git clone https://github.com/kata-containers/packaging

				          pushd packaging

				          git checkout $tag

				          pkg_sha=$(git rev-parse HEAD)

				          popd

				          mv release-candidate/kata-static.tar.xz ./packaging/kata-deploy/kata-static.tar.xz

				          docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:$pkg_sha -t quay.io/kata-containers/kata-deploy-ci:$pkg_sha ./packaging/kata-deploy

				          docker login -u ${{ secrets.DOCKER_USERNAME }} -p ${{ secrets.DOCKER_PASSWORD }}

				          docker push katadocker/kata-deploy-ci:$pkg_sha

				          docker login -u ${{ secrets.QUAY_DEPLOYER_USERNAME }} -p ${{ secrets.QUAY_DEPLOYER_PASSWORD }} quay.io

				          docker push quay.io/kata-containers/kata-deploy-ci:$pkg_sha

				          echo "::set-output name=PKG_SHA::${pkg_sha}"

				      - name: test-kata-deploy-ci-in-aks

				        uses: ./packaging/kata-deploy/action

				        with:

				          packaging-sha: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}

				        env:

				          PKG_SHA: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}

				          AZ_APPID: ${{ secrets.AZ_APPID }}

				          AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}

				          AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				          AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      - name: push-tarball

				        run: |

				          # tag the container image we created and push to DockerHub

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          docker tag katadocker/kata-deploy-ci:${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}} katadocker/kata-deploy:${tag}

				          docker push katadocker/kata-deploy:${tag}

				  upload-static-tarball:

				    needs: kata-deploy

				    runs-on: ubuntu-latest

				    steps:

				      - name: download-artifacts

				        uses: actions/download-artifact@master

				        with:

				          name: release-candidate

				      - name: install hub

				        run: |

				          HUB_VER=$(curl -s "https://api.github.com/repos/github/hub/releases/latest" | jq -r .tag_name | sed 's/^v//')

				          wget -q -O- https://github.com/github/hub/releases/download/v$HUB_VER/hub-linux-amd64-$HUB_VER.tgz | \

				          tar xz --strip-components=2 --wildcards '*/bin/hub' && sudo mv hub /usr/local/bin/hub

				      - name: push static tarball to github

				        run: |

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          tarball="kata-static-$tag-x86_64.tar.xz"

				          repo="https://github.com/kata-containers/runtime.git"

				          mv release-candidate/kata-static.tar.xz "release-candidate/${tarball}"

				          git clone "${repo}"

				          cd runtime

				          echo "uploading asset '${tarball}' to '${repo}' tag: ${tag}"

				          GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "../release-candidate/${tarball}" "${tag}"
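
Several steps in the removed workflow above derive the release tag from `GITHUB_REF` with `cut -d/ -f3-` and then build the tarball name from it. A minimal sketch of that derivation follows; the helper name `ref_to_tag` and the sample ref are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring: tag=$(echo $GITHUB_REF | cut -d/ -f3-)
ref_to_tag() {
    # Fields 1 and 2 are "refs" and "tags"; "-f3-" keeps everything after
    # them, so a tag that itself contains "/" is preserved whole.
    printf '%s\n' "$1" | cut -d/ -f3-
}

# Illustrative ref (an assumption, not taken from the source).
tag="$(ref_to_tag "refs/tags/1.0.0-example")"
tarball="kata-static-${tag}-x86_64.tar.xz"
echo "${tarball}"
```

Using `-f3-` rather than `-f3` is what keeps a tag containing a slash intact instead of truncating it at the first `/`.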

									
										78

.github/workflows/move-issues-to-in-progress.yaml
									
										vendored
									
											
				@@ -1,78 +0,0 @@

				# Copyright (c) 2020 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				name: Move issues to "In progress" in backlog project when referenced by a PR

				on:

				  pull_request_target:

				    types:

				      - opened

				      - reopened

				jobs:

				  move-linked-issues-to-in-progress:

				    runs-on: ubuntu-latest

				    steps:

				      - name: Install hub

				        run: |

				          HUB_ARCH="amd64"

				          HUB_VER=$(curl -sL "https://api.github.com/repos/github/hub/releases/latest" |\

				            jq -r .tag_name | sed 's/^v//')

				          curl -sL \

				            "https://github.com/github/hub/releases/download/v${HUB_VER}/hub-linux-${HUB_ARCH}-${HUB_VER}.tgz" |\

				          tar xz --strip-components=2 --wildcards '*/bin/hub' && \

				          sudo install hub /usr/local/bin

				      - name: Install hub extension script

				        run: |

				          # Clone into a temporary directory to avoid overwriting

				          # any existing github directory.

				          pushd $(mktemp -d) &>/dev/null

				          git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts

				          sudo install hub-util.sh /usr/local/bin

				          popd &>/dev/null

				      - name: Checkout code to allow hub to communicate with the project

				        uses: actions/checkout@v2

				      - name: Move issue to "In progress"

				        env:

				          GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_TOKEN }}

				        run: |

				          pr=${{ github.event.pull_request.number }}

				          linked_issue_urls=$(hub-util.sh \

				            list-issues-for-pr "$pr" |\

				            grep -v "^\#"  |\

				            cut -d';' -f3 || true)

				          # PR doesn't have any linked issues

				          # (it should, but maybe a new user forgot to add a "Fixes: #XXX" commit).

				          [ -z "$linked_issue_urls" ] && {

				            echo "::error::No linked issues for PR $pr"

				            exit 1

				          }

				          project_name="Issue backlog"

				          project_type="org"

				          project_column="In progress"

				          for issue_url in $(echo "$linked_issue_urls")

				          do

				            issue=$(echo "$issue_url"| awk -F\/ '{print $NF}' || true)

				            [ -z "$issue" ] && {

				              echo "::error::Cannot determine issue number from $issue_url for PR $pr"

				              exit 1

				            }

				            # Move the issue to the correct column on the project board

				            hub-util.sh \

				              move-issue \

				              "$issue" \

				              "$project_name" \

				              "$project_type" \

				              "$project_column"

				          done

									
										35

.github/workflows/nydus-snapshotter-version-in-sync.yaml
									
										vendored
									
										Normal file
									
												
				@@ -0,0 +1,35 @@

				name: nydus-snapshotter-version-sync

				on:

				  pull_request:

				    types:

				      - opened

				      - edited

				      - reopened

				      - synchronize

				permissions: {}

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  nydus-snapshotter-version-check:

				    name: nydus-snapshotter-version-check

				    runs-on: ubuntu-22.04

				    steps:

				    - name: Checkout code

				      uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      with:

				        persist-credentials: false

				    - name: Ensure nydus-snapshotter-version is in sync inside our repo

				      run: |

				        dockerfile_version=$(grep "ARG NYDUS_SNAPSHOTTER_VERSION" tools/packaging/kata-deploy/Dockerfile | cut -f2 -d'=')

				        versions_version=$(yq ".externals.nydus-snapshotter.version | explode(.)" versions.yaml)

				        if [[ "${dockerfile_version}" != "${versions_version}" ]]; then

				          echo "nydus-snapshotter version must be the same in the following places: "

				          echo "- versions.yaml: ${versions_version}"

				          echo "- tools/packaging/kata-deploy/Dockerfile: ${dockerfile_version}"

				          exit 1

				        fi
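
The check above reads the version from the Dockerfile with `grep` and `cut`, then compares it with the value from `versions.yaml`. The sketch below reproduces only the Dockerfile half against a throwaway file, so it runs without `yq`; the helper name `dockerfile_nydus_version` and the version string are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the value of ARG NYDUS_SNAPSHOTTER_VERSION
# from a Dockerfile the same way the workflow step does (grep + cut).
dockerfile_nydus_version() {
    grep "ARG NYDUS_SNAPSHOTTER_VERSION" "$1" | cut -f2 -d'='
}

# Throwaway Dockerfile with a made-up version (an assumption for the demo).
tmp_dockerfile="$(mktemp)"
printf 'FROM scratch\nARG NYDUS_SNAPSHOTTER_VERSION=v0.0.0-example\n' > "${tmp_dockerfile}"

expected="v0.0.0-example"   # stands in for the value read from versions.yaml
actual="$(dockerfile_nydus_version "${tmp_dockerfile}")"

if [ "${actual}" != "${expected}" ]; then
    echo "version mismatch: ${actual} != ${expected}"
else
    echo "versions in sync: ${actual}"
fi
rm -f "${tmp_dockerfile}"
```

Note that `cut -f2 -d'='` returns only the text between the first and second `=`, so this approach assumes the version string itself contains no `=`.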

									
										43

.github/workflows/osv-scanner.yaml
									
										vendored
									
										Normal file
									
												
				@@ -0,0 +1,43 @@

				# A sample workflow which sets up periodic OSV-Scanner scanning for vulnerabilities,

				# in addition to a PR check which fails if new vulnerabilities are introduced.

				#

				# For more examples and options, including how to ignore specific vulnerabilities,

				# see https://google.github.io/osv-scanner/github-action/

				name: OSV-Scanner

				on:

				  workflow_dispatch:

				  pull_request:

				    branches: [ "main" ]

				  schedule:

				    - cron: '0 1 * * 0'

				  push:

				    branches: [ "main" ]

				permissions: {}

				jobs:

				  scan-scheduled:

				    permissions:

				      actions: read # Required to upload SARIF file to CodeQL

				      contents: read  # Read commit contents

				      security-events: write  # Require writing security events to upload SARIF file to security tab

				    if: ${{ github.event_name == 'push' || github.event_name == 'schedule' || github.event_name == 'workflow_dispatch' }}

				    uses: "google/osv-scanner-action/.github/workflows/osv-scanner-reusable.yml@b00f71e051ddddc6e46a193c31c8c0bf283bf9e6" # v2.1.0

				    with:

				      scan-args: |-

				        -r

				        ./

				  scan-pr:

				    permissions:

				      actions: read # Required to upload SARIF file to CodeQL

				      contents: read  # Read commit contents

				      security-events: write  # Require writing security events to upload SARIF file to security tab

				    if: ${{ github.event_name == 'pull_request' }}

				    uses: "google/osv-scanner-action/.github/workflows/osv-scanner-reusable-pr.yml@b00f71e051ddddc6e46a193c31c8c0bf283bf9e6" # v2.1.0

				    with:

				      # Example of specifying custom arguments

				      scan-args: |-

				        -r

				        ./

									
										207

.github/workflows/payload-after-push.yaml
									
										vendored
									
										Normal file
									
												
				@@ -0,0 +1,207 @@

				name: CI | Publish Kata Containers payload

				on:

				  push:

				    branches:

				      - main

				  workflow_dispatch:

				permissions: {}

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				jobs:

				  build-assets-amd64:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-amd64.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      push-to-registry: yes

				      target-branch: ${{ github.ref_name }}

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

				  build-assets-arm64:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-arm64.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      push-to-registry: yes

				      target-branch: ${{ github.ref_name }}

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

				  build-assets-s390x:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-s390x.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      push-to-registry: yes

				      target-branch: ${{ github.ref_name }}

				    secrets:

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-assets-ppc64le:

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/build-kata-static-tarball-ppc64le.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      push-to-registry: yes

				      target-branch: ${{ github.ref_name }}

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-kata-deploy-payload-amd64:

				    needs: build-assets-amd64

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      registry: quay.io

				      repo: kata-containers/kata-deploy-ci

				      tag: kata-containers-latest-amd64

				      target-branch: ${{ github.ref_name }}

				      runner: ubuntu-22.04

				      arch: amd64

				      build-type: "" # Use script-based build (default)

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-kata-deploy-payload-arm64:

				    needs: build-assets-arm64

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      registry: quay.io

				      repo: kata-containers/kata-deploy-ci

				      tag: kata-containers-latest-arm64

				      target-branch: ${{ github.ref_name }}

				      runner: ubuntu-24.04-arm

				      arch: arm64

				      build-type: "" # Use script-based build (default)

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-kata-deploy-payload-s390x:

				    needs: build-assets-s390x

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      registry: quay.io

				      repo: kata-containers/kata-deploy-ci

				      tag: kata-containers-latest-s390x

				      target-branch: ${{ github.ref_name }}

				      runner: s390x

				      arch: s390x

				      build-type: "" # Use script-based build (default)

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-kata-deploy-payload-ppc64le:

				    needs: build-assets-ppc64le

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      registry: quay.io

				      repo: kata-containers/kata-deploy-ci

				      tag: kata-containers-latest-ppc64le

				      target-branch: ${{ github.ref_name }}

				      runner: ppc64le-small

				      arch: ppc64le

				      build-type: "" # Use script-based build (default)

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-manifest:

				    name: publish-manifest

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: read

				      packages: write

				    needs: [publish-kata-deploy-payload-amd64, publish-kata-deploy-payload-arm64, publish-kata-deploy-payload-s390x, publish-kata-deploy-payload-ppc64le]

				    steps:

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Login to Kata Containers quay.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - name: Push multi-arch manifest

				        run: |

				          ./tools/packaging/release/release.sh publish-multiarch-manifest

				        env:

				          KATA_DEPLOY_IMAGE_TAGS: "kata-containers-latest"

				          KATA_DEPLOY_REGISTRIES: "quay.io/kata-containers/kata-deploy-ci"

				  upload-helm-chart-tarball:

				    name: upload-helm-chart-tarball

				    needs: publish-manifest

				    runs-on: ubuntu-22.04

				    permissions:

				      packages: write # needed to push the helm chart to ghcr.io

				    steps:

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Install helm

				        uses: azure/setup-helm@fe7b79cd5ee1e45176fcad797de68ecaf3ca4814 # v4.2.0

				        id: install

				      - name: Login to the OCI registries

				        env:

				          QUAY_DEPLOYER_USERNAME: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				          GITHUB_TOKEN: ${{ github.token }}

				        run: |

				          echo "${QUAY_DEPLOYER_PASSWORD}" | helm registry login quay.io --username "${QUAY_DEPLOYER_USERNAME}" --password-stdin

				          echo "${GITHUB_TOKEN}" | helm registry login ghcr.io --username "${GITHUB_ACTOR}" --password-stdin

				      - name: Push helm chart to the OCI registries

				        run: |

				          echo "Adjusting the Chart.yaml and values.yaml"

				          yq eval '.version = "0.0.0-dev" | .appVersion = "0.0.0-dev"' -i tools/packaging/kata-deploy/helm-chart/kata-deploy/Chart.yaml

				          yq eval '.image.reference = "quay.io/kata-containers/kata-deploy-ci" | .image.tag = "kata-containers-latest"' -i tools/packaging/kata-deploy/helm-chart/kata-deploy/values.yaml

				          echo "Generating the chart package"

				          helm dependencies update tools/packaging/kata-deploy/helm-chart/kata-deploy

				          helm package tools/packaging/kata-deploy/helm-chart/kata-deploy

				          echo "Pushing the chart to the OCI registries"

				          helm push "kata-deploy-0.0.0-dev.tgz" oci://quay.io/kata-containers/kata-deploy-charts

				          helm push "kata-deploy-0.0.0-dev.tgz" oci://ghcr.io/kata-containers/kata-deploy-charts

									
.github/workflows/publish-kata-deploy-payload.yaml
												
				@@ -0,0 +1,115 @@

				name: CI | Publish kata-deploy payload

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				      runner:

				        default: 'ubuntu-22.04'

				        description: The runner to execute the workflow on. Defaults to 'ubuntu-22.04'.

				        required: false

				        type: string

				      arch:

				        description: The arch of the tarball.

				        required: true

				        type: string

				      build-type:

				        description: The build type for kata-deploy. Use 'rust' for Rust-based build, empty or omit for script-based (default).

				        required: false

				        type: string

				        default: ""

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions: {}

				jobs:

				  kata-payload:

				    name: kata-payload

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ${{ inputs.runner }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Remove unnecessary directories to free up space

				        run: |

				          sudo rm -rf /usr/local/.ghcup

				          sudo rm -rf /opt/hostedtoolcache/CodeQL

				          sudo rm -rf /usr/local/lib/android

				          sudo rm -rf /usr/share/dotnet

				          sudo rm -rf /opt/ghc

				          sudo rm -rf /usr/local/share/boost

				          sudo rm -rf /usr/lib/jvm

				          sudo rm -rf /usr/share/swift

				          sudo rm -rf /usr/local/share/powershell

				          sudo rm -rf /usr/local/julia*

				          sudo rm -rf /opt/az

				          sudo rm -rf /usr/local/share/chromium

				          sudo rm -rf /opt/microsoft

				          sudo rm -rf /opt/google

				          sudo rm -rf /usr/lib/firefox

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-kata-tarball for ${{ inputs.arch }}

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-${{ inputs.arch}}${{ inputs.tarball-suffix }}

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.registry == 'quay.io' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - name: Login to Kata Containers ghcr.io

				        if: ${{ inputs.registry == 'ghcr.io' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: build-and-push-kata-payload for ${{ inputs.arch }}

				        id: build-and-push-kata-payload

				        env:

				          REGISTRY: ${{ inputs.registry }}

				          REPO: ${{ inputs.repo }}

				          TAG: ${{ inputs.tag }}

				          BUILD_TYPE: ${{ inputs.build-type }}

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				          "$(pwd)/kata-static.tar.zst" \

				          "${REGISTRY}/${REPO}" \

				          "${TAG}" \

				          "${BUILD_TYPE}"

									
.github/workflows/release-amd64.yaml
												
				@@ -0,0 +1,82 @@

				name: Publish Kata release artifacts for amd64

				on:

				  workflow_call:

				    inputs:

				      target-arch:

				        required: true

				        type: string

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				      KBUILD_SIGN_PIN:

				        required: true

				permissions: {}

				jobs:

				  build-kata-static-tarball-amd64:

				    uses: ./.github/workflows/build-kata-static-tarball-amd64.yaml

				    with:

				      push-to-registry: yes

				      stage: release

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				  kata-deploy:

				    name: kata-deploy

				    needs: build-kata-static-tarball-amd64

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Login to Kata Containers quay.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64

				      - name: build-and-push-kata-deploy-ci-amd64

				        id: build-and-push-kata-deploy-ci-amd64

				        env:

				          TARGET_ARCH: ${{ inputs.target-arch }}

				        run: |

				          # GITHUB_REF has the form "refs/tags/<tag>" or "refs/heads/<branch>",

				          # so drop the first two path components to get the bare tag or branch name.

				          tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)

				          if [ "${tag}" = "main" ]; then

				              tag=$(./tools/packaging/release/release.sh release-version)

				              tags=("${tag}" "latest")

				          else

				              tags=("${tag}")

				          fi

				          for tag in "${tags[@]}"; do

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  "$(pwd)"/kata-static.tar.zst "ghcr.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  "$(pwd)"/kata-static.tar.zst "quay.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				          done
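The tag-derivation logic in the step above can be tried outside CI. The following is a minimal sketch, not part of the workflow: `derive_tags` is a hypothetical helper name, the version numbers are illustrative, and the release version is passed in as an argument instead of being read from `release.sh release-version`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): print the image tags that
# the release step would use for a given ref.
#   $1: ref string, e.g. "refs/tags/<tag>" or "refs/heads/<branch>"
#   $2: release version to use when the ref points at "main" (assumed input)
derive_tags() {
    local ref="$1"
    local release_version="$2"
    local tag

    # Keep everything from the third "/"-separated field onwards.
    tag=$(echo "${ref}" | cut -d/ -f3-)

    if [ "${tag}" = "main" ]; then
        # A run from main publishes the release version plus "latest".
        echo "${release_version} latest"
    else
        # Any other ref publishes a single tag named after the ref.
        echo "${tag}"
    fi
}

derive_tags "refs/heads/main" "3.2.0"
derive_tags "refs/tags/3.1.0" "3.2.0"
```

Because `cut` is given `-f3-` instead of `-f3`, a ref whose name contains slashes keeps its full name after the `refs/<kind>/` prefix.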

									
.github/workflows/release-arm64.yaml
												
				@@ -0,0 +1,82 @@

				name: Publish Kata release artifacts for arm64

				on:

				  workflow_call:

				    inputs:

				      target-arch:

				        required: true

				        type: string

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				      KBUILD_SIGN_PIN:

				        required: true

				permissions: {}

				jobs:

				  build-kata-static-tarball-arm64:

				    uses: ./.github/workflows/build-kata-static-tarball-arm64.yaml

				    with:

				      push-to-registry: yes

				      stage: release

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				  kata-deploy:

				    name: kata-deploy

				    needs: build-kata-static-tarball-arm64

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ubuntu-24.04-arm

				    steps:

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Login to Kata Containers quay.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-arm64

				      - name: build-and-push-kata-deploy-ci-arm64

				        id: build-and-push-kata-deploy-ci-arm64

				        env:

				          TARGET_ARCH: ${{ inputs.target-arch }}

				        run: |

				          # GITHUB_REF has the form "refs/tags/<tag>" or "refs/heads/<branch>",

				          # so drop the first two path components to get the bare tag or branch name.

				          tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)

				          if [ "${tag}" = "main" ]; then

				              tag=$(./tools/packaging/release/release.sh release-version)

				              tags=("${tag}" "latest")

				          else

				              tags=("${tag}")

				          fi

				          for tag in "${tags[@]}"; do

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  "$(pwd)"/kata-static.tar.zst "ghcr.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  "$(pwd)"/kata-static.tar.zst "quay.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				          done

									
.github/workflows/release-ppc64le.yaml
												
				@@ -0,0 +1,79 @@

				name: Publish Kata release artifacts for ppc64le

				on:

				  workflow_call:

				    inputs:

				      target-arch:

				        required: true

				        type: string

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions: {}

				jobs:

				  build-kata-static-tarball-ppc64le:

				    uses: ./.github/workflows/build-kata-static-tarball-ppc64le.yaml

				    with:

				      push-to-registry: yes

				      stage: release

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				  kata-deploy:

				    name: kata-deploy

				    needs: build-kata-static-tarball-ppc64le

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ubuntu-24.04-ppc64le

				    steps:

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Login to Kata Containers quay.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-ppc64le

				      - name: build-and-push-kata-deploy-ci-ppc64le

				        id: build-and-push-kata-deploy-ci-ppc64le

				        env:

				          TARGET_ARCH: ${{ inputs.target-arch }}

				        run: |

				          # GITHUB_REF has the form "refs/tags/<tag>" or "refs/heads/<branch>",

				          # so drop the first two path components to get the bare tag or branch name.

				          tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)

				          if [ "${tag}" = "main" ]; then

				              tag=$(./tools/packaging/release/release.sh release-version)

				              tags=("${tag}" "latest")

				          else

				              tags=("${tag}")

				          fi

				          for tag in "${tags[@]}"; do

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  "$(pwd)"/kata-static.tar.zst "ghcr.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  "$(pwd)"/kata-static.tar.zst "quay.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				          done

									
.github/workflows/release-s390x.yaml
												
				@@ -0,0 +1,83 @@

				name: Publish Kata release artifacts for s390x

				on:

				  workflow_call:

				    inputs:

				      target-arch:

				        required: true

				        type: string

				    secrets:

				      CI_HKD_PATH:

				        required: true

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions: {}

				jobs:

				  build-kata-static-tarball-s390x:

				    uses: ./.github/workflows/build-kata-static-tarball-s390x.yaml

				    with:

				      push-to-registry: yes

				      stage: release

				    secrets:

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				  kata-deploy:

				    name: kata-deploy

				    needs: build-kata-static-tarball-s390x

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ubuntu-24.04-s390x

				    steps:

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Login to Kata Containers quay.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-s390x

				      - name: build-and-push-kata-deploy-ci-s390x

				        id: build-and-push-kata-deploy-ci-s390x

				        env:

				          TARGET_ARCH: ${{ inputs.target-arch }}

				        run: |

				          # GITHUB_REF has the form "refs/tags/<tag>" or "refs/heads/<branch>",

				          # so drop the first two path components to get the bare tag or branch name.

				          tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)

				          if [ "${tag}" = "main" ]; then

				              tag=$(./tools/packaging/release/release.sh release-version)

				              tags=("${tag}" "latest")

				          else

				              tags=("${tag}")

				          fi

				          for tag in "${tags[@]}"; do

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  "$(pwd)"/kata-static.tar.zst "ghcr.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  "$(pwd)"/kata-static.tar.zst "quay.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				          done

									
.github/workflows/release.yaml
												
				@@ -1,178 +1,309 @@

				name: Publish Kata 2.x release artifacts

				name: Release Kata Containers

				on:

				  push:

				    tags:

				     - '2.*'

				  workflow_dispatch

				permissions: {}

				jobs:

				  build-asset:

				    runs-on: ubuntu-latest

				    strategy:

				      matrix:

				        asset:

				          - cloud-hypervisor

				          - firecracker

				          - kernel

				          - qemu

				          - rootfs-image

				          - rootfs-initrd

				          - shim-v2

				  release:

				    name: release

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: write # needed for the `gh release create` command

				    steps:

				      - uses: actions/checkout@v2

				      - name: Install docker

				        run: |

				          curl -fsSL https://test.docker.com -o test-docker.sh

				          sh test-docker.sh

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Build ${{ matrix.asset }}

				      - name: Create a new release

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-binaries-in-docker.sh --build="${KATA_ASSET}"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

				          sudo cp -r "${build_dir}" "kata-build"

				          ./tools/packaging/release/release.sh create-new-release

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          GH_TOKEN: ${{ github.token }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@v2

				        with:

				          name: kata-artifacts

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.xz

				          if-no-files-found: error

				  build-and-push-assets-amd64:

				    needs: release

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/release-amd64.yaml

				    with:

				      target-arch: amd64

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

				  create-kata-tarball:

				    runs-on: ubuntu-latest

				    needs: build-asset

				  build-and-push-assets-arm64:

				    needs: release

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/release-arm64.yaml

				    with:

				      target-arch: arm64

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

				  build-and-push-assets-s390x:

				    needs: release

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/release-s390x.yaml

				    with:

				      target-arch: s390x

				    secrets:

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-and-push-assets-ppc64le:

				    needs: release

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/release-ppc64le.yaml

				    with:

				      target-arch: ppc64le

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-multi-arch-images:

				    name: publish-multi-arch-images

				    runs-on: ubuntu-22.04

				    needs: [build-and-push-assets-amd64, build-and-push-assets-arm64, build-and-push-assets-s390x, build-and-push-assets-ppc64le]

				    permissions:

				      contents: write # needed for the `gh release` commands

				      packages: write # needed to push the multi-arch manifest to ghcr.io

				    steps:

				      - uses: actions/checkout@v2

				      - name: get-artifacts

				        uses: actions/download-artifact@v2

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          name: kata-artifacts

				          path: kata-artifacts

				      - name: merge-artifacts

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts

				      - name: store-artifacts

				        uses: actions/upload-artifact@v2

				        with:

				          name: kata-static-tarball

				          path: kata-static.tar.xz

				          persist-credentials: false

				  kata-deploy:

				    needs: create-kata-tarball

				    runs-on: ubuntu-latest

				    steps:

				      - uses: actions/checkout@v2

				      - name: get-kata-tarball

				        uses: actions/download-artifact@v2

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          name: kata-static-tarball

				      - name: build-and-push-kata-deploy-ci

				        id: build-and-push-kata-deploy-ci

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Login to Kata Containers quay.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - name: Get the image tags

				        run: |

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          pushd $GITHUB_WORKSPACE

				          git checkout $tag

				          pkg_sha=$(git rev-parse HEAD)

				          popd

				          mv kata-static.tar.xz $GITHUB_WORKSPACE/tools/packaging/kata-deploy/kata-static.tar.xz

				          docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:$pkg_sha -t quay.io/kata-containers/kata-deploy-ci:$pkg_sha $GITHUB_WORKSPACE/tools/packaging/kata-deploy

				          docker login -u ${{ secrets.DOCKER_USERNAME }} -p ${{ secrets.DOCKER_PASSWORD }}

				          docker push katadocker/kata-deploy-ci:$pkg_sha

				          docker login -u ${{ secrets.QUAY_DEPLOYER_USERNAME }} -p ${{ secrets.QUAY_DEPLOYER_PASSWORD }} quay.io

				          docker push quay.io/kata-containers/kata-deploy-ci:$pkg_sha

				          mkdir -p packaging/kata-deploy

				          ln -s $GITHUB_WORKSPACE/tools/packaging/kata-deploy/action packaging/kata-deploy/action

				          echo "::set-output name=PKG_SHA::${pkg_sha}"

				      - name: test-kata-deploy-ci-in-aks

				        uses: ./packaging/kata-deploy/action

				        with:

				          packaging-sha: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}

				          release_version=$(./tools/packaging/release/release.sh release-version)

				          echo "KATA_DEPLOY_IMAGE_TAGS=$release_version latest" >> "$GITHUB_ENV"

				      - name: Publish multi-arch manifest on quay.io & ghcr.io

				        run: |

				          ./tools/packaging/release/release.sh publish-multiarch-manifest

				        env:

				          PKG_SHA: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}

				          AZ_APPID: ${{ secrets.AZ_APPID }}

				          AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}

				          AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				          AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      - name: push-tarball

				        run: |

				          # tag the container image we created and push to DockerHub

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          tags=($tag)

				          tags+=($([[ "$tag" =~ "alpha"|"rc" ]] && echo "latest" || echo "stable"))

				          for tag in ${tags[@]}; do \

				            docker tag katadocker/kata-deploy-ci:${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}} katadocker/kata-deploy:${tag} && \

				            docker tag quay.io/kata-containers/kata-deploy-ci:${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}} quay.io/kata-containers/kata-deploy:${tag} && \

				            docker push katadocker/kata-deploy:${tag} && \

				            docker push quay.io/kata-containers/kata-deploy:${tag}; \

				          done

				          KATA_DEPLOY_REGISTRIES: "quay.io/kata-containers/kata-deploy ghcr.io/kata-containers/kata-deploy"

				  upload-static-tarball:

				    needs: kata-deploy

				    runs-on: ubuntu-latest

				  upload-multi-arch-static-tarball:

				    name: upload-multi-arch-static-tarball

				    needs: [build-and-push-assets-amd64, build-and-push-assets-arm64, build-and-push-assets-s390x, build-and-push-assets-ppc64le]

				    permissions:

				      contents: write # needed for the `gh release` commands

				    runs-on: ubuntu-22.04

				    steps:

				      - uses: actions/checkout@v2

				      - name: download-artifacts

				        uses: actions/download-artifact@v2

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          name: kata-static-tarball

				      - name: install hub

				          persist-credentials: false

				      - name: Set KATA_STATIC_TARBALL env var

				        run: |

				          HUB_VER=$(curl -s "https://api.github.com/repos/github/hub/releases/latest" | jq -r .tag_name | sed 's/^v//')

				          wget -q -O- https://github.com/github/hub/releases/download/v$HUB_VER/hub-linux-amd64-$HUB_VER.tgz | \

				          tar xz --strip-components=2 --wildcards '*/bin/hub' && sudo mv hub /usr/local/bin/hub

				      - name: push static tarball to github

				          tarball=$(pwd)/kata-static.tar.zst

				          echo "KATA_STATIC_TARBALL=${tarball}" >> "$GITHUB_ENV"

				      - name: Download amd64 artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64

				      - name: Upload amd64 static tarball to GitHub

				        run: |

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          tarball="kata-static-$tag-x86_64.tar.xz"

				          mv kata-static.tar.xz "$GITHUB_WORKSPACE/${tarball}"

				          pushd $GITHUB_WORKSPACE

				          echo "uploading asset '${tarball}' for tag: ${tag}"

				          GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"

				          popd

				          ./tools/packaging/release/release.sh upload-kata-static-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				          ARCHITECTURE: amd64

				      - name: Download arm64 artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-arm64

				      - name: Upload arm64 static tarball to GitHub

				        run: |

				          ./tools/packaging/release/release.sh upload-kata-static-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				          ARCHITECTURE: arm64

				      - name: Download s390x artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-s390x

				      - name: Upload s390x static tarball to GitHub

				        run: |

				          ./tools/packaging/release/release.sh upload-kata-static-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				          ARCHITECTURE: s390x

				      - name: Download ppc64le artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-ppc64le

				      - name: Upload ppc64le static tarball to GitHub

				        run: |

				          ./tools/packaging/release/release.sh upload-kata-static-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				          ARCHITECTURE: ppc64le

				      - name: Set KATA_TOOLS_STATIC_TARBALL env var

				        run: |

				          tarball=$(pwd)/kata-tools-static.tar.zst

				          echo "KATA_TOOLS_STATIC_TARBALL=${tarball}" >> "$GITHUB_ENV"

				      - name: Download amd64 tools artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-tools-static-tarball-amd64

				      - name: Upload amd64 static tarball tools to GitHub

				        run: |

				          ./tools/packaging/release/release.sh upload-kata-tools-static-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				          ARCHITECTURE: amd64

				  upload-versions-yaml:

				    name: upload-versions-yaml

				    needs: release

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: write # needed for the `gh release` commands

				    steps:

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Upload versions.yaml to GitHub

				        run: |

				          ./tools/packaging/release/release.sh upload-versions-yaml-file

				        env:

				          GH_TOKEN: ${{ github.token }}

				  upload-cargo-vendored-tarball:

				    needs: upload-static-tarball

				    runs-on: ubuntu-latest

				    name: upload-cargo-vendored-tarball

				    needs: release

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: write # needed for the `gh release` commands

				    steps:

				      - uses: actions/checkout@v2

				      - name: generate-and-upload-tarball

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Generate and upload vendored code tarball

				        run: |

				          pushd $GITHUB_WORKSPACE/src/agent

				          cargo vendor >> .cargo/config

				          popd

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          tarball="kata-containers-$tag-vendor.tar.gz"

				          pushd $GITHUB_WORKSPACE

				          tar -cvzf "${tarball}" src/agent/.cargo/config src/agent/vendor

				          GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}" 

				          popd

				          ./tools/packaging/release/release.sh upload-vendored-code-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				  upload-libseccomp-tarball:

				    needs: upload-cargo-vendored-tarball

				    runs-on: ubuntu-latest

				    name: upload-libseccomp-tarball

				    needs: release

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: write # needed for the `gh release` commands

				    steps:

				      - uses: actions/checkout@v2

				      - name: download-and-upload-tarball

				        env:

				          GITHUB_TOKEN: ${{ secrets.GIT_UPLOAD_TOKEN }}

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Download libseccomp tarball and upload it to GitHub

				        run: |

				          pushd $GITHUB_WORKSPACE

				          ./ci/install_yq.sh

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          versions_yaml="versions.yaml"

				          version=$(yq read ${versions_yaml} "externals.libseccomp.version")

				          repo_url=$(yq read ${versions_yaml} "externals.libseccomp.url")

				          download_url="${repo_url}/releases/download/v${version}"

				          tarball="libseccomp-${version}.tar.gz"

				          asc="${tarball}.asc"

				          curl -sSLO "${download_url}/${tarball}"

				          curl -sSLO "${download_url}/${asc}"

				          # "-m" option should be empty to re-use the existing release title

				          # without opening a text editor.

				          # For the details, check https://hub.github.com/hub-release.1.html.

				          hub release edit -m "" -a "${tarball}" "${tag}"

				          hub release edit -m "" -a "${asc}" "${tag}"

				          popd

				          ./tools/packaging/release/release.sh upload-libseccomp-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				  upload-helm-chart-tarball:

				    name: upload-helm-chart-tarball

				    needs: release

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: write # needed for the `gh release` commands

				      packages: write # needed to push the helm chart to ghcr.io

				    steps:

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Install helm

				        uses: azure/setup-helm@fe7b79cd5ee1e45176fcad797de68ecaf3ca4814 # v4.2.0

				        id: install

				      - name: Generate and upload helm chart tarball

				        run: |

				          ./tools/packaging/release/release.sh upload-helm-chart-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: Login to the OCI registries

				        env:

				          QUAY_DEPLOYER_USERNAME: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				          GITHUB_TOKEN: ${{ github.token }}

				        run: |

				          echo "${QUAY_DEPLOYER_PASSWORD}" | helm registry login quay.io --username "${QUAY_DEPLOYER_USERNAME}" --password-stdin

				          echo "${GITHUB_TOKEN}" | helm registry login ghcr.io --username "${GITHUB_ACTOR}" --password-stdin

				      - name: Push helm chart to the OCI registries

				        run: |

				          release_version=$(./tools/packaging/release/release.sh release-version)

				          helm push "kata-deploy-${release_version}.tgz" oci://quay.io/kata-containers/kata-deploy-charts

				          helm push "kata-deploy-${release_version}.tgz" oci://ghcr.io/kata-containers/kata-deploy-charts

				  publish-release:

				    name: publish-release

				    needs: [ build-and-push-assets-amd64, build-and-push-assets-arm64, build-and-push-assets-s390x, build-and-push-assets-ppc64le, publish-multi-arch-images, upload-multi-arch-static-tarball, upload-versions-yaml, upload-cargo-vendored-tarball, upload-libseccomp-tarball ]

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: write # needed for the `gh release` commands

				    steps:

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Publish a release

				        run: |

				          ./tools/packaging/release/release.sh publish-release

				        env:

				          GH_TOKEN: ${{ github.token }}

									
.github/workflows/require-pr-porting-labels.yaml
				@@ -1,51 +0,0 @@

				# Copyright (c) 2020 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				name: Ensure PR has required porting labels

				on:

				  pull_request_target:

				    types:

				      - opened

				      - reopened

				      - labeled

				      - unlabeled

				    branches:

				      - main

				jobs:

				  check-pr-porting-labels:

				    runs-on: ubuntu-latest

				    steps:

				      - name: Install hub

				        run: |

				          HUB_ARCH="amd64"

				          HUB_VER=$(curl -sL "https://api.github.com/repos/github/hub/releases/latest" |\

				            jq -r .tag_name | sed 's/^v//')

				          curl -sL \

				            "https://github.com/github/hub/releases/download/v${HUB_VER}/hub-linux-${HUB_ARCH}-${HUB_VER}.tgz" |\

				          tar xz --strip-components=2 --wildcards '*/bin/hub' && \

				          sudo install hub /usr/local/bin

				      - name: Checkout code to allow hub to communicate with the project

				        uses: actions/checkout@v2

				      - name: Install porting checker script

				        run: |

				          # Clone into a temporary directory to avoid overwriting

				          # any existing github directory.

				          pushd $(mktemp -d) &>/dev/null

				          git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts

				          sudo install pr-porting-checks.sh /usr/local/bin

				          popd &>/dev/null

				      - name: Stop PR being merged unless it has a correct set of porting labels

				        env:

				          GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_TOKEN }}

				        run: |

				          pr=${{ github.event.number }}

				          repo=${{ github.repository }}

				          pr-porting-checks.sh "$pr" "$repo"

									
.github/workflows/run-cri-containerd-tests.yaml
				@@ -0,0 +1,75 @@

				name: CI | Run cri-containerd tests

				permissions: {}

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				      runner:

				        description: The runner to execute the workflow on.

				        required: true

				        type: string

				      arch:

				        description: The arch of the tarball.

				        required: true

				        type: string

				      containerd_version:

				        description: The version of containerd for testing.

				        required: true

				        type: string

				      vmm:

				        description: The kata hypervisor for testing.

				        required: true

				        type: string

				jobs:

				  run-cri-containerd:

				    name: run-cri-containerd-${{ inputs.arch }} (${{ inputs.containerd_version }}, ${{ inputs.vmm }})

				    strategy:

				      fail-fast: false

				    runs-on: ${{ inputs.runner }}

				    env:

				      CONTAINERD_VERSION: ${{ inputs.containerd_version }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ inputs.vmm }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        timeout-minutes: 15

				        run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball for ${{ inputs.arch }}

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-${{ inputs.arch }}${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/cri-containerd/gha-run.sh install-kata kata-artifacts

				      - name: Run cri-containerd tests for ${{ inputs.arch }}

				        timeout-minutes: 10

				        run: bash tests/integration/cri-containerd/gha-run.sh run
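
The workflow above is reusable (`workflow_call`), so it only runs when another workflow calls it. A minimal sketch of such a caller, assuming it lives in the same repository and is itself invoked with `tag`, `commit-hash` and `target-branch` inputs. The job name, the matrix values, and the runner label are illustrative assumptions and are not taken from this change:

```yaml
jobs:
  run-cri-containerd-amd64:                 # hypothetical job name
    strategy:
      fail-fast: false
      matrix:
        containerd_version: [lts, active]   # assumed values
        vmm: [qemu, clh]                    # assumed values
    uses: ./.github/workflows/run-cri-containerd-tests.yaml
    with:
      tarball-suffix: -${{ inputs.tag }}    # assumed naming of the artifact suffix
      commit-hash: ${{ inputs.commit-hash }}
      target-branch: ${{ inputs.target-branch }}
      runner: ubuntu-22.04                  # assumed runner label
      arch: amd64
      containerd_version: ${{ matrix.containerd_version }}
      vmm: ${{ matrix.vmm }}
```

The four inputs marked `required: true` (`runner`, `arch`, `containerd_version`, `vmm`) must always be supplied; the other three may be omitted.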

									
.github/workflows/run-k8s-tests-on-aks.yaml
				@@ -0,0 +1,159 @@

				name: CI | Run kubernetes tests on AKS

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      AZ_APPID:

				        required: true

				      AZ_TENANT_ID:

				        required: true

				      AZ_SUBSCRIPTION_ID:

				        required: true

				permissions: {}

				jobs:

				  run-k8s-tests:

				    name: run-k8s-tests

				    strategy:

				      fail-fast: false

				      matrix:

				        host_os:

				          - ubuntu

				        vmm:

				          - clh

				          - dragonball

				          - qemu

				          - qemu-runtime-rs

				          - cloud-hypervisor

				        instance-type:

				          - small

				          - normal

				        include:

				          - host_os: cbl-mariner

				            vmm: clh

				            instance-type: small

				            genpolicy-pull-method: oci-distribution

				          - host_os: cbl-mariner

				            vmm: clh

				            instance-type: small

				            genpolicy-pull-method: containerd

				          - host_os: cbl-mariner

				            vmm: clh

				            instance-type: normal

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: read

				      id-token: write # Used for OIDC access to log into Azure

				    environment: ci

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HOST_OS: ${{ matrix.host_os }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: "vanilla"

				      K8S_TEST_HOST_TYPE: ${{ matrix.instance-type }}

				      GENPOLICY_PULL_METHOD: ${{ matrix.genpolicy-pull-method }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-kata-tools-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-tools-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-tools-artifacts

				      - name: Install kata-tools

				        run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-tools-artifacts

				      - name: Download Azure CLI

				        uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1

				        with:

				          version: 'latest'

				      - name: Log into the Azure account

				        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0

				        with:

				          client-id: ${{ secrets.AZ_APPID }}

				          tenant-id: ${{ secrets.AZ_TENANT_ID }}

				          subscription-id: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      - name: Create AKS cluster

				        uses: nick-fields/retry@ce71cc2ab81d554ebbe88c79ab5975992d79ba08 # v3.0.2

				        with:

				          timeout_minutes: 15

				          max_attempts: 20

				          retry_on: error

				          retry_wait_seconds: 10

				          command: bash tests/integration/kubernetes/gha-run.sh create-cluster

				      - name: Install `bats`

				        run: bash tests/integration/kubernetes/gha-run.sh install-bats

				      - name: Install `kubectl`

				        uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1

				        with:

				          version: 'latest'

				      - name: Download credentials for the Kubernetes CLI to use them

				        run: bash tests/integration/kubernetes/gha-run.sh get-cluster-credentials

				      - name: Deploy Kata

				        timeout-minutes: 20

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-aks

				      - name: Run tests

				        timeout-minutes: 60

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Report tests

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh report-tests

				      - name: Refresh OIDC token in case access token expired

				        if: always()

				        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0

				        with:

				          client-id: ${{ secrets.AZ_APPID }}

				          tenant-id: ${{ secrets.AZ_TENANT_ID }}

				          subscription-id: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      - name: Delete AKS cluster

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh delete-cluster
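
A called workflow cannot obtain more `GITHUB_TOKEN` permissions than its caller grants, so the `id-token: write` permission requested by the job above has to be granted on the calling side as well. A hedged sketch of a caller, assuming it sits in the same repository; the job name and the `registry`, `repo` and `inputs.*` values are placeholders and not part of this change:

```yaml
jobs:
  run-k8s-tests-on-aks:                     # hypothetical job name
    permissions:
      contents: read
      id-token: write                       # needed for the OIDC login to Azure
    uses: ./.github/workflows/run-k8s-tests-on-aks.yaml
    with:
      tarball-suffix: -${{ inputs.tag }}    # assumed naming of the artifact suffix
      registry: ghcr.io                     # assumed value
      repo: example-org/kata-deploy-ci      # placeholder repository path
      tag: ${{ inputs.tag }}
      pr-number: ${{ inputs.pr-number }}
      commit-hash: ${{ inputs.commit-hash }}
      target-branch: ${{ inputs.target-branch }}
    secrets:
      AZ_APPID: ${{ secrets.AZ_APPID }}
      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
```

Each secret is passed explicitly here, matching the three names declared under `secrets:` in the called workflow.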

									
.github/workflows/run-k8s-tests-on-arm64.yaml
				@@ -0,0 +1,90 @@

				name: CI | Run kubernetes tests on arm64

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions: {}

				jobs:

				  run-k8s-tests-on-arm64:

				    name: run-k8s-tests-on-arm64

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu

				        k8s:

				          - kubeadm

				    runs-on: arm64-k8s

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: ${{ matrix.k8s }}

				      K8S_TEST_HOST_TYPE: all

				      TARGET_ARCH: "aarch64"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Deploy Kata

				        timeout-minutes: 20

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata

				      - name: Install `bats`

				        run: bash tests/integration/kubernetes/gha-run.sh install-bats

				      - name: Run tests

				        timeout-minutes: 30

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Report tests

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh report-tests

				      - name: Collect artifacts ${{ matrix.vmm }}

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh collect-artifacts

				        continue-on-error: true

				      - name: Archive artifacts ${{ matrix.vmm }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: k8s-tests-${{ matrix.vmm }}-${{ matrix.k8s }}-${{ inputs.tag }}

				          path: /tmp/artifacts

				          retention-days: 1

				      - name: Delete kata-deploy

				        if: always()

				        timeout-minutes: 15

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup

									
.github/workflows/run-k8s-tests-on-nvidia-gpu.yaml
				@@ -0,0 +1,130 @@

				name: CI | Run NVIDIA GPU kubernetes tests on amd64

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: true

				        type: string

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      NGC_API_KEY:

				        required: true

				permissions: {}

				jobs:

				  run-nvidia-gpu-tests-on-amd64:

				    name: run-${{ matrix.environment.name }}-tests-on-amd64

				    strategy:

				      fail-fast: false

				      matrix:

				        environment: [

				          { name: nvidia-gpu,     vmm: qemu-nvidia-gpu,     runner: amd64-nvidia-a100 },

				          { name: nvidia-gpu-snp, vmm: qemu-nvidia-gpu-snp, runner: amd64-nvidia-h100-snp },

				        ]

				    runs-on: ${{ matrix.environment.runner }}

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HYPERVISOR: ${{ matrix.environment.vmm }}

				      KUBERNETES: kubeadm

				      KBS: ${{ matrix.environment.name == 'nvidia-gpu-snp' && 'true' || 'false' }}

				      K8S_TEST_HOST_TYPE: baremetal

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-kata-tools-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-tools-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-tools-artifacts

				      - name: Install kata-tools

				        run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-tools-artifacts

				      - name: Uninstall previous `kbs-client`

				        if: matrix.environment.name != 'nvidia-gpu'

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh uninstall-kbs-client

				      - name: Deploy CoCo KBS

				        if: matrix.environment.name != 'nvidia-gpu'

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs

				        env:

				          NVIDIA_VERIFIER_MODE: remote

				          KBS_INGRESS: nodeport

				      - name: Install `kbs-client`

				        if: matrix.environment.name != 'nvidia-gpu'

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client

				      - name: Deploy Kata

				        timeout-minutes: 20

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata

				      - name: Install `bats`

				        run: bash tests/integration/kubernetes/gha-run.sh install-bats

				      - name: Run tests ${{ matrix.environment.vmm }}

				        timeout-minutes: 30

				        run: bash tests/integration/kubernetes/gha-run.sh run-nv-tests

				        env:

				          NGC_API_KEY: ${{ secrets.NGC_API_KEY }}

				      - name: Report tests

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh report-tests

				      - name: Collect artifacts ${{ matrix.environment.vmm }}

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh collect-artifacts

				        continue-on-error: true

				      - name: Archive artifacts ${{ matrix.environment.vmm }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: k8s-tests-${{ matrix.environment.vmm }}-kubeadm-${{ inputs.tag }}

				          path: /tmp/artifacts

				          retention-days: 1

				      - name: Delete kata-deploy

				        if: always()

				        timeout-minutes: 15

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup

				      - name: Delete CoCo KBS

				        if: always() && matrix.environment.name != 'nvidia-gpu'

				        run: |

				          bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs
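
The `KBS` value in the workflow above relies on the `cond && a || b` expression idiom, which behaves like a ternary as long as `a` is truthy. Here `a` is the non-empty string `'true'`, so the idiom is safe. An illustrative summary of how the two matrix entries resolve; the comment table is derived by hand from the matrix above and is not output from a real run:

```yaml
# environment.name   KATA_HYPERVISOR       runs-on                 KBS
# nvidia-gpu         qemu-nvidia-gpu       amd64-nvidia-a100       "false"
# nvidia-gpu-snp     qemu-nvidia-gpu-snp   amd64-nvidia-h100-snp   "true"
env:
  KBS: ${{ matrix.environment.name == 'nvidia-gpu-snp' && 'true' || 'false' }}
```

The KBS-related steps are gated separately with `matrix.environment.name != 'nvidia-gpu'`, which selects the same single entry as long as the matrix has only these two environments.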

									
.github/workflows/run-k8s-tests-on-ppc64le.yaml
				@@ -0,0 +1,81 @@

				name: CI | Run kubernetes tests on Power(ppc64le)

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions: {}

				jobs:

				  run-k8s-tests:

				    name: run-k8s-tests

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu

				        k8s:

				          - kubeadm

				    runs-on: ppc64le-k8s

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: ${{ matrix.k8s }}

				      TARGET_ARCH: "ppc64le"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install golang

				        run: |

				          ./tests/install_go.sh -f -p

				          echo "/usr/local/go/bin" >> "$GITHUB_PATH"

				      - name: Prepare the runner for k8s test suite

				        run: bash "${HOME}/scripts/k8s_cluster_prepare.sh"

				      - name: Check if cluster is healthy to run the tests

				        run: bash "${HOME}/scripts/k8s_cluster_check.sh"

				      - name: Deploy Kata

				        timeout-minutes: 20

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-kubeadm

				      - name: Run tests

				        timeout-minutes: 30

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Report tests

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh report-tests

									
.github/workflows/run-k8s-tests-on-zvsi.yaml
				@@ -0,0 +1,147 @@

				name: CI | Run kubernetes tests on IBM Cloud Z virtual server instance (zVSI)

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD:

				        required: true

				permissions: {}

				jobs:

				  run-k8s-tests:

				    name: run-k8s-tests

				    strategy:

				      fail-fast: false

				      matrix:

				        snapshotter:

				          - overlayfs

				          - devmapper

				          - nydus

				        vmm:

				          - qemu

				          - qemu-runtime-rs

				          - qemu-coco-dev

				        k8s:

				          - kubeadm

				        include:

				          - snapshotter: devmapper

				            pull-type: default

				            deploy-cmd: configure-snapshotter

				          - snapshotter: nydus

				            pull-type: guest-pull

				            deploy-cmd: deploy-snapshotter

				        exclude:

				          - snapshotter: overlayfs

				            vmm: qemu

				          - snapshotter: overlayfs

				            vmm: qemu-coco-dev

				          - snapshotter: devmapper

				            vmm: qemu-runtime-rs

				          - snapshotter: devmapper

				            vmm: qemu-coco-dev

				          - snapshotter: nydus

				            vmm: qemu

				          - snapshotter: nydus

				            vmm: qemu-runtime-rs

				    runs-on: s390x-large

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HOST_OS: "ubuntu"

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: ${{ matrix.k8s }}

				      PULL_TYPE: ${{ matrix.pull-type }}

				      SNAPSHOTTER: ${{ matrix.snapshotter }}

				      TARGET_ARCH: "s390x"

				      AUTHENTICATED_IMAGE_USER: ${{ vars.AUTHENTICATED_IMAGE_USER }}

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Set SNAPSHOTTER to empty if overlayfs

				        run: echo "SNAPSHOTTER=" >> "$GITHUB_ENV"

				        if: ${{ matrix.snapshotter == 'overlayfs' }}

				      - name: Set KBS and KBS_INGRESS if qemu-coco-dev

				        run: |

				          echo "KBS=true" >> "$GITHUB_ENV"

				          echo "KBS_INGRESS=nodeport" >> "$GITHUB_ENV"

				        if: ${{ matrix.vmm == 'qemu-coco-dev' }}

				      # qemu-runtime-rs only works with overlayfs

				      # See: https://github.com/kata-containers/kata-containers/issues/10066

				      - name: Configure the ${{ matrix.snapshotter }} snapshotter

				        env:

				          DEPLOY_CMD: ${{ matrix.deploy-cmd }}

				        run: bash tests/integration/kubernetes/gha-run.sh "${DEPLOY_CMD}"

				        if: ${{ matrix.snapshotter != 'overlayfs' }}

				      - name: Deploy Kata

				        timeout-minutes: 20

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-zvsi

				      - name: Uninstall previous `kbs-client`

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh uninstall-kbs-client

				        if: ${{ matrix.vmm == 'qemu-coco-dev' }}

				      - name: Deploy CoCo KBS

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs

				        if: ${{ matrix.vmm == 'qemu-coco-dev' }}

				      - name: Install `kbs-client`

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client

				        if: ${{ matrix.vmm == 'qemu-coco-dev' }}

				      - name: Run tests

				        timeout-minutes: 60

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Report tests

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh report-tests

				      - name: Delete kata-deploy

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup-zvsi

				      - name: Delete CoCo KBS

				        if: always()

				        run: |

				          if [ "${KBS}" == "true" ]; then

				            bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs

				          fi

									
.github/workflows/run-kata-coco-stability-tests.yaml
												
				@@ -0,0 +1,157 @@

				name: CI | Run Kata CoCo k8s Stability Tests

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				      tarball-suffix:

				        required: false

				        type: string

				    secrets:

				      AZ_APPID:

				        required: true

				      AZ_TENANT_ID:

				        required: true

				      AZ_SUBSCRIPTION_ID:

				        required: true

				      AUTHENTICATED_IMAGE_PASSWORD:

				        required: true

				permissions: {}

				jobs:

				  # Generate jobs for testing CoCo on non-TEE environments

				  run-stability-k8s-tests-coco-nontee:

				    name: run-stability-k8s-tests-coco-nontee

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu-coco-dev

				          - qemu-coco-dev-runtime-rs

				        snapshotter:

				          - nydus

				        pull-type:

				          - guest-pull

				    runs-on: ubuntu-22.04

				    permissions:

				      id-token: write # Used for OIDC access to log into Azure

				    environment: ci

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      # Some tests rely on that variable to run (or not)

				      KBS: "true"

				      # Set the KBS ingress handler (empty string disables handling)

				      KBS_INGRESS: "aks"

				      KUBERNETES: "vanilla"

				      PULL_TYPE: ${{ matrix.pull-type }}

				      AUTHENTICATED_IMAGE_USER: ${{ vars.AUTHENTICATED_IMAGE_USER }}

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      SNAPSHOTTER: ${{ matrix.snapshotter }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-kata-tools-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-tools-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-tools-artifacts

				      - name: Install kata-tools

				        run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-tools-artifacts

				      - name: Log into the Azure account

				        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0

				        with:

				          client-id: ${{ secrets.AZ_APPID }}

				          tenant-id: ${{ secrets.AZ_TENANT_ID }}

				          subscription-id: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      - name: Create AKS cluster

				        uses: nick-fields/retry@ce71cc2ab81d554ebbe88c79ab5975992d79ba08 # v3.0.2

				        with:

				          timeout_minutes: 15

				          max_attempts: 20

				          retry_on: error

				          retry_wait_seconds: 10

				          command: bash tests/integration/kubernetes/gha-run.sh create-cluster

				      - name: Install `bats`

				        run: bash tests/integration/kubernetes/gha-run.sh install-bats

				      - name: Install `kubectl`

				        uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1

				        with:

				          version: 'latest'

				      - name: Download credentials for the Kubernetes CLI to use them

				        run: bash tests/integration/kubernetes/gha-run.sh get-cluster-credentials

				      - name: Deploy Snapshotter

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-snapshotter

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-aks

				      - name: Deploy CoCo KBS

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs

				      - name: Install `kbs-client`

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client

				      - name: Run stability tests

				        timeout-minutes: 300

				        run: bash tests/stability/gha-stability-run.sh run-tests

				      - name: Report tests

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh report-tests

				      - name: Refresh OIDC token in case access token expired

				        if: always()

				        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0

				        with:

				          client-id: ${{ secrets.AZ_APPID }}

				          tenant-id: ${{ secrets.AZ_TENANT_ID }}

				          subscription-id: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      - name: Delete AKS cluster

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh delete-cluster

									
.github/workflows/run-kata-coco-tests.yaml
												
				@@ -0,0 +1,363 @@

				name: CI | Run kata coco tests

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD:

				        required: true

				      AZ_APPID:

				        required: true

				      AZ_TENANT_ID:

				        required: true

				      AZ_SUBSCRIPTION_ID:

				        required: true

				      ITA_KEY:

				        required: true

				permissions: {}

				jobs:

				  run-k8s-tests-on-tee:

				    name: run-k8s-tests-on-tee

				    strategy:

				      fail-fast: false

				      matrix:

				        include:

				          - runner: tdx

				            vmm: qemu-tdx

				          - runner: sev-snp

				            vmm: qemu-snp

				    runs-on: ${{ matrix.runner }}

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: "vanilla"

				      KBS: "true"

				      K8S_TEST_HOST_TYPE: "baremetal"

				      KBS_INGRESS: "nodeport"

				      SNAPSHOTTER: "nydus"

				      PULL_TYPE: "guest-pull"

				      AUTHENTICATED_IMAGE_USER: ${{ vars.AUTHENTICATED_IMAGE_USER }}

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      GH_ITA_KEY: ${{ secrets.ITA_KEY }}

				      AUTO_GENERATE_POLICY: "yes"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-kata-tools-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-tools-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-tools-artifacts

				      - name: Install kata-tools

				        run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-tools-artifacts

				      - name: Deploy Kata

				        timeout-minutes: 20

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata

				      - name: Uninstall previous `kbs-client`

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh uninstall-kbs-client

				      - name: Deploy CoCo KBS

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs

				        env:

				          ITA_KEY: ${{ env.KATA_HYPERVISOR == 'qemu-tdx' && env.GH_ITA_KEY || '' }}

				      - name: Install `kbs-client`

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client

				      - name: Deploy CSI driver

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-csi-driver

				      - name: Run tests

				        timeout-minutes: 100

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Report tests

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh report-tests

				      - name: Delete kata-deploy

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup

				      - name: Delete CoCo KBS

				        if: always()

				        run: |

				          [[ "${KATA_HYPERVISOR}" == "qemu-tdx" ]] && echo "ITA_KEY=${GH_ITA_KEY}" >> "${GITHUB_ENV}"

				          bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs

				      - name: Delete CSI driver

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh delete-csi-driver

				  # Generate jobs for testing CoCo on non-TEE environments

				  run-k8s-tests-coco-nontee:

				    name: run-k8s-tests-coco-nontee

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu-coco-dev

				          - qemu-coco-dev-runtime-rs

				        snapshotter:

				          - nydus

				        pull-type:

				          - guest-pull

				        include:

				          - pull-type: experimental-force-guest-pull

				            vmm: qemu-coco-dev

				            snapshotter: ""
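				      # Sketch (not part of the upstream file): assuming GitHub's documented
				      # matrix rules, the `include` entry above cannot be merged into an
				      # existing combination because it would overwrite `pull-type`, so it
				      # is added as a separate job. The expanded matrix would then be:
				      #   - vmm: qemu-coco-dev,            snapshotter: nydus, pull-type: guest-pull
				      #   - vmm: qemu-coco-dev-runtime-rs, snapshotter: nydus, pull-type: guest-pull
				      #   - vmm: qemu-coco-dev,            snapshotter: "",    pull-type: experimental-force-guest-pull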

				    runs-on: ubuntu-22.04

				    permissions:

				      id-token: write # Used for OIDC access to log into Azure

				    environment: ci

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      # Some tests rely on that variable to run (or not)

				      KBS: "true"

				      # Set the KBS ingress handler (empty string disables handling)

				      KBS_INGRESS: "aks"

				      KUBERNETES: "vanilla"

				      PULL_TYPE: ${{ matrix.pull-type }}

				      AUTHENTICATED_IMAGE_USER: ${{ vars.AUTHENTICATED_IMAGE_USER }}

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      SNAPSHOTTER: ${{ matrix.snapshotter }}

				      EXPERIMENTAL_FORCE_GUEST_PULL: ${{ matrix.pull-type == 'experimental-force-guest-pull' && matrix.vmm || '' }}

				      # Caution: the ingress controller currently used to expose the KBS service

				      # requires many vCPUs, leaving only a few for the tests. Depending on the

				      # host type chosen, this can result in the creation of a cluster with

				      # insufficient resources.

				      K8S_TEST_HOST_TYPE: "all"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-kata-tools-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-tools-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-tools-artifacts

				      - name: Install kata-tools

				        run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-tools-artifacts

				      - name: Log into the Azure account

				        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0

				        with:

				          client-id: ${{ secrets.AZ_APPID }}

				          tenant-id: ${{ secrets.AZ_TENANT_ID }}

				          subscription-id: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      - name: Create AKS cluster

				        uses: nick-fields/retry@ce71cc2ab81d554ebbe88c79ab5975992d79ba08 # v3.0.2

				        with:

				          timeout_minutes: 15

				          max_attempts: 20

				          retry_on: error

				          retry_wait_seconds: 10

				          command: bash tests/integration/kubernetes/gha-run.sh create-cluster

				      - name: Install `bats`

				        run: bash tests/integration/kubernetes/gha-run.sh install-bats

				      - name: Install `kubectl`

				        uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1

				        with:

				          version: 'latest'

				      - name: Download credentials for the Kubernetes CLI to use them

				        run: bash tests/integration/kubernetes/gha-run.sh get-cluster-credentials

				      - name: Deploy Kata

				        timeout-minutes: 20

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-aks

				        env:

				          USE_EXPERIMENTAL_SETUP_SNAPSHOTTER: ${{ env.SNAPSHOTTER == 'nydus' }}

				          AUTO_GENERATE_POLICY: ${{ env.PULL_TYPE == 'experimental-force-guest-pull' && 'no' || 'yes' }}
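				          # Sketch (not part of the upstream file): `cond && A || B` is the
				          # usual GitHub Actions expression idiom for a conditional value. It
				          # yields A when `cond` is true and A is truthy, and B otherwise.
				          # With hypothetical names, the same pattern looks like:
				          #   env:
				          #     MODE: ${{ inputs.fast == 'true' && 'quick' || 'full' }}
				          # Applied to the line above, AUTO_GENERATE_POLICY would be "no" only
				          # for the experimental-force-guest-pull job and "yes" for the others.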

				      - name: Deploy CoCo KBS

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs

				      - name: Install `kbs-client`

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client

				      - name: Deploy CSI driver

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-csi-driver

				      - name: Run tests

				        timeout-minutes: 80

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Report tests

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh report-tests

				      - name: Refresh OIDC token in case access token expired

				        if: always()

				        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0

				        with:

				          client-id: ${{ secrets.AZ_APPID }}

				          tenant-id: ${{ secrets.AZ_TENANT_ID }}

				          subscription-id: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      - name: Delete AKS cluster

				        if: always()

				        timeout-minutes: 15

				        run: bash tests/integration/kubernetes/gha-run.sh delete-cluster

				  # Generate jobs for testing CoCo on non-TEE environments with erofs-snapshotter

				  run-k8s-tests-coco-nontee-with-erofs-snapshotter:

				    name: run-k8s-tests-coco-nontee-with-erofs-snapshotter

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu-coco-dev

				        snapshotter:

				          - erofs

				        pull-type:

				          - default

				    runs-on: ubuntu-24.04

				    environment: ci

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      # Some tests rely on that variable to run (or not)

				      KBS: "false"

				      # Set the KBS ingress handler (empty string disables handling)

				      KBS_INGRESS: ""

				      KUBERNETES: "vanilla"

				      CONTAINER_ENGINE: "containerd"

				      CONTAINER_ENGINE_VERSION: "v2.2"

				      PULL_TYPE: ${{ matrix.pull-type }}

				      SNAPSHOTTER: ${{ matrix.snapshotter }}

				      USE_EXPERIMENTAL_SETUP_SNAPSHOTTER: "true"

				      K8S_TEST_HOST_TYPE: "all"

				      # We are skipping the auto generated policy tests for now,

				      # but those should be enabled as soon as we work on that.

				      AUTO_GENERATE_POLICY: "no"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-kata-tools-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-tools-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-tools-artifacts

				      - name: Install kata-tools

				        run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-tools-artifacts

				      - name: Remove unnecessary directories to free up space

				        run: |

				          sudo rm -rf /usr/local/.ghcup

				          sudo rm -rf /opt/hostedtoolcache/CodeQL

				          sudo rm -rf /usr/local/lib/android

				          sudo rm -rf /usr/share/dotnet

				          sudo rm -rf /opt/ghc

				          sudo rm -rf /usr/local/share/boost

				          sudo rm -rf /usr/lib/jvm

				          sudo rm -rf /usr/share/swift

				          sudo rm -rf /usr/local/share/powershell

				          sudo rm -rf /usr/local/julia*

				          sudo rm -rf /opt/az

				          sudo rm -rf /usr/local/share/chromium

				          sudo rm -rf /opt/microsoft

				          sudo rm -rf /opt/google

				          sudo rm -rf /usr/lib/firefox

				      - name: Deploy kubernetes

				        timeout-minutes: 15

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-k8s

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: Install `bats`

				        run: bash tests/integration/kubernetes/gha-run.sh install-bats

				      - name: Deploy Kata

				        timeout-minutes: 20

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata

				      - name: Deploy CSI driver

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-csi-driver

				      - name: Run tests

				        timeout-minutes: 80

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Report tests

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh report-tests

									
.github/workflows/run-kata-deploy-tests-on-aks.yaml
												
				@@ -0,0 +1,119 @@

				name: CI | Run kata-deploy tests on AKS

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      AZ_APPID:

				        required: true

				      AZ_TENANT_ID:

				        required: true

				      AZ_SUBSCRIPTION_ID:

				        required: true

				permissions: {}

				jobs:

				  run-kata-deploy-tests:

				    name: run-kata-deploy-tests

				    strategy:

				      fail-fast: false

				      matrix:

				        host_os:

				          - ubuntu

				        vmm:

				          - clh

				          - dragonball

				          - qemu

				          - qemu-runtime-rs

				        include:

				          - host_os: cbl-mariner

				            vmm: clh
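				      # Sketch (not part of the upstream file): assuming GitHub's documented
				      # matrix rules, the `include` entry above would overwrite `host_os`, so
				      # it is added as an extra job rather than merged. The expanded matrix
				      # would then contain five jobs:
				      #   - host_os: ubuntu,      vmm: clh
				      #   - host_os: ubuntu,      vmm: dragonball
				      #   - host_os: ubuntu,      vmm: qemu
				      #   - host_os: ubuntu,      vmm: qemu-runtime-rs
				      #   - host_os: cbl-mariner, vmm: clh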

				    runs-on: ubuntu-22.04

				    environment: ci

				    permissions:

				      id-token: write # Used for OIDC access to log into Azure

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HOST_OS: ${{ matrix.host_os }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: "vanilla"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Log into the Azure account

				        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0

				        with:

				          client-id: ${{ secrets.AZ_APPID }}

				          tenant-id: ${{ secrets.AZ_TENANT_ID }}

				          subscription-id: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      - name: Create AKS cluster

				        uses: nick-fields/retry@ce71cc2ab81d554ebbe88c79ab5975992d79ba08 # v3.0.2

				        with:

				          timeout_minutes: 15

				          max_attempts: 20

				          retry_on: error

				          retry_wait_seconds: 10

				          command: bash tests/integration/kubernetes/gha-run.sh create-cluster

				      - name: Install `bats`

				        run: bash tests/functional/kata-deploy/gha-run.sh install-bats

				      - name: Install `kubectl`

				        uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1

				        with:

				          version: 'latest'

				      - name: Download credentials for the Kubernetes CLI to use them

				        run: bash tests/functional/kata-deploy/gha-run.sh get-cluster-credentials

				      - name: Run tests

				        run: bash tests/functional/kata-deploy/gha-run.sh run-tests

				      - name: Report tests

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh report-tests

				      - name: Refresh OIDC token in case access token expired

				        if: always()

				        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0

				        with:

				          client-id: ${{ secrets.AZ_APPID }}

				          tenant-id: ${{ secrets.AZ_TENANT_ID }}

				          subscription-id: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      - name: Delete AKS cluster

				        if: always()

				        run: bash tests/functional/kata-deploy/gha-run.sh delete-cluster

									
.github/workflows/run-kata-deploy-tests.yaml
												
				@@ -0,0 +1,90 @@

				name: CI | Run kata-deploy tests

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions: {}

				jobs:

				  run-kata-deploy-tests:

				    name: run-kata-deploy-tests

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu

				        k8s:

				          - k0s

				          - k3s

				          - rke2

				          - microk8s

				    runs-on: ubuntu-22.04

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: ${{ matrix.k8s }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Remove unnecessary directories to free up space

				        run: |

				          sudo rm -rf /usr/local/.ghcup

				          sudo rm -rf /opt/hostedtoolcache/CodeQL

				          sudo rm -rf /usr/local/lib/android

				          sudo rm -rf /usr/share/dotnet

				          sudo rm -rf /opt/ghc

				          sudo rm -rf /usr/local/share/boost

				          sudo rm -rf /usr/lib/jvm

				          sudo rm -rf /usr/share/swift

				          sudo rm -rf /usr/local/share/powershell

				          sudo rm -rf /usr/local/julia*

				          sudo rm -rf /opt/az

				          sudo rm -rf /usr/local/share/chromium

				          sudo rm -rf /opt/microsoft

				          sudo rm -rf /opt/google

				          sudo rm -rf /usr/lib/firefox

				      - name: Deploy ${{ matrix.k8s }}

				        run: bash tests/functional/kata-deploy/gha-run.sh deploy-k8s

				      - name: Install `bats`

				        run: bash tests/functional/kata-deploy/gha-run.sh install-bats

				      - name: Run tests

				        run: bash tests/functional/kata-deploy/gha-run.sh run-tests

				      - name: Report tests

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh report-tests

									
.github/workflows/run-kata-monitor-tests.yaml
												
				@@ -0,0 +1,70 @@

				name: CI | Run kata-monitor tests

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions: {}

				jobs:

				  run-monitor:

				    name: run-monitor

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu

				        container_engine:

				          - crio

				          - containerd

				        # TODO: enable when https://github.com/kata-containers/kata-containers/issues/9853 is fixed

				        #include:

				        #  - container_engine: containerd

				        #    containerd_version: lts

				        exclude:

				          # TODO: enable with containerd when https://github.com/kata-containers/kata-containers/issues/9761 is fixed

				          - container_engine: containerd

				            vmm: qemu
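				      # Sketch (not part of the upstream file): assuming GitHub's documented
				      # matrix rules, `exclude` removes every combination that matches all of
				      # the listed keys. Starting from qemu x {crio, containerd}, the entry
				      # above drops the containerd job, so the expanded matrix would be:
				      #   - vmm: qemu, container_engine: crio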

				    runs-on: ubuntu-22.04

				    env:

				      CONTAINER_ENGINE: ${{ matrix.container_engine }}

				      #CONTAINERD_VERSION: ${{ matrix.containerd_version }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/functional/kata-monitor/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/functional/kata-monitor/gha-run.sh install-kata kata-artifacts

				      - name: Run kata-monitor tests

				        run: bash tests/functional/kata-monitor/gha-run.sh run

									
.github/workflows/run-metrics.yaml
												
				@@ -0,0 +1,128 @@

				name: CI | Run test metrics

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions: {}

				jobs:

				  run-metrics:

				    name: run-metrics

				    strategy:

				      # We can set this to true once we're 100% sure that

				      # none of the tests are flaky; otherwise we'd fail

				      # all the tests due to a single flaky instance.

				      fail-fast: false

				      matrix:

				        vmm: ['clh', 'qemu']

				      max-parallel: 1

				    runs-on: metrics

				    env:

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      K8S_TEST_HOST_TYPE: "baremetal"

				      KUBERNETES: kubeadm

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-kubeadm

				      - name: Install check metrics

				        run: bash tests/metrics/gha-run.sh install-checkmetrics

				      - name: enabling the hypervisor

				        run: bash tests/metrics/gha-run.sh enabling-hypervisor

				      - name: run launch times test

				        timeout-minutes: 15

				        continue-on-error: true

				        run: bash tests/metrics/gha-run.sh run-test-launchtimes

				      - name: run memory footprint test

				        timeout-minutes: 15

				        continue-on-error: true

				        run:  bash tests/metrics/gha-run.sh run-test-memory-usage

				      - name: run memory usage inside container test

				        timeout-minutes: 15

				        continue-on-error: true

				        run:  bash tests/metrics/gha-run.sh run-test-memory-usage-inside-container

				      - name: run blogbench test

				        timeout-minutes: 15

				        continue-on-error: true

				        run:  bash tests/metrics/gha-run.sh run-test-blogbench

				      - name: run tensorflow test

				        timeout-minutes: 15

				        continue-on-error: true

				        run:  bash tests/metrics/gha-run.sh run-test-tensorflow

				      - name: run fio test

				        timeout-minutes: 15

				        continue-on-error: true

				        run:  bash tests/metrics/gha-run.sh run-test-fio

				      - name: run iperf test

				        timeout-minutes: 15

				        continue-on-error: true

				        run:  bash tests/metrics/gha-run.sh run-test-iperf

				      - name: run latency test

				        timeout-minutes: 15

				        continue-on-error: true

				        run:  bash tests/metrics/gha-run.sh run-test-latency

				      - name: check metrics

				        run:  bash tests/metrics/gha-run.sh check-metrics

				      - name: make metrics tarball ${{ matrix.vmm }}

				        run: bash tests/metrics/gha-run.sh make-tarball-results

				      - name: archive metrics results ${{ matrix.vmm }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: metrics-artifacts-${{ matrix.vmm }}

				          path: results-${{ matrix.vmm }}.tar.gz

				          retention-days: 1

				          if-no-files-found: error

				      - name: Delete kata-deploy

				        timeout-minutes: 10

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup-kubeadm

									
.github/workflows/run-runk-tests.yaml

				@@ -0,0 +1,54 @@

				name: CI | Run runk tests

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions: {}

				jobs:

				  run-runk:

				    name: run-runk

				    # Skip runk tests as we have no maintainers. TODO: Decide when to remove altogether

				    if: false

				    runs-on: ubuntu-22.04

				    env:

				      CONTAINERD_VERSION: lts

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/integration/runk/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/runk/gha-run.sh install-kata kata-artifacts

				      - name: Run runk tests

				        run: bash tests/integration/runk/gha-run.sh run

									
.github/workflows/scorecard.yaml

				@@ -0,0 +1,60 @@

				# This workflow uses actions that are not certified by GitHub. They are provided

				# by a third-party and are governed by separate terms of service, privacy

				# policy, and support documentation.

				name: Scorecard supply-chain security

				on:

				  # For Branch-Protection check. Only the default branch is supported. See

				  # https://github.com/ossf/scorecard/blob/main/docs/checks.md#branch-protection

				  branch_protection_rule:

				  push:

				    branches: [ "main" ]

				  workflow_dispatch:

				permissions: {}

				jobs:

				  analysis:

				    name: Scorecard analysis

				    runs-on: ubuntu-latest

				    # `publish_results: true` only works when run from the default branch. conditional can be removed if disabled.

				    if: github.event.repository.default_branch == github.ref_name || github.event_name == 'pull_request'

				    permissions:

				      # Needed to upload the results to code-scanning dashboard.

				      security-events: write

				      # Needed to publish results and get a badge (see publish_results below).

				      id-token: write

				    steps:

				      - name: "Checkout code"

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: "Run analysis"

				        uses: ossf/scorecard-action@f49aabe0b5af0936a0987cfb85d86b75731b0186 # v2.4.1

				        with:

				          results_file: results.sarif

				          results_format: sarif

				          # Public repositories:

				          #   - Publish results to OpenSSF REST API for easy access by consumers

				          #   - Allows the repository to include the Scorecard badge.

				          #   - See https://github.com/ossf/scorecard-action#publishing-results.

				          publish_results: true

				      # Upload the results as artifacts (optional). Commenting out will disable uploads of run results in SARIF

				      # format to the repository Actions tab.

				      - name: "Upload artifact"

				        uses: actions/upload-artifact@4cec3d8aa04e39d1a68397de0c4cd6fb9dce8ec1 # v4.6.1

				        with:

				          name: SARIF file

				          path: results.sarif

				          retention-days: 5

				      # Upload the results to GitHub's code scanning dashboard (optional).

				      # Commenting out will disable upload of results to your repo's Code Scanning dashboard

				      - name: "Upload to code-scanning"

				        uses: github/codeql-action/upload-sarif@v3

				        with:

				          sarif_file: results.sarif

									
.github/workflows/shellcheck.yaml

				@@ -0,0 +1,32 @@

				# https://github.com/marketplace/actions/shellcheck

				name: Check shell scripts

				on:

				  workflow_dispatch:

				  pull_request:

				    types:

				      - opened

				      - edited

				      - reopened

				      - synchronize

				permissions: {}

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  shellcheck:

				    name: shellcheck

				    runs-on: ubuntu-24.04

				    steps:

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Run ShellCheck

				        uses: ludeeus/action-shellcheck@00cae500b08a931fb5698e11e79bfbd38e612a38 # v2.0.0

				        with:

				          ignore_paths: "**/vendor/**"

									
.github/workflows/shellcheck_required.yaml

				@@ -0,0 +1,35 @@

				# https://github.com/marketplace/actions/shellcheck

				name: Shellcheck required

				on:

				  workflow_dispatch:

				  pull_request:

				    types:

				      - opened

				      - edited

				      - reopened

				      - synchronize

				permissions: {}

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  shellcheck-required:

				    name: shellcheck-required

				    runs-on: ubuntu-24.04

				    steps:

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Run ShellCheck

				        uses: ludeeus/action-shellcheck@00cae500b08a931fb5698e11e79bfbd38e612a38 # v2.0.0

				        with:

				          severity: error

				          ignore_paths: "**/vendor/**"

									
.github/workflows/snap-release.yaml

				@@ -1,39 +0,0 @@

				name: Release Kata 2.x in snapcraft store

				on:

				  push:

				    tags:

				      - '2.*'

				jobs:

				  release-snap:

				    runs-on: ubuntu-20.04

				    steps:

				      - name: Check out Git repository

				        uses: actions/checkout@v2

				        with:

				          fetch-depth: 0

				      - name: Install Snapcraft

				        uses: samuelmeuli/action-snapcraft@v1

				        with:

				          snapcraft_token: ${{ secrets.snapcraft_token }}

				      - name: Build snap

				        run: |

				          sudo apt-get install -y git git-extras

				          kata_url="https://github.com/kata-containers/kata-containers"

				          latest_version=$(git ls-remote --tags ${kata_url}  | egrep -o "refs.*" | egrep -v "\-alpha|\-rc|{}" | egrep -o "[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+" | sort -V -r | head -1)

				          current_version="$(echo ${GITHUB_REF} | cut -d/ -f3)"

				          # Check semantic versioning format (x.y.z) and if the current tag is the latest tag

				          if echo "${current_version}" | grep -q "^[[:digit:]]\+\.[[:digit:]]\+\.[[:digit:]]\+$" && echo -e "$latest_version\n$current_version" | sort -C -V; then

				            # Current version is the latest version, build it

				            snapcraft -d snap --destructive-mode

				          fi

				      - name: Upload snap

				        run: |

				          snap_version="$(echo ${GITHUB_REF} | cut -d/ -f3)"

				          snap_file="kata-containers_${snap_version}_amd64.snap"

				          # Upload the snap if it exists

				          if [ -f ${snap_file} ]; then

				            snapcraft upload --release=stable ${snap_file}

				          fi
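
The "Build snap" step in this removed workflow only builds when two conditions hold: the tag is a plain `x.y.z` version, and it sorts at or after the latest published tag. A minimal sketch of that gate as standalone shell functions follows. It assumes GNU `sort` (for `-V` and `-C`), and the function names `is_release_version`, `is_latest_version`, and `should_build_snap` are hypothetical, not part of the repository.

```shell
#!/bin/sh
# Illustrative sketch only; the function names are hypothetical.

# Succeeds when the argument looks like a plain x.y.z version
# (no -alpha or -rc suffix).
is_release_version() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

# Succeeds when "latest, current" is already in ascending version order,
# i.e. the current tag is at least as new as the latest known tag.
# Assumes GNU sort: -V compares version numbers, -C checks order silently.
is_latest_version() {
    latest="$1"
    current="$2"
    printf '%s\n%s\n' "${latest}" "${current}" | sort -C -V
}

# Combines both conditions, mirroring the `if` in the workflow step.
# Usage: should_build_snap LATEST CURRENT
should_build_snap() {
    is_release_version "$2" && is_latest_version "$1" "$2"
}
```

Version sort matters here: a plain lexical sort would place `2.10.0` before `2.9.0` and let an older tag pass as the latest.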

									
.github/workflows/snap.yaml

				@@ -1,17 +0,0 @@

				name: snap CI

				on: ["pull_request"]

				jobs:

				  test:

				    runs-on: ubuntu-20.04

				    steps:

				      - name: Check out

				        uses: actions/checkout@v2

				        with:

				          fetch-depth: 0

				      - name: Install Snapcraft

				        uses: samuelmeuli/action-snapcraft@v1

				      - name: Build snap

				        run: |

				          snapcraft -d snap --destructive-mode

									
.github/workflows/stale.yaml

				@@ -0,0 +1,20 @@

				name: 'Automatically close stale PRs'

				on:

				  schedule:

				    - cron: '0 0 * * *'

				  workflow_dispatch:

				permissions: {}

				jobs:

				  stale:

				    name: stale

				    runs-on: ubuntu-22.04

				    steps:

				      - uses: actions/stale@5bef64f19d7facfb25b37b414482c7164d639639 # v9.1.0

				        with:

				          stale-pr-message: 'This PR has been open with no activity for 180 days. Comment on the PR, otherwise it will be closed in 7 days.'

				          days-before-pr-stale: 180

				          days-before-pr-close: 7

				          days-before-issue-stale: -1

				          days-before-issue-close: -1

									
.github/workflows/static-checks-self-hosted.yaml

				@@ -0,0 +1,36 @@

				on:

				  pull_request:

				    types:

				      - opened

				      - synchronize

				      - reopened

				      - labeled # a workflow runs only when the 'ok-to-test' label is added

				permissions: {}

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				name: Static checks self-hosted

				jobs:

				  skipper:

				    if: ${{ contains(github.event.pull_request.labels.*.name, 'ok-to-test') }}

				    uses: ./.github/workflows/gatekeeper-skipper.yaml

				    with:

				      commit-hash: ${{ github.event.pull_request.head.sha }}

				      target-branch: ${{ github.event.pull_request.base.ref }}

				  build-checks:

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    strategy:

				      fail-fast: false

				      matrix:

				        instance:

				          - "ubuntu-24.04-arm"

				          - "ubuntu-24.04-s390x"

				          - "ubuntu-24.04-ppc64le"

				    uses: ./.github/workflows/build-checks.yaml

				    with:

				      instance: ${{ matrix.instance }}

									
.github/workflows/static-checks.yaml

				@@ -5,94 +5,188 @@ on:

				      - edited

				      - reopened

				      - synchronize

				      - labeled

				      - unlabeled

				  workflow_dispatch:

				permissions: {}

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				name: Static checks

				jobs:

				  test:

				    strategy:

				      matrix:

				        go-version: [1.15.x, 1.16.x]

				        os: [ubuntu-20.04]

				    runs-on: ${{ matrix.os }}

				    env:

				      TRAVIS: "true"

				      TRAVIS_BRANCH: ${{ github.base_ref }}

				      TRAVIS_PULL_REQUEST_BRANCH: ${{ github.head_ref }}

				      TRAVIS_PULL_REQUEST_SHA : ${{ github.event.pull_request.head.sha }}

				      RUST_BACKTRACE: "1"

				      target_branch: ${{ github.base_ref }}

				  skipper:

				    uses: ./.github/workflows/gatekeeper-skipper.yaml

				    with:

				      commit-hash: ${{ github.event.pull_request.head.sha }}

				      target-branch: ${{ github.event.pull_request.base.ref }}

				  check-kernel-config-version:

				    name: check-kernel-config-version

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    runs-on: ubuntu-22.04

				    steps:

				    - name: Install Go

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      uses: actions/setup-go@v2

				      with:

				        go-version: ${{ matrix.go-version }}

				      env:

				        GOPATH: ${{ runner.workspace }}/kata-containers

				    - name: Setup GOPATH

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        echo "TRAVIS_BRANCH: ${TRAVIS_BRANCH}"

				        echo "TRAVIS_PULL_REQUEST_BRANCH: ${TRAVIS_PULL_REQUEST_BRANCH}"

				        echo "TRAVIS_PULL_REQUEST_SHA: ${TRAVIS_PULL_REQUEST_SHA}"

				        echo "TRAVIS: ${TRAVIS}"

				    - name: Set env

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        echo "GOPATH=${{ github.workspace }}" >> $GITHUB_ENV

				        echo "${{ github.workspace }}/bin" >> $GITHUB_PATH

				    - name: Checkout code

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      uses: actions/checkout@v2

				      with:

				        fetch-depth: 0

				        path: ./src/github.com/${{ github.repository }}

				    - name: Setup travis references

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        echo "TRAVIS_BRANCH=${TRAVIS_BRANCH:-$(echo $GITHUB_REF | awk 'BEGIN { FS = \"/\" } ; { print $3 }')}"

				        target_branch=${TRAVIS_BRANCH}

				    - name: Setup

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/setup.sh

				      env:

				        GOPATH: ${{ runner.workspace }}/kata-containers

				    - name: Installing rust

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/install_rust.sh

				        PATH=$PATH:"$HOME/.cargo/bin"

				        rustup target add x86_64-unknown-linux-musl

				        rustup component add rustfmt clippy

				    - name: Setup seccomp

				      run: |

				        libseccomp_install_dir=$(mktemp -d -t libseccomp.XXXXXXXXXX)

				        gperf_install_dir=$(mktemp -d -t gperf.XXXXXXXXXX)

				        cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/install_libseccomp.sh "${libseccomp_install_dir}" "${gperf_install_dir}"

				        echo "Set environment variables for the libseccomp crate to link the libseccomp library statically"

				        echo "LIBSECCOMP_LINK_TYPE=static" >> $GITHUB_ENV

				        echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> $GITHUB_ENV

				    # Check whether the vendored code is up-to-date & working as the first thing

				    - name: Check vendored code

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        cd ${GOPATH}/src/github.com/${{ github.repository }} && make vendor

				    - name: Static Checks

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        cd ${GOPATH}/src/github.com/${{ github.repository }} && make static-checks

				    - name: Run Compiler Checks

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        cd ${GOPATH}/src/github.com/${{ github.repository }} && make check

				    - name: Run Unit Tests

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        cd ${GOPATH}/src/github.com/${{ github.repository }} && make test

				    - name: Run Unit Tests As Root User

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        cd ${GOPATH}/src/github.com/${{ github.repository }} && sudo -E PATH="$PATH" make test

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Ensure the kernel config version has been updated

				        run: |

				          kernel_dir="tools/packaging/kernel/"

				          kernel_version_file="${kernel_dir}kata_config_version"

				          modified_files=$(git diff --name-only origin/"$GITHUB_BASE_REF"..HEAD)

				          if git diff --name-only origin/"$GITHUB_BASE_REF"..HEAD "${kernel_dir}" | grep "${kernel_dir}"; then

				            echo "Kernel directory has changed, checking if $kernel_version_file has been updated"

				            if echo "$modified_files" | grep -v "README.md" | grep "${kernel_dir}" >>"/dev/null"; then

				              echo "$modified_files" | grep "$kernel_version_file" >>/dev/null || ( echo "Please bump version in $kernel_version_file" && exit 1)

				            else

				              echo "Readme file changed, no need for kernel config version update."

				            fi

				            echo "Check passed"

				          fi

				  build-checks:

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    uses: ./.github/workflows/build-checks.yaml

				    with:

				      instance: ubuntu-22.04

				  build-checks-depending-on-kvm:

				    name: build-checks-depending-on-kvm

				    runs-on: ubuntu-22.04

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    strategy:

				      fail-fast: false

				      matrix:

				        component:

				          - runtime-rs

				        include:

				          - component: runtime-rs

				            command: "sudo -E env PATH=$PATH LIBC=gnu SUPPORT_VIRTUALIZATION=true make test"

				          - component: runtime-rs

				            component-path: src/dragonball

				    steps:

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Install system deps

				        run: |

				          sudo apt-get update && sudo apt-get install -y build-essential musl-tools

				      - name: Install yq

				        run: |

				          sudo -E ./ci/install_yq.sh

				        env:

				          INSTALL_IN_GOPATH: false

				      - name: Install rust

				        run: |

				          export PATH="$PATH:/usr/local/bin"

				          ./tests/install_rust.sh

				      - name: Running `${{ matrix.command }}` for ${{ matrix.component }}

				        run: |

				          export PATH="$PATH:${HOME}/.cargo/bin"

				          cd "${COMPONENT_PATH}"

				          eval "${COMMAND}"

				        env:

				          COMMAND: ${{ matrix.command }}

				          COMPONENT_PATH: ${{ matrix.component-path }}

				          RUST_BACKTRACE: "1"

				          RUST_LIB_BACKTRACE: "0"

				  static-checks:

				    name: static-checks

				    runs-on: ubuntu-22.04

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    strategy:

				      fail-fast: false

				      matrix:

				        cmd:

				          - "make static-checks"

				    env:

				      GOPATH: ${{ github.workspace }}

				    permissions:

				      contents: read  # for checkout

				      packages: write # for push to ghcr.io

				    steps:

				      - name: Checkout code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				          path: ./src/github.com/${{ github.repository }}

				      - name: Install yq

				        run: |

				          cd "${GOPATH}/src/github.com/${GITHUB_REPOSITORY}"

				          ./ci/install_yq.sh

				        env:

				          INSTALL_IN_GOPATH: false

				      - name: Install golang

				        run: |

				          cd "${GOPATH}/src/github.com/${GITHUB_REPOSITORY}"

				          ./tests/install_go.sh -f -p

				          echo "/usr/local/go/bin" >> "$GITHUB_PATH"

				      - name: Install system dependencies

				        run: |

				          sudo apt-get update && sudo apt-get -y install moreutils hunspell hunspell-en-gb hunspell-en-us pandoc

				      - name: Install open-policy-agent

				        run: |

				          cd "${GOPATH}/src/github.com/${GITHUB_REPOSITORY}"

				          ./tests/install_opa.sh

				      - name: Install regorus

				        env:

				          ARTEFACT_REPOSITORY: "${{ github.repository }}"

				          ARTEFACT_REGISTRY_USERNAME: "${{ github.actor }}"

				          ARTEFACT_REGISTRY_PASSWORD: "${{ secrets.GITHUB_TOKEN }}"

				        run: |

				          "${GOPATH}/src/github.com/${GITHUB_REPOSITORY}/tests/install_regorus.sh"

				      - name: Run check

				        env:

				          CMD: ${{ matrix.cmd }}

				        run: |

				          export PATH="${PATH}:${GOPATH}/bin"

				          cd "${GOPATH}/src/github.com/${GITHUB_REPOSITORY}" && ${CMD}

				  govulncheck:

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    uses: ./.github/workflows/govulncheck.yaml

				  codegen:

				    name: codegen

				    runs-on: ubuntu-22.04

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    permissions:

				      contents: read  # for checkout

				    steps:

				      - name: Checkout code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: generate

				        run: make -C src/agent generate-protocols

				      - name: check for diff

				        run: |

				          diff=$(git diff)

				          if [[ -z "${diff}" ]]; then

				            echo "No diff detected."

				            exit 0

				          fi

				          cat << EOF >> "${GITHUB_STEP_SUMMARY}"

				          Run \`make -C src/agent generate-protocols\` to update protobuf bindings.

				          \`\`\`diff

				          ${diff}

				          \`\`\`

				          EOF

				          echo "::error::Golang protobuf bindings need to be regenerated (see Github step summary for diff)."

				          exit 1
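
The `check-kernel-config-version` job earlier in this file reduces to a three-way decision over the list of changed paths: nothing under `tools/packaging/kernel/` changed, only `README.md` changed there, or something else changed and `kata_config_version` must be in the list too. The sketch below restates that decision as a standalone function so it can be tried without a Git checkout. The function name `check_kernel_config_bump` and its printed verdicts are hypothetical, and unlike the workflow it anchors its `grep` patterns to the start of each path.

```shell
#!/bin/sh
# Illustrative sketch only; the function name and verdict strings are
# hypothetical and do not exist in the repository.
kernel_dir="tools/packaging/kernel/"
kernel_version_file="${kernel_dir}kata_config_version"

# check_kernel_config_bump FILES
# FILES is a newline-separated list of changed paths, e.g. the output of
# `git diff --name-only origin/<base>..HEAD`.
# Prints a verdict; returns 1 only when a version bump is missing.
check_kernel_config_bump() {
    modified_files="$1"

    # Nothing under the kernel directory changed: nothing to check.
    if ! printf '%s\n' "${modified_files}" | grep -q "^${kernel_dir}"; then
        echo "skipped"
        return 0
    fi

    # Only README.md changed under the kernel directory: no bump needed.
    if ! printf '%s\n' "${modified_files}" | grep -v "README.md" \
        | grep -q "^${kernel_dir}"; then
        echo "readme-only"
        return 0
    fi

    # Any other kernel change must come with a version file update.
    if printf '%s\n' "${modified_files}" | grep -q "^${kernel_version_file}\$"; then
        echo "bumped"
        return 0
    fi

    echo "missing-bump"
    return 1
}
```

In a checkout, a call might look like `check_kernel_config_bump "$(git diff --name-only origin/main..HEAD)"`, where `origin/main` stands in for whatever base branch applies.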

									
.github/workflows/zizmor.yaml

				@@ -0,0 +1,29 @@

				name: GHA security analysis

				on:

				  pull_request:

				permissions: {}

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  zizmor:

				    name: zizmor

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Run zizmor

				        uses: zizmorcore/zizmor-action@e673c3917a1aef3c65c972347ed84ccd013ecda4 # v0.2.0

				        with:

				          advanced-security: false

				          annotations: true

				          persona: auditor

				          version: v1.13.0

									
.github/zizmor.yml

				@@ -0,0 +1,3 @@

				rules:

				  undocumented-permissions:

				    disable: true

.gitignore

@@ -4,9 +4,19 @@
 **/*.rej
 **/target
 **/.vscode
 **/.idea
 **/.fleet
 **/*.swp
 **/*.swo
 pkg/logging/Cargo.lock
 src/agent/src/version.rs
 src/agent/kata-agent.service
 src/agent/protocols/src/*.rs
 !src/agent/protocols/src/lib.rs
 build
 src/tools/log-parser/kata-log-parser
 tools/packaging/static-build/agent/install_libseccomp.sh
 .envrc
 .direnv
 **/.DS_Store
 site/

CODEOWNERS

@@ -1,4 +1,4 @@
-# Copyright (c) 2019 Intel Corporation
+# Copyright (c) 2019-2023 Intel Corporation
 #
 # SPDX-License-Identifier: Apache-2.0
 #
@@ -9,4 +9,83 @@
 # Order in this file is important. Only the last match will be
 # used. See https://help.github.com/articles/about-code-owners/
 *.md    @kata-containers/documentation
 /CODEOWNERS			@kata-containers/codeowners
 VERSION				@kata-containers/release
 # The versions database needs careful handling
 versions.yaml			@kata-containers/release @kata-containers/ci @kata-containers/tests
 Makefile*			@kata-containers/build
 *.mak				@kata-containers/build
 *.mk				@kata-containers/build
 # Documentation related files could also appear anywhere
 # else in the repo.
 *.md				@kata-containers/documentation
 *.drawio			@kata-containers/documentation
 *.jpg				@kata-containers/documentation
 *.png				@kata-containers/documentation
 *.svg				@kata-containers/documentation
 *.bash				@kata-containers/shell
 *.sh				@kata-containers/shell
 **/completions/			@kata-containers/shell
 Dockerfile*			@kata-containers/docker
 /ci/				@kata-containers/ci
 *.bats				@kata-containers/tests
 /tests/				@kata-containers/tests
 *.rs				@kata-containers/rust
 *.go				@kata-containers/golang
 /utils/				@kata-containers/utils
 # FIXME: Maybe a new "protocol" team would be better?
 #
 # All protocol changes must be reviewed.
 # Note, we include all subdirs, including the vendor dir, as at present there are no .proto files
 # in the vendor dir. Later we may have to extend this matching rule if that changes.
 /src/libs/protocols/*.proto	@kata-containers/architecture-committee @kata-containers/builder @kata-containers/packaging
 # GitHub Actions
 /.github/workflows/		@kata-containers/action-admins @kata-containers/ci
 /ci/				@kata-containers/ci @kata-containers/tests
 /docs/				@kata-containers/documentation
 /src/agent/			@kata-containers/agent
 /src/runtime*/			@kata-containers/runtime
 /src/runtime/			@kata-containers/golang
 src/runtime-rs/			@kata-containers/rust
 src/libs/			@kata-containers/rust
 src/dragonball/			@kata-containers/dragonball
 /tools/osbuilder/		@kata-containers/builder
 /tools/packaging/		@kata-containers/packaging
 /tools/packaging/kernel/	@kata-containers/kernel
 /tools/packaging/kata-deploy/	@kata-containers/kata-deploy
 /tools/packaging/qemu/		@kata-containers/qemu
 /tools/packaging/release/	@kata-containers/release
 **/vendor/			@kata-containers/vendoring
 # Handle arch specific files last so they match more specifically than
 # the kernel packaging files.
 **/*aarch64*			@kata-containers/arch-aarch64
 **/*arm64*			@kata-containers/arch-aarch64
 **/*amd64*			@kata-containers/arch-amd64
 **/*x86-64*			@kata-containers/arch-amd64
 **/*x86_64*			@kata-containers/arch-amd64
 **/*ppc64*			@kata-containers/arch-ppc64le
 **/*s390x*			@kata-containers/arch-s390x

									
CONTRIBUTING.md

				@@ -2,4 +2,4 @@

				## This repo is part of [Kata Containers](https://katacontainers.io)

				-For details on how to contribute to the Kata Containers project, please see the main [contributing document](https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md).

				+For details on how to contribute to the Kata Containers project, please see the main [contributing document](https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md).

Cargo.lock

File diff suppressed because it is too large.

									
Cargo.toml

				@@ -0,0 +1,140 @@

				[workspace.package]

				authors = ["The Kata Containers community <kata-dev@lists.katacontainers.io>"]

				edition = "2018"

				license = "Apache-2.0"

				rust-version = "1.88"

				[workspace]

				members = [

				  # Dragonball

				  "src/dragonball",

				  "src/dragonball/dbs_acpi",

				  "src/dragonball/dbs_address_space",

				  "src/dragonball/dbs_allocator",

				  "src/dragonball/dbs_arch",

				  "src/dragonball/dbs_boot",

				  "src/dragonball/dbs_device",

				  "src/dragonball/dbs_interrupt",

				  "src/dragonball/dbs_legacy_devices",

				  "src/dragonball/dbs_pci",

				  "src/dragonball/dbs_tdx",

				  "src/dragonball/dbs_upcall",

				  "src/dragonball/dbs_utils",

				  "src/dragonball/dbs_virtio_devices",

				  # runtime-rs

				  "src/runtime-rs",

				  "src/runtime-rs/crates/agent",

				  "src/runtime-rs/crates/hypervisor",

				  "src/runtime-rs/crates/persist",

				  "src/runtime-rs/crates/resource",

				  "src/runtime-rs/crates/runtimes",

				  "src/runtime-rs/crates/service",

				  "src/runtime-rs/crates/shim",

				  "src/runtime-rs/crates/shim-ctl",

				  "src/runtime-rs/tests/utils",

				]

				resolver = "2"

				# TODO: Add all excluded crates to root workspace

				exclude = [

				  "src/agent",

				  "src/tools",

				  "src/libs",

				  # kata-deploy binary is standalone and has its own Cargo.toml for now

				  "tools/packaging/kata-deploy/binary",

				  # We are cloning and building rust packages under

				  # "tools/packaging/kata-deploy/local-build/build" folder, which may mislead

				  # those packages to think they are part of the kata root workspace

				  "tools/packaging/kata-deploy/local-build/build",

				]

				[workspace.dependencies]

				# Rust-VMM crates

				event-manager = "0.2.1"

				kvm-bindings = "0.6.0"

				kvm-ioctls = "=0.12.1"

				linux-loader = "0.8.0"

				seccompiler = "0.5.0"

				vfio-bindings = "0.3.0"

				vfio-ioctls = "0.1.0"

				virtio-bindings = "0.1.0"

				virtio-queue = "0.7.0"

				vm-fdt = "0.2.0"

				vm-memory = "0.10.0"

				vm-superio = "0.5.0"

				vmm-sys-util = "0.11.0"

				# Local dependencies from Dragonball Sandbox crates

				dragonball = { path = "src/dragonball" }

				dbs-acpi = { path = "src/dragonball/dbs_acpi" }

				dbs-address-space = { path = "src/dragonball/dbs_address_space" }

				dbs-allocator = { path = "src/dragonball/dbs_allocator" }

				dbs-arch = { path = "src/dragonball/dbs_arch" }

				dbs-boot = { path = "src/dragonball/dbs_boot" }

				dbs-device = { path = "src/dragonball/dbs_device" }

				dbs-interrupt = { path = "src/dragonball/dbs_interrupt" }

				dbs-legacy-devices = { path = "src/dragonball/dbs_legacy_devices" }

				dbs-pci = { path = "src/dragonball/dbs_pci" }

				dbs-tdx = { path = "src/dragonball/dbs_tdx" }

				dbs-upcall = { path = "src/dragonball/dbs_upcall" }

				dbs-utils = { path = "src/dragonball/dbs_utils" }

				dbs-virtio-devices = { path = "src/dragonball/dbs_virtio_devices" }

				# Local dependencies from runtime-rs

				agent = { path = "src/runtime-rs/crates/agent" }

				hypervisor = { path = "src/runtime-rs/crates/hypervisor" }

				persist = { path = "src/runtime-rs/crates/persist" }

				resource = { path = "src/runtime-rs/crates/resource" }

				runtimes = { path = "src/runtime-rs/crates/runtimes" }

				service = { path = "src/runtime-rs/crates/service" }

				tests_utils = { path = "src/runtime-rs/tests/utils" }

				ch-config = { path = "src/runtime-rs/crates/hypervisor/ch-config" }

				common = { path = "src/runtime-rs/crates/runtimes/common" }

				linux_container = { path = "src/runtime-rs/crates/runtimes/linux_container" }

				virt_container = { path = "src/runtime-rs/crates/runtimes/virt_container" }

				wasm_container = { path = "src/runtime-rs/crates/runtimes/wasm_container" }

				# Local dependencies from `src/libs`

				kata-sys-util = { path = "src/libs/kata-sys-util" }

				kata-types = { path = "src/libs/kata-types", features = ["safe-path"] }

				logging = { path = "src/libs/logging" }

				protocols = { path = "src/libs/protocols", features = ["async"] }

				runtime-spec = { path = "src/libs/runtime-spec" }

				safe-path = { path = "src/libs/safe-path" }

				shim-interface = { path = "src/libs/shim-interface" }

				test-utils = { path = "src/libs/test-utils" }

				# Outside dependencies

				actix-rt = "2.7.0"

				anyhow = "1.0"

				async-trait = "0.1.48"

				containerd-shim = { version = "0.10.0", features = ["async"] }

				containerd-shim-protos = { version = "0.10.0", features = ["async"] }

				go-flag = "0.1.0"

				hyper = "0.14.20"

				hyperlocal = "0.8.0"

				lazy_static = "1.4"

				libc = "0.2"

				log = "0.4.14"

				netns-rs = "0.1.0"

				# Note: nix needs to stay sync'd with libs versions

				nix = "0.26.4"

				oci-spec = { version = "0.8.1", features = ["runtime"] }

				protobuf = "3.7.2"

				rand = "0.8.4"

				serde = { version = "1.0.145", features = ["derive"] }

				serde_json = "1.0.91"

				sha2 = "0.10.9"

				slog = "2.5.2"

				slog-scope = "4.4.0"

				strum = { version = "0.24.0", features = ["derive"] }

				tempfile = "3.19.1"

				thiserror = "1.0"

				tokio = "1.46.1"

				tracing = "0.1.41"

				tracing-opentelemetry = "0.18.0"

				ttrpc = "0.8.4"

				url = "2.5.4"

									
										93

Glossary.md
									
												
				@@ -1,94 +1,3 @@

				# Glossary

				[A](#a), [B](#b), [C](#c), [D](#d), [E](#e), [F](#f), [G](#g), [H](#h), [I](#i), [J](#j), [K](#k), [L](#l), [M](#m), [N](#n), [O](#o), [P](#p), [Q](#q), [R](#r), [S](#s), [T](#t), [U](#u), [V](#v), [W](#w), [X](#x), [Y](#y), [Z](#z)

				## A

				### Auto Scaling

				a method used in cloud computing, whereby the amount of computational resources in a server farm, typically measured in terms of the number of active servers, which vary automatically based on the load on the farm.

				## B

				## C

				### Container Security Solutions

				The process of implementing security tools and policies that will give you the assurance that everything in your container is running as intended, and only as intended.

				### Container Software

				A standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.

				### Container Runtime Interface

				A plugin interface which enables Kubelet to use a wide variety of container runtimes, without the need to recompile.

				### Container Virtualization

				A container is a virtual runtime environment that runs on top of a single operating system (OS) kernel and emulates an operating system rather than the underlying hardware.

				## D

				## E

				## F

				## G

				## H

				## I

				### Infrastructure Architecture

				A structured and modern approach for supporting an organization and facilitating innovation within an enterprise.

				## J

				## K

				### Kata Containers

				Kata containers is an open source project delivering increased container security and Workload isolation through an implementation of lightweight virtual machines.

				## L

				## M

				## N

				## O

				## P

				### Pod Containers

				A Group of one or more containers , with shared storage/network, and a specification for how to run the containers.

				### Private Cloud

				A computing model that offers a proprietary environment dedicated to a single business entity.

				### Public Cloud

				Computing services offered by third-party providers over the public Internet, making them available to anyone who wants to use or purchase them.

				## Q

				## R

				## S

				### Serverless Containers

				An architecture in which code is executed on-demand. Serverless workloads are typically in the cloud, but on-premises serverless platforms exist, too.

				## T

				## U

				## V

				### Virtual Machine Monitor

				Computer software, firmware or hardware that creates and runs virtual machines.

				### Virtual Machine Software

				A software program or operating system that not only exhibits the behavior of a separate computer, but is also capable of performing tasks such as running applications and programs like a separate computer.

				## W

				## X

				## Y

				## Z

				See the [project glossary hosted in the wiki](https://github.com/kata-containers/kata-containers/wiki/Glossary).

									
										42

Makefile
									
												
				@@ -1,4 +1,4 @@

				# Copyright (c) 2020 Intel Corporation

				# Copyright (c) 2020-2023 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				@@ -6,22 +6,32 @@

				# List of available components

				COMPONENTS =

				COMPONENTS += libs

				COMPONENTS += agent

				COMPONENTS += dragonball

				COMPONENTS += runtime

				COMPONENTS += trace-forwarder

				COMPONENTS += runtime-rs

				# List of available tools

				TOOLS =

				TOOLS += agent-ctl

				TOOLS += kata-ctl

				TOOLS += log-parser

				TOOLS += runk

				TOOLS += trace-forwarder

				STANDARD_TARGETS = build check clean install test vendor

				STANDARD_TARGETS = build check clean install static-checks-build test vendor

				# Variables for the build-and-publish-kata-debug target

				KATA_DEBUG_REGISTRY ?= ""

				KATA_DEBUG_TAG ?= ""

				default: all

				include utils.mk

				include ./tools/packaging/kata-deploy/local-build/Makefile

				all: build

				# Create the rules

				$(eval $(call create_all_rules,$(COMPONENTS),$(TOOLS),$(STANDARD_TARGETS)))

				@@ -31,7 +41,23 @@ generate-protocols:

					make -C src/agent generate-protocols

				# Some static checks rely on generated source files of components.

				static-checks: build

					bash ci/static-checks.sh

				static-checks: static-checks-build

					bash tests/static-checks.sh

				.PHONY: all default static-checks binary-tarball install-binary-tarball

				docs-url-alive-check:

					bash ci/docs-url-alive-check.sh

				build-and-publish-kata-debug:

					bash tools/packaging/kata-debug/kata-debug-build-and-upload-payload.sh ${KATA_DEBUG_REGISTRY} ${KATA_DEBUG_TAG} 

				docs-serve:

					docker run --rm -p 8000:8000 -v ./docs:/docs/docs -v ${PWD}/zensical.toml:/zensical.toml:ro zensical/zensical serve --config-file /zensical.toml -a 0.0.0.0:8000

				.PHONY: \

					all \

					kata-tarball \

					install-tarball \

					default \

					static-checks \

					docs-url-alive-check \

					docs-serve

									
										104

README.md
									
												
				@@ -1,4 +1,7 @@

				<img src="https://www.openstack.org/assets/kata/kata-vertical-on-white.png" width="150">

				<img src="https://object-storage-ca-ymq-1.vexxhost.net/swift/v1/6e4619c416ff4bd19e1c087f27a43eea/www-images-prod/openstack-logo/kata/SVG/kata-1.svg" width="900">

				[![CI | Publish Kata Containers payload](https://github.com/kata-containers/kata-containers/actions/workflows/payload-after-push.yaml/badge.svg)](https://github.com/kata-containers/kata-containers/actions/workflows/payload-after-push.yaml) [![Kata Containers Nightly CI](https://github.com/kata-containers/kata-containers/actions/workflows/ci-nightly.yaml/badge.svg)](https://github.com/kata-containers/kata-containers/actions/workflows/ci-nightly.yaml)

				[![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/kata-containers/kata-containers/badge)](https://scorecard.dev/viewer/?uri=github.com/kata-containers/kata-containers)

				# Kata Containers

				@@ -17,16 +20,74 @@ standard implementation of lightweight Virtual Machines (VMs) that feel and

				perform like containers, but provide the workload isolation and security

				advantages of VMs.

				## License

				The code is licensed under the Apache 2.0 license.

				See [the license file](LICENSE) for further details.

				## Platform support

				Kata Containers currently runs on 64-bit systems supporting the following

				technologies:

				| Architecture | Virtualization technology |

				|-|-|

				| `x86_64`, `amd64` | [Intel](https://www.intel.com) VT-x, AMD SVM |

				| `aarch64` ("`arm64`")| [ARM](https://www.arm.com) Hyp |

				| `ppc64le` | [IBM](https://www.ibm.com) Power |

				| `s390x` | [IBM](https://www.ibm.com) Z & LinuxONE SIE |

				### Hardware requirements

				The [Kata Containers runtime](src/runtime) provides a command to

				determine if your host system is capable of running and creating a

				Kata Container:

				```bash

				$ kata-runtime check

				```

				> **Notes:**

				>

				> - This command runs a number of checks including connecting to the

				>   network to determine if a newer release of Kata Containers is

				>   available on GitHub. If you do not wish this check to run, add

				>   the `--no-network-checks` option.

				>

				> - By default, only a brief success / failure message is printed.

				>   If more details are needed, the `--verbose` flag can be used to display the

				>   list of all the checks performed.

				>

				> - If the command is run as the `root` user additional checks are

				>   run (including checking if another incompatible hypervisor is running).

				>   When running as `root`, network checks are automatically disabled.

				## Getting started

				See the [installation documentation](docs/install).

				## Documentation

				See the [official documentation](docs)

				(including [installation guides](docs/install),

				[the developer guide](docs/Developer-Guide.md),

				[design documents](docs/design) and more).

				See the [official documentation](docs) including:

				- [Installation guides](docs/install)

				- [Developer guide](docs/Developer-Guide.md)

				- [Design documents](docs/design)

				  - [Architecture overview](docs/design/architecture)

				  - [Architecture 3.0 overview](docs/design/architecture_3.0/)

				## Configuration

				Kata Containers uses a single

				[configuration file](src/runtime/README.md#configuration)

				which contains a number of sections for various parts of the Kata

				Containers system including the [runtime](src/runtime), the

				[agent](src/agent) and the [hypervisor](#hypervisors).

				## Hypervisors

				See the [hypervisors document](docs/hypervisors.md) and the

				[Hypervisor specific configuration details](src/runtime/README.md#hypervisor-specific-configuration).

				## Community

				@@ -48,6 +109,8 @@ Please raise an issue

				## Developers

				See the [developer guide](docs/Developer-Guide.md).

				### Components

				### Main components

				@@ -57,9 +120,11 @@ The table below lists the core parts of the project:

				| Component | Type | Description |

				|-|-|-|

				| [runtime](src/runtime) | core | Main component run by a container manager and providing a containerd shimv2 runtime implementation. |

				| [runtime-rs](src/runtime-rs) | core | The Rust version of the runtime. |

				| [agent](src/agent) | core | Management process running inside the virtual machine / POD that sets up the container environment. |

				| [`dragonball`](src/dragonball) | core | An optional built-in VMM that brings an out-of-the-box Kata Containers experience with optimizations for container workloads. |

				| [documentation](docs) | documentation | Documentation common to all components (such as design and install documentation). |

				| [tests](https://github.com/kata-containers/tests) | tests | Excludes unit tests which live with the main code. |

				| [tests](tests) | tests | Excludes unit tests which live with the main code. |

				### Additional components

				@@ -70,22 +135,29 @@ The table below lists the remaining parts of the project:

				| [packaging](tools/packaging) | infrastructure | Scripts and metadata for producing packaged binaries<br/>(components, hypervisors, kernel and rootfs). |

				| [kernel](https://www.kernel.org) | kernel | Linux kernel used by the hypervisor to boot the guest image. Patches are stored [here](tools/packaging/kernel). |

				| [osbuilder](tools/osbuilder) | infrastructure | Tool to create "mini O/S" rootfs and initrd images and kernel for the hypervisor. |

				| [`agent-ctl`](tools/agent-ctl) | utility | Tool that provides low-level access for testing the agent. |

				| [`trace-forwarder`](src/trace-forwarder) | utility | Agent tracing helper. |

				| [`ci`](https://github.com/kata-containers/ci) | CI | Continuous Integration configuration files and scripts. |

				| [kata-debug](tools/packaging/kata-debug/README.md) | infrastructure | Utility tool to gather Kata Containers debug information from Kubernetes clusters. |

				| [`agent-ctl`](src/tools/agent-ctl) | utility | Tool that provides low-level access for testing the agent. |

				| [`kata-ctl`](src/tools/kata-ctl) | utility | Tool that provides advanced commands and debug facilities. |

				| [`trace-forwarder`](src/tools/trace-forwarder) | utility | Agent tracing helper. |

				| [`runk`](src/tools/runk) | utility | Standard OCI container runtime based on the agent. |

				| [`ci`](.github/workflows) | CI | Continuous Integration configuration files and scripts. |

				| [`ocp-ci`](ci/openshift-ci/README.md) | CI | Continuous Integration configuration for the OpenShift pipelines. |

				| [`katacontainers.io`](https://github.com/kata-containers/www.katacontainers.io) | Source for the [`katacontainers.io`](https://www.katacontainers.io) site. |

				| [`Webhook`](tools/testing/kata-webhook/README.md) | utility | Example of a simple admission controller webhook to annotate pods with the Kata runtime class |

				### Packaging and releases

				Kata Containers is now

				[available natively for most distributions](docs/install/README.md#packaged-installation-methods).

				However, packaging scripts and metadata are still used to generate snap and GitHub releases. See

				the [components](#components) section for further details.

				## General tests

				See the [tests documentation](tests/README.md).

				## Metrics tests

				See the [metrics documentation](tests/metrics/README.md).

				## Glossary of Terms

				See the [glossary of terms](Glossary.md) related to Kata Containers.

				---

				[kernel]: https://www.kernel.org

				[github-katacontainers.io]: https://github.com/kata-containers/www.katacontainers.io

				See the [glossary of terms](https://github.com/kata-containers/kata-containers/wiki/Glossary) related to Kata Containers.

2

VERSION


@@ -1 +1 @@
 .3.0-rc0
 .24.0

									
										416

ci/README.md
									
										Normal file
									
												
				@@ -0,0 +1,416 @@

				# Kata Containers CI

				> [!WARNING]

				> While this project's CI has several areas for improvement, it is constantly

				> evolving. This document attempts to describe its current state, but due to

				> ongoing changes, you may notice some outdated information here. Feel free to

				> modify/improve this document as you use the CI and notice anything odd. The

				> community appreciates it!

				## Introduction

				The Kata Containers CI relies on [GitHub Actions][gh-actions], where the actions

				themselves can be found in the `.github/workflows` directory, and they may call

				helper scripts, which are located under the `tests` directory, to actually

				perform the tasks required for each test case.

				## The different workflows

				There are a few different sets of workflows that are running as part of our CI,

				and here we're going to cover the ones that are less likely to get rotten.  With

				this said, it's fair to advise that if the reader finds something that got

				rotten, opening an issue to the project pointing to the problem is a nice way to

				help, and providing a fix for the issue is a very encouraging way to help.

				### Jobs that run automatically when a PR is raised

				These are a bunch of tests that will automatically run as soon as a PR is

				opened. They mostly run on "cost free" runners, and they do some

				pre-checks to evaluate whether your PR is ready to start getting reviewed.

				Mind, though, that the community expects the contributors to, at least, build

				their code before submitting a PR, which the community sees as a very fair

				request.

				Without getting into the weeds with details on this, those jobs are the ones

				responsible for ensuring that:

				- The commit message is in the expected format

				- There's no missing Developer's Certificate of Origin

				- Static checks are passing

				### Jobs that require a maintainer's approval to run

				Some tests, which make up our so-called "CI", require a

				maintainer's approval to run, as parts of those jobs run on "paid

				runners", which currently use Azure infrastructure.

				Once a maintainer of the project gives "the green light" (currently by adding an

				`ok-to-test` label to the PR, soon to be changed to commenting "/test" as part

				of a PR review), the following tests will be executed:

				- Build all the components (runs on free cost runners, or bare-metal depending on the architecture)

				- Create a tarball with all the components (runs on free cost runners, or bare-metal depending on the architecture)

				- Create a kata-deploy payload with the tarball generated in the previous step (runs on free cost runners, or bare-metal depending on the architecture)

				- Run the following tests:

				  - Tests depending on the generated tarball

				    - Metrics (runs on bare-metal)

				    - `docker` (runs on cost free runners)

				    - `nerdctl` (runs on cost free runners)

				    - `kata-monitor` (runs on cost free runners)

				    - `cri-containerd` (runs on cost free runners)

				    - `nydus` (runs on cost free runners)

				    - `vfio` (runs on cost free runners)

				  - Tests depending on the generated kata-deploy payload

				    - kata-deploy (runs on cost free runners)

				      - Tests are performed using different "Kubernetes flavors", such as k0s, k3s, rke2, and Azure Kubernetes Service (AKS).

				    - Kubernetes (runs in Azure small and medium instances depending on what's required by each test, and on TEE bare-metal machines)

				      - Tests are performed with different runtime engines, such as CRI-O and containerd.

				      - Tests are performed with different snapshotters for containerd, namely OverlayFS and devmapper.

				      - Tests are performed with all the supported hypervisors, which are Cloud Hypervisor, Dragonball, Firecracker, and QEMU.

				For all the tests relying on Azure instances, real money is being spent, so the

				community asks for the maintainers to be mindful about those, and avoid abusing

				them to merely debug issues.

				## The different runners

				The previous section mentioned several different runners. This section goes through each type of runner used.

				- Cost free runners:  Those are the runners provided by GitHub itself, and

				  those are fairly small machines with virtualization capabilities enabled.

				- Azure small instances: Those are runners which have virtualization

				  capabilities enabled, 2 CPUs, and 8GB of RAM.  These runners have a "-smaller"

				  suffix to their name.

				- Azure normal instances: Those are runners which have virtualization

				  capabilities enabled, 4 CPUs, and 16GB of RAM.  These runners are usually

				  `garm` ones with no "-smaller" suffix.

				- Bare-metal runners: Those are runners provided by community contributors,

				  and they may vary in architecture, size and virtualization capabilities.

				  Builder runners don't actually require any virtualization capabilities, while

				  runners which will be actually performing the tests must have virtualization

				  capabilities and a reasonable amount of CPU and RAM available (at least

				  matching the Azure normal instances).

				## Adding new tests

				Before adding a new test, we strongly recommend going

				through the [GitHub Actions Documentation][gh-actions],

				which provides a sensible background on how to read and understand

				the current tests, and on how to write a new test.

				In Kata Containers land, there are basically two sets of tests: "standalone"

				and "part of something bigger".

				The "standalone" tests, for example the commit message check, won't be covered

				here as they're better covered by the GitHub Actions documentation linked above.

				The "part of something bigger" is the more complicated one and not so

				straightforward to add, so we'll be focusing our efforts on describing the

				addition of those.

				> [!NOTE]

				> TODO: Currently, this document refers to "tests" when it actually means the

				> jobs (or workflows) of GitHub. In an ideal world, except in some specific cases,

				> new tests should be added without the need to add new workflows. In the

				> not-too-distant future (hopefully), we will improve the workflows to support

				> this.

				### Adding a new test that's "part of something bigger"

				The first important thing here is to align expectations, and we must say that

				the community strongly prefers receiving tests that already come with:

				- Instructions on how to run them

				- A proven run where it's passing

				There are several ways to achieve those two requirements, and an example of that

				can be seen in PR #8115.

				With the expectations aligned, adding a test consists of:

				- Adding a new YAML file for your test, and ensuring it's called from the

				  "bigger" yaml. See the [Kata Monitor test example][monitor-ex01].

				- Adding the helper scripts needed for your test to run. Again, use the [Kata Monitor script as example][monitor-ex02].

				Following those examples, the community advice during the review, and even

				asking the community directly on Slack are the best ways to get your test

				accepted.

				## Required tests

				In our CI we have two categories of jobs - required and non-required:

				- Required jobs need to all pass for a PR to be merged normally and

				should cover all the core features on Kata Containers that we want to

				ensure don't have regressions.

				- The non-required jobs are for unstable tests, or for features that

				are experimental and not-fully supported. We'd like those tests to also

				pass on all PRs ideally, but don't block merging if they don't as it's

				not necessarily an indication of the PR code causing regressions.

				### Transitioning between required and non-required status

				Required jobs that fail block merging of PRs, so we want to ensure that

				jobs are stable and maintained before we make them required.

				The [Kata Containers CI Dashboard](https://kata-containers.github.io/)

				is a useful resource to check when collecting evidence of job stability.

				At time of writing it reports the last ten days of Kata CI nightly test

				results for each job. This isn't perfect as it doesn't currently capture

				results on PRs, but is a good guideline for stability.

				> [!NOTE]

				> Below are general guidelines about jobs being marked as

				> required/non-required, but they are subject to change and the Kata

				> Architecture Committee may overrule these guidelines at their

				> discretion.

				#### Initial marking as required

				For new jobs, or jobs that haven't been marked as required recently,

				the criteria to be initially marked as required is ten days

				of passing tests, with no relevant PR failures reported in that time.

				Required jobs also need one or more nominated maintainers that are

				responsible for the stability of their jobs. Maintainers can be registered

				in [`maintainers.yml`](https://github.com/kata-containers/kata-containers.github.io/blob/main/maintainers.yml)

				and will then show on the CI Dashboard.

				To add transparency to making jobs required/non-required and to keep the

				GitHub UI in sync with the [Gatekeeper job](../tools/testing/gatekeeper),

				the process to update a job's required state is as follows:

				1. Create a PR to update `maintainers.yml`, if new maintainers are being

				declared on a CI job.

				1. Create a PR which updates

				[`required-tests.yaml`](../tools/testing/gatekeeper/required-tests.yaml)

				adding the new job and listing the evidence that the job meets the

				requirements above. Ensure that all maintainers and

				@kata-containers/architecture-committee are notified to give them the

				opportunity to review the PR. See

				[#11015](https://github.com/kata-containers/kata-containers/pull/11015)

				as an example.

				1. The maintainers and Architecture Committee get a chance to review the PR.

				It can be discussed in an AC meeting to get broader input.

				1. Once the PR has been merged, a Kata Containers admin should be notified

				to ensure that the GitHub UI is updated to reflect the change in

				`required-tests.yaml`.

				#### Expectation of required job maintainers

				Due to the nature of the Kata Containers community having contributors

				spread around the world, required jobs being blocked due to infrastructure,

				or test issues can have a big impact on work. As such, the expectation is

				that when a problem with a required job is noticed/reported, the maintainers

				have one working day to acknowledge the issue, perform an initial

				investigation and then either fix it, or get it marked as non-required

				whilst the investigation and/or fix is done.

				### Re-marking of required status

				Once a job has been removed from the required list, it requires two

				consecutive successful nightly test runs before being made required

				again.
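
As a rough illustration of the stability rules above (ten days of passing results before a job is first marked as required, two consecutive successful nightly runs before it is re-marked), the check could be sketched as a small shell helper. The function name `last_n_passed` and the `pass`/`fail` strings are assumptions made for this sketch; they are not part of the Gatekeeper tooling or the CI Dashboard.

```bash
# Hypothetical helper: succeed only when the newest N results are all "pass".
# Usage: last_n_passed N result... (results ordered oldest to newest)
last_n_passed() {
    n=$1
    shift
    # Fewer than N results recorded: not enough evidence yet.
    [ "$#" -ge "$n" ] || return 1
    # Drop the older results so that only the newest N remain.
    shift $(( $# - $n ))
    for result in "$@"; do
        [ "$result" = "pass" ] || return 1
    done
    return 0
}

# Example: a job that failed once and then passed twice in a row.
if last_n_passed 2 fail pass pass; then
    echo "eligible to be re-marked as required"
fi
```

The same helper with `N` set to 10 mirrors the initial ten-day criterion, assuming one nightly result per day. The evidence still has to be listed in the PR that updates `required-tests.yaml`.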

				## Running tests

				### Running the tests as part of the CI

				If you're a maintainer of the project, you'll be able to kick off the tests by

				yourself.  With the current approach, you just need to add the `ok-to-test`

				label and the tests will automatically start.  We're moving, though, to use a

				`/test` command as part of a GitHub review comment, which will simplify this

				process.

				If you're not a maintainer, please, send a message on Slack or wait till one of

				the maintainers reviews your PR.  Maintainers will then kick off the tests on

				your behalf.

				In case a test fails and there's the suspicion it happens due to flakiness in

				the test itself, please, create an issue for us, and then re-run (or ask

				maintainers to re-run) the tests following these steps:

				- Locate which test is failing

				- Click on "details"

				- In the top right corner, click on "Re-run jobs"

				- And then on "Re-run failed jobs"

				- And finally click on the green "Re-run jobs" button

				> [!NOTE]

				> TODO: We need figures here

				### Running the tests locally

				In this section, aligning expectations is also something very important, as one

				will not be able to run the tests exactly in the same way the tests are running

				in the CI, as one most likely won't have access to an Azure subscription.

				However, we're trying our best here to provide you with instructions on how to

				run the tests in an environment that's "close enough" and will help you to debug

				issues you find with the current tests, or even provide a proof-of-concept to

				the new test you're trying to add.

				The basic steps, which we will cover in detail below, are:

				 1. Create a VM matching the configuration of the target runner

				 2. Generate the artifacts you'll need for the test, or download them from a

				    current failed run

				 3. Follow the steps provided in the action itself to run the tests.

				Although the general overview looks easy, we know that some tricks need to be

				shared, and we'll go through the general process of debugging one non-Kubernetes

				and one Kubernetes specific test for educational purposes.

				One important thing to note is that "Create a VM" can be done in innumerable

				different ways, using the tools of your choice.  For the sake of simplicity in

				this guide, we'll be using `kcli`, which we strongly recommend in case you're

				an inexperienced user and happen to be developing on a Linux box.

				For both non-Kubernetes and Kubernetes cases, we'll be using PR #8070 as an

				example, which at the time of writing serves the purpose very well, as you can

				see that it has both `nerdctl` and Kubernetes tests failing.

				## Debugging tests

				### Debugging a non-Kubernetes test

				As shown above, the `nerdctl` test is failing.

				As a developer you can go ahead to the details of the job, and expand the job

				that's failing in order to gather more information.

				But when that doesn't help, we need to set up our own environment to debug

				what's going on.

				Taking a look at the `nerdctl` test, which is located here, you can easily see

				that it runs on a `garm-ubuntu-2304-smaller` virtual machine.

				The important parts to understand are `ubuntu-2304`, which is the OS the

				test runs on; and "smaller", which means we're running it on a machine

				with 2 CPUs and 8GB of RAM.

				With this information, we can go ahead and create a similar VM locally using `kcli`.

				```bash

				$ sudo kcli create vm -i ubuntu2304 -P disks=[60] -P numcpus=2 -P memory=8192 -P cpumodel=host-passthrough debug-nerdctl-pr8070

				```
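
The mapping from runner label to VM parameters can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and the only size it knows is "smaller" (2 CPUs, 8GB of RAM), because that is the only one described in this guide. It prints the `kcli` command rather than running it, so you can review it first.

```shell
#!/usr/bin/env bash
# Sketch: derive a kcli command from a runner label such as
# "garm-ubuntu-2304-smaller". Only the "smaller" size is documented in this
# guide (2 CPUs, 8GB of RAM); any other size is rejected instead of guessed.

kcli_cmd_for_runner() {
	local label="$1" vm_name="$2"
	local distro version size cpus memory

	# Expected label shape: garm-<distro>-<version>-<size>
	IFS=- read -r _ distro version size <<< "${label}"

	case "${size}" in
		smaller) cpus=2; memory=8192 ;;
		*) echo "unknown runner size: '${size}'" >&2; return 1 ;;
	esac

	# Print the command instead of executing it.
	echo "sudo kcli create vm -i ${distro}${version} -P disks=[60]" \
		"-P numcpus=${cpus} -P memory=${memory}" \
		"-P cpumodel=host-passthrough ${vm_name}"
}

kcli_cmd_for_runner garm-ubuntu-2304-smaller debug-nerdctl-pr8070
```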

				In order to run the tests, you'll need the "kata-tarball" artifacts, which you

				can build yourself using "make kata-tarball" (see below), or simply get them

				from the PR where the tests failed.  To download them, click on the "Summary"

				button that's on the top left corner, and then scroll down till you see the

				artifacts, as shown below.

				Unfortunately GitHub doesn't give us a link that we can download those from

				inside the VM, but we can download them on our local box, and then `scp` the

				tarball to the newly created VM that will be used for debugging purposes.

				> [!NOTE]

				> Those artifacts are only available (for 15 days) when all jobs are finished.

				Once you have the `kata-static.tar.zst` in your VM, you can log in to the VM with

				`kcli ssh debug-nerdctl-pr8070`, and then clone your development branch:

				```bash

				$ git clone --branch feat_add-fc-runtime-rs https://github.com/nubificus/kata-containers

				```

				Add the upstream as a remote, set up your git identity, and rebase your branch atop the upstream main branch:

				```bash

				$ git remote add upstream https://github.com/kata-containers/kata-containers

				$ git remote update

				$ git config --global user.email "you@example.com"

				$ git config --global user.name "Your Name"

				$ git rebase upstream/main

				```

				Now copy the `kata-static.tar.zst` into your `kata-containers/kata-artifacts` directory

				```bash

				$ mkdir kata-artifacts

				$ cp ../kata-static.tar.zst kata-artifacts/

				```

				> [!NOTE]

				> If you downloaded the .zip from GitHub, you need to uncompress it first to get `kata-static.tar.zst`.
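
Both cases (a plain tarball or the .zip downloaded from GitHub) can be handled by one helper. This is a sketch under assumptions: the function name is hypothetical, `unzip` must be installed in the VM, and the .zip is assumed to contain `kata-static.tar.zst` at its top level.

```shell
#!/usr/bin/env bash
# Sketch: place kata-static.tar.zst into <repo>/kata-artifacts/, unpacking the
# GitHub .zip first when needed. Both paths are supplied by the caller.

stage_kata_tarball() {
	local src="$1" repo_dir="$2"
	local dest="${repo_dir}/kata-artifacts"

	mkdir -p "${dest}"
	case "${src}" in
		*.zip) unzip -o -q "${src}" -d "${dest}" ;;
		*.tar.zst) cp "${src}" "${dest}/kata-static.tar.zst" ;;
		*) echo "unsupported artifact: ${src}" >&2; return 1 ;;
	esac

	# Fail loudly if the tarball did not end up where the tests expect it.
	[[ -f "${dest}/kata-static.tar.zst" ]] || {
		echo "kata-static.tar.zst not found in ${dest}" >&2
		return 1
	}
}
```

For example, `stage_kata_tarball ../kata-static.tar.zst .` run from the root of the checkout is equivalent to the `mkdir` and `cp` commands above.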

				And finally run the tests following what's in the yaml file for the test you're

				debugging.

				In our case, the `run-nerdctl-tests-on-garm.yaml`.

				When looking at the file you'll notice that some environment variables are set,

				such as `KATA_HYPERVISOR`.  You should be aware that, for this particular example,

				the important steps to follow are:

				- Install the dependencies

				- Install kata

				- Run the tests

				Let's now run the steps mentioned above, exporting the expected environment variables:

				```bash

				$ export KATA_HYPERVISOR=dragonball

				$ bash ./tests/integration/nerdctl/gha-run.sh install-dependencies

				$ bash ./tests/integration/nerdctl/gha-run.sh install-kata

				$ bash tests/integration/nerdctl/gha-run.sh run

				```
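
When iterating on a fix, it can help to run the steps through a small wrapper that stops at the first failure and keeps one log per step. The following is a sketch, not part of the repository: `run_gha_steps` and the log directory are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: run the given gha-run.sh steps in order, stop at the first failure,
# and keep one log file per step.

run_gha_steps() {
	local script="$1" log_dir="$2"
	shift 2
	mkdir -p "${log_dir}"

	local step
	for step in "$@"; do
		echo "==> ${step}"
		if ! bash "${script}" "${step}" > "${log_dir}/${step}.log" 2>&1; then
			echo "step '${step}' failed, see ${log_dir}/${step}.log" >&2
			return 1
		fi
	done
}

# Usage, from the root of the kata-containers checkout:
#   export KATA_HYPERVISOR=dragonball
#   run_gha_steps tests/integration/nerdctl/gha-run.sh /tmp/nerdctl-logs \
#       install-dependencies install-kata run
```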

				And with this you should've been able to reproduce exactly the same issue found

				in the CI, and from now on you can build your own code, use your own binaries,

				and have fun debugging and hacking!

				### Debugging a Kubernetes test

				Steps for debugging the Kubernetes tests are very similar to the ones for

				debugging non-Kubernetes tests, with the caveat that what you'll need, this

				time, is not the `kata-static.tar.zst` tarball, but rather a payload to be used

				with kata-deploy.

				In order to generate your own kata-deploy image you can generate your own

				`kata-static.tar.zst` and then take advantage of the following script.  Be aware

				that the image generated and uploaded must be accessible by the VM where you'll

				be performing your tests.

				In case you want to take advantage of the payload that was already generated

				when you faced the CI failure, which is considerably easier, take a look at the

				failed job, then click on "Deploy Kata" and expand the "Final kata-deploy.yaml

				that is used in the test" section.  From there you can see exactly what you'll

				have to use when deploying kata-deploy in your local cluster.

				> [!NOTE]

				> TODO: WAINER TO FINISH THIS PART BASED ON HIS PR TO RUN A LOCAL CI

				## Adding new runners

				Any admin of the project is able to add or remove GitHub runners, and those are

				the folks you should rely on.

				If you need a new runner added, please tag @ac in the Kata Containers Slack,

				and someone from that group will be able to help you.

				If you're part of that group and you're looking for information on how to help

				someone, this is simple, and must be done in private. Basically what you have to

				do is:

				- Go to the kata-containers/kata-containers repo

				- Click on the Settings button, located in the top right corner

				- On the left panel, under "Code and automation", click on "Actions"

				- Click on "Runners"

				If you want to add a new self-hosted runner:

				- In the top right corner there's a green button called "New self-hosted runner"

				If you want to remove a current self-hosted runner:

				- For each runner there's a "..." menu, where you can just click and the

				  "Remove runner" option will show up

				## Known limitations

				As the GitHub actions are structured right now, we cannot test, as part of a

				PR, the addition of a GitHub action that's not triggered by a `pull_request` event.

				[gh-actions]: https://docs.github.com/en/actions

				[monitor-ex01]: https://github.com/kata-containers/kata-containers/commit/a3fb067f1bccde0cbd3fd4d5de12dfb3d8c28b60

				[monitor-ex02]: https://github.com/kata-containers/kata-containers/commit/489caf1ad0fae27cfd00ba3c9ed40e3d512fa492

									
										51

ci/darwin-test.sh
									
										Executable file
									
												
				@@ -0,0 +1,51 @@

				#!/usr/bin/env bash

				#

				# Copyright (c) 2022 Apple Inc.

				#

				# SPDX-License-Identifier: Apache-2.0

				set -e

				cidir=$(dirname "$0")

				runtimedir=${cidir}/../src/runtime

				genpolicydir=${cidir}/../src/tools/genpolicy

				build_working_packages() {

					# working packages:

					device_api=${runtimedir}/pkg/device/api

					device_config=${runtimedir}/pkg/device/config

					device_drivers=${runtimedir}/pkg/device/drivers

					device_manager=${runtimedir}/pkg/device/manager

					rc_pkg_dir=${runtimedir}/pkg/resourcecontrol/

					utils_pkg_dir=${runtimedir}/virtcontainers/utils

					# broken packages :( :

					#katautils=$runtimedir/pkg/katautils

					#oci=$runtimedir/pkg/oci

					#vc=$runtimedir/virtcontainers

					pkgs=(

						"${device_api}"

						"${device_config}"

						"${device_drivers}"

						"${device_manager}"

						"${utils_pkg_dir}"

						"${rc_pkg_dir}")

					for pkg in "${pkgs[@]}"; do

						echo building "${pkg}"

						pushd "${pkg}" &>/dev/null

						go build

						go test

						popd &>/dev/null

					done

				}

				build_working_packages

				build_genpolicy() {

					echo "building genpolicy"

					pushd "${genpolicydir}" &>/dev/null

					make TRIPLE=aarch64-apple-darwin build

				}

				build_genpolicy

									
										12

ci/docs-url-alive-check.sh
									
										Executable file
									
												
				@@ -0,0 +1,12 @@

				#!/bin/bash

				#

				# Copyright (c) 2021 Easystack Inc.

				#

				# SPDX-License-Identifier: Apache-2.0

				set -e

				cidir=$(dirname "$0")

				source "${cidir}/../tests/common.bash"

				run_docs_url_alive_check

									
										184

ci/gh-util.sh
									
										Executable file
									
												
				@@ -0,0 +1,184 @@

				#!/bin/bash

				# Copyright (c) 2020 Intel Corporation

				# Copyright (c) 2024 IBM Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				set -o errexit

				set -o errtrace

				set -o nounset

				set -o pipefail

				[[ -n "${DEBUG:-}" ]] && set -o xtrace

				script_name=${0##*/}

				#---------------------------------------------------------------------

				die()

				{

				    echo >&2 "$*"

				    exit 1

				}

				usage()

				{

				    cat <<EOF

				Usage: ${script_name} [OPTIONS] [command] [arguments]

				Description: Utility to expand the abilities of the GitHub CLI tool, gh.

				Command descriptions:

				  list-issues-for-pr     List issues linked to a PR.

				  list-labels-for-issue  List labels, in json format for an issue

				Commands and arguments:

				  list-issues-for-pr <pr>

				  list-labels-for-issue <issue>

				Options:

				 -h                 Show this help statement.

				 -r <owner/repo>    Optional <org/repo> specification. Default: 'kata-containers/kata-containers'

				Examples:

				- List issues for a Pull Request 123 in kata-containers/kata-containers repo

				  $ ${script_name} list-issues-for-pr 123

				EOF

				}

				list_issues_for_pr()

				{

				    local pr="${1:-}"

				    local repo="${2:-kata-containers/kata-containers}"

				    [[ -z "${pr}" ]] && die "need PR"

				    local commits

					commits=$(gh pr view "${pr}" --repo "${repo}" --json commits --jq '.commits[].messageBody')

				    [[ -z "${commits}" ]] && die "cannot determine commits for PR ${pr}"

				    # Extract the issue number(s) from the commits.

				    #

				    # This needs to be careful to take account of lines like this:

				    #

				    # fixes 99

				    # fixes: 77

				    # fixes #123.

				    # Fixes: #1, #234, #5678.

				    #

				    # Note the exclusion of lines starting with whitespace which is

				    # specifically to ignore vendored git log comments, which are whitespace

				    # indented and in the format:

				    #

				    #     "<git-commit> <git-commit-msg>"

				    #

				    local issues

					issues=$(echo "${commits}" |\

				        grep -v -E "^( |	)" |\

				        grep -i -E "fixes:* *(#*[0-9][0-9]*)" |\

				        tr ' ' '\n' |\

				        grep "[0-9][0-9]*" |\

				        sed 's/[.,\#]//g' |\

				        sort -nu || true)

				    [[ -z "${issues}" ]] && die "cannot determine issues for PR ${pr}"

				    echo "# Issues linked to PR"

				    echo "#"

				    echo "# Fields: issue_number"

				    local issue

				    echo "${issues}" | while read -r issue

				    do

				        printf "%s\n" "${issue}"

				    done

				}

				list_labels_for_issue()

				{

				    local issue="${1:-}"

				    [[ -z "${issue}" ]] && die "need issue number"

				    local labels

					labels=$(gh issue view "${issue}" --repo kata-containers/kata-containers --json labels)

				    [[ -z "${labels}" ]] && die "cannot determine labels for issue ${issue}"

				    echo "${labels}"

				}

				setup()

				{

				    for cmd in gh jq

				    do

				        command -v "${cmd}" &>/dev/null || die "need command: ${cmd}"

				    done

				}

				handle_args()

				{

				    setup

				    local opt

				    while getopts "hr:" opt "$@"

				    do

				        case "${opt}" in

				            h) usage && exit 0 ;;

				            r) repo="${OPTARG}" ;;

							*) echo "use '-h' to get list of supported arguments" && exit 1 ;;

				        esac

				    done

				    shift $((OPTIND - 1))

				    local repo="${repo:-kata-containers/kata-containers}"

				    local cmd="${1:-}"

				    case "${cmd}" in

				        list-issues-for-pr) ;;

				        list-labels-for-issue) ;;

				        "") usage && exit 0 ;;

				        *) die "invalid command: '${cmd}'" ;;

				    esac

				    # Consume the command name

				    shift

				    local issue=""

				    local pr=""

				    case "${cmd}" in

				        list-issues-for-pr)

				            pr="${1:-}"

				            list_issues_for_pr "${pr}" "${repo}"

				            ;;

				        list-labels-for-issue)

				            issue="${1:-}"

				            list_labels_for_issue "${issue}"

				            ;;

				        *) die "impossible situation: cmd: '${cmd}'" ;;

				    esac

				    exit 0

				}

				main()

				{

				    handle_args "$@"

				}

				main "$@"

									
										11

ci/go-test.sh
									
											
				@@ -1,11 +0,0 @@

				#

				# Copyright (c) 2020 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				set -e

				cidir=$(dirname "$0")

				source "${cidir}/lib.sh"

				run_go_test

									
										22

ci/install_go.sh
									
											
				@@ -1,22 +0,0 @@

				#!/bin/bash

				#

				# Copyright (c) 2019 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				set -e

				cidir=$(dirname "$0")

				source "${cidir}/lib.sh"

				clone_tests_repo

				new_goroot=/usr/local/go

				pushd "${tests_repo_dir}"

				# Force overwrite the current version of golang

				[ -z "${GOROOT}" ] || rm -rf "${GOROOT}"

				.ci/install_go.sh -p -f -d "$(dirname ${new_goroot})"

				[ -z "${GOROOT}" ] || sudo ln -sf "${new_goroot}" "${GOROOT}"

				go version

				popd

									
										146

ci/install_libseccomp.sh
									
												
				@@ -1,4 +1,4 @@

				#!/bin/bash

				#!/usr/bin/env bash

				#

				# Copyright 2021 Sony Group Corporation

				#

				@@ -7,103 +7,129 @@

				set -o errexit

				cidir=$(dirname "$0")

				source "${cidir}/lib.sh"

				script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

				clone_tests_repo

				source "${script_dir}/../tests/common.bash"

				source "${tests_repo_dir}/.ci/lib.sh"

				# Path to the ORAS cache helper for downloading tarballs (sourced when needed)

				# Use ORAS_CACHE_HELPER env var (set by build.sh in Docker) or fallback to repo path

				oras_cache_helper="${ORAS_CACHE_HELPER:-${script_dir}/../tools/packaging/scripts/download-with-oras-cache.sh}"

				# The following variables if set on the environment will change the behavior

				# of gperf and libseccomp configure scripts, that may lead this script to

				# fail. So let's ensure they are unset here.

				unset PREFIX DESTDIR

				arch=$(uname -m)

				arch=${ARCH:-$(uname -m)}

				workdir="$(mktemp -d --tmpdir build-libseccomp.XXXXX)"

				# Variables for libseccomp

				# Currently, specify the libseccomp version directly without using `versions.yaml`

				# because the current Snap workflow is incomplete.

				# After solving the issue, replace this code by using the `versions.yaml`.

				# libseccomp_version=$(get_version "externals.libseccomp.version")

				# libseccomp_url=$(get_version "externals.libseccomp.url")

				libseccomp_version="2.5.1"

				libseccomp_url="https://github.com/seccomp/libseccomp"

				libseccomp_version="${LIBSECCOMP_VERSION:-""}"

				if [[ -z "${libseccomp_version}" ]]; then

					libseccomp_version=$(get_from_kata_deps ".externals.libseccomp.version")

				fi

				libseccomp_url="${LIBSECCOMP_URL:-""}"

				if [[ -z "${libseccomp_url}" ]]; then

					libseccomp_url=$(get_from_kata_deps ".externals.libseccomp.url")

				fi

				libseccomp_tarball="libseccomp-${libseccomp_version}.tar.gz"

				libseccomp_tarball_url="${libseccomp_url}/releases/download/v${libseccomp_version}/${libseccomp_tarball}"

				cflags="-O2"

				# Variables for gperf

				# Currently, specify the gperf version directly without using `versions.yaml`

				# because the current Snap workflow is incomplete.

				# After solving the issue, replace this code by using the `versions.yaml`.

				# gperf_version=$(get_version "externals.gperf.version")

				# gperf_url=$(get_version "externals.gperf.url")

				gperf_version="3.1"

				gperf_url="https://ftp.gnu.org/gnu/gperf"

				gperf_version="${GPERF_VERSION:-""}"

				if [[ -z "${gperf_version}" ]]; then

					gperf_version=$(get_from_kata_deps ".externals.gperf.version")

				fi

				gperf_url="${GPERF_URL:-""}"

				if [[ -z "${gperf_url}" ]]; then

					gperf_url=$(get_from_kata_deps ".externals.gperf.url")

				fi

				gperf_tarball="gperf-${gperf_version}.tar.gz"

				gperf_tarball_url="${gperf_url}/${gperf_tarball}"

				# We need to build the libseccomp library from sources to create a static library for the musl libc.

				# However, ppc64le and s390x have no musl targets in Rust. Hence, we do not set cflags for the musl libc.

				if ([ "${arch}" != "ppc64le" ] && [ "${arch}" != "s390x" ]); then

				    # Set FORTIFY_SOURCE=1 because the musl-libc does not have some functions about FORTIFY_SOURCE=2

				    cflags="-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=1 -O2"

				# Use ORAS cache for gperf downloads (gperf upstream can be unreliable)

				USE_ORAS_CACHE="${USE_ORAS_CACHE:-yes}"

				# We need to build the libseccomp library from sources to create a static

				# library for the musl libc.

				# However, ppc64le, riscv64 and s390x have no musl targets in Rust. Hence, we do

				# not set cflags for the musl libc.

				if [[ "${arch}" != "ppc64le" ]] && [[ "${arch}" != "riscv64" ]] && [[ "${arch}" != "s390x" ]]; then

					# Set FORTIFY_SOURCE=1 because the musl-libc does not have some functions about FORTIFY_SOURCE=2

					cflags="-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=1 -O2"

				fi

				die() {

				    msg="$*"

				    echo "[Error] ${msg}" >&2

				    exit 1

					msg="$*"

					echo "[Error] ${msg}" >&2

					exit 1

				}

				finish() {

				    rm -rf "${workdir}"

					rm -rf "${workdir}"

				}

				trap finish EXIT

				build_and_install_gperf() {

				    echo "Build and install gperf version ${gperf_version}"

				    mkdir -p "${gperf_install_dir}"

				    curl -sLO "${gperf_tarball_url}"

				    tar -xf "${gperf_tarball}"

				    pushd "gperf-${gperf_version}"

				    ./configure --prefix="${gperf_install_dir}"

				    make

				    make install

				    export PATH=$PATH:"${gperf_install_dir}"/bin

				    popd

				    echo "Gperf installed successfully"

					echo "Build and install gperf version ${gperf_version}"

					mkdir -p "${gperf_install_dir}"

					# Use ORAS cache if available and enabled

					if [[ "${USE_ORAS_CACHE}" == "yes" ]] && [[ -f "${oras_cache_helper}" ]]; then

						echo "Using ORAS cache for gperf download"

						source "${oras_cache_helper}"

						local cached_tarball

						cached_tarball=$(download_component gperf "$(pwd)")

						if [[ -f "${cached_tarball}" ]]; then

							gperf_tarball="${cached_tarball}"

						else

							echo "ORAS cache download failed, falling back to direct download"

							curl -sLO "${gperf_tarball_url}"

						fi

					else

						curl -sLO "${gperf_tarball_url}"

					fi

					tar -xf "${gperf_tarball}"

					pushd "gperf-${gperf_version}"

					# Unset $CC for configure, we will always use native for gperf

					CC="" ./configure --prefix="${gperf_install_dir}"

					make

					make install

					export PATH=${PATH}:"${gperf_install_dir}"/bin

					popd

					echo "Gperf installed successfully"

				}

				build_and_install_libseccomp() {

				    echo "Build and install libseccomp version ${libseccomp_version}"

				    mkdir -p "${libseccomp_install_dir}"

				    curl -sLO "${libseccomp_tarball_url}"

				    tar -xf "${libseccomp_tarball}"

				    pushd "libseccomp-${libseccomp_version}"

				    ./configure --prefix="${libseccomp_install_dir}" CFLAGS="${cflags}" --enable-static

				    make

				    make install

				    popd

				    echo "Libseccomp installed successfully"

					echo "Build and install libseccomp version ${libseccomp_version}"

					mkdir -p "${libseccomp_install_dir}"

					curl -sLO "${libseccomp_tarball_url}"

					tar -xf "${libseccomp_tarball}"

					pushd "libseccomp-${libseccomp_version}"

					[[ "${arch}" == $(uname -m) ]] && cc_name="" || cc_name="${arch}-linux-gnu-gcc"

					CC=${cc_name} ./configure --prefix="${libseccomp_install_dir}" CFLAGS="${cflags}" --enable-static --host="${arch}"

					make

					make install

					popd

					echo "Libseccomp installed successfully"

				}

				main() {

				    local libseccomp_install_dir="${1:-}"

				    local gperf_install_dir="${2:-}"

					local libseccomp_install_dir="${1:-}"

					local gperf_install_dir="${2:-}"

				    if [ -z "${libseccomp_install_dir}" ] || [ -z "${gperf_install_dir}" ]; then

				        die "Usage: ${0} <libseccomp-install-dir> <gperf-install-dir>"

				    fi

					if [[ -z "${libseccomp_install_dir}" ]] || [[ -z "${gperf_install_dir}" ]]; then

						die "Usage: ${0} <libseccomp-install-dir> <gperf-install-dir>"

					fi

				    pushd "$workdir"

				    # gperf is required for building the libseccomp.

				    build_and_install_gperf

				    build_and_install_libseccomp

				    popd

					pushd "${workdir}"

					# gperf is required for building the libseccomp.

					build_and_install_gperf

					build_and_install_libseccomp

					popd

				}

				main "$@"

									
										24

ci/install_musl.sh
									
											
				@@ -1,24 +0,0 @@

				#!/bin/bash

				# Copyright (c) 2020 Ant Group

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				set -e

				install_aarch64_musl() {

					local arch=$(uname -m)

					if [ "${arch}" == "aarch64" ]; then

						local musl_tar="${arch}-linux-musl-native.tgz"

						local musl_dir="${arch}-linux-musl-native"

						pushd /tmp

						if curl -sLO --fail https://musl.cc/${musl_tar}; then

							tar -zxf ${musl_tar}

							mkdir -p /usr/local/musl/

							cp -r ${musl_dir}/* /usr/local/musl/

						fi

						popd

					fi

				}

				install_aarch64_musl

									
										16

ci/install_rust.sh
									
											
				@@ -1,16 +0,0 @@

				#!/bin/bash

				# Copyright (c) 2019 Ant Financial

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				set -e

				cidir=$(dirname "$0")

				source "${cidir}/lib.sh"

				clone_tests_repo

				pushd ${tests_repo_dir}

				.ci/install_rust.sh ${1:-}

				popd

									
										19

ci/install_vc.sh
									
											
				@@ -1,19 +0,0 @@

				#!/bin/bash

				#

				# Copyright (c) 2018 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				set -e

				cidir=$(dirname "$0")

				vcdir="${cidir}/../src/runtime/virtcontainers/"

				source "${cidir}/lib.sh"

				export CI_JOB="${CI_JOB:-default}"

				clone_tests_repo

				if [ "${CI_JOB}" != "PODMAN" ]; then

					echo "Install virtcontainers"

					make -C "${vcdir}" && chronic sudo make -C "${vcdir}" install

				fi

									
										59

ci/install_yq.sh
									
												
				@@ -5,28 +5,57 @@

				# SPDX-License-Identifier: Apache-2.0

				#

				[[ -n "${DEBUG}" ]] && set -o xtrace

				# If we fail for any reason a message will be displayed

				die() {

					msg="$*"

					echo "ERROR: $msg" >&2

					echo "ERROR: ${msg}" >&2

					exit 1

				}

				function verify_yq_exists() {

					local yq_path=$1

					local yq_version=$2

					local expected="yq (https://github.com/mikefarah/yq/) version ${yq_version}"

					if [[ -x  "${yq_path}" ]] && [[ "$(${yq_path} --version)"X == "${expected}"X ]]; then

						return 0

					else

						return 1

					fi

				}

				# Install the yq yaml query package from the mikefarah github repo

				# Install via binary download, as we may not have golang installed at this point

				function install_yq() {

					local yq_pkg="github.com/mikefarah/yq"

					local yq_version=3.4.1

					local yq_version=v4.44.5

					local precmd=""

					local yq_path=""

					INSTALL_IN_GOPATH=${INSTALL_IN_GOPATH:-true}

					if [ "${INSTALL_IN_GOPATH}"  == "true" ];then

					if [[ "${INSTALL_IN_GOPATH}" == "true" ]]; then

						GOPATH=${GOPATH:-${HOME}/go}

						mkdir -p "${GOPATH}/bin"

						local yq_path="${GOPATH}/bin/yq"

						yq_path="${GOPATH}/bin/yq"

					else

						yq_path="/usr/local/bin/yq"

					fi

					[ -x  "${yq_path}" ] && [ "`${yq_path} --version`"X == "yq version ${yq_version}"X ] && return

					if verify_yq_exists "${yq_path}" "${yq_version}"; then

						echo "yq is already installed in correct version"

						return

					fi

					if [[ "${yq_path}" == "/usr/local/bin/yq" ]]; then

						# Check if we need sudo to install yq

						if [[ ! -w "/usr/local/bin" ]]; then

							# Check if we have sudo privileges

							if ! sudo -n true 2>/dev/null; then

								die "Please provide sudo privileges to install yq"

							else

								precmd="sudo"

							fi

						fi

					fi

					read -r -a sysInfo <<< "$(uname -sm)"

				@@ -43,6 +72,19 @@ function install_yq() {

					"aarch64")

						goarch=arm64

						;;

					"arm64")

						# If we're on an apple silicon machine, just assign amd64. 

						# The version of yq we use doesn't have a darwin arm build, 

						# but Rosetta can come to the rescue here.

						if [[ ${goos} == "Darwin" ]]; then

							goarch=amd64

						else 

							goarch=arm64

						fi

						;;

					"riscv64")

						goarch=riscv64

						;;

					"ppc64le")

						goarch=ppc64le

						;;

				@@ -64,10 +106,9 @@ function install_yq() {

					fi

					## NOTE: ${var,,} => gives lowercase value of var

					local yq_url="https://${yq_pkg}/releases/download/${yq_version}/yq_${goos,,}_${goarch}"

					curl -o "${yq_path}" -LSsf "${yq_url}"

					[ $? -ne 0 ] && die "Download ${yq_url} failed"

					chmod +x "${yq_path}"

					local yq_url="https://${yq_pkg}/releases/download/${yq_version}/yq_${goos}_${goarch}"

					${precmd} curl -o "${yq_path}" -LSsf "${yq_url}" || die "Download ${yq_url} failed"

					${precmd} chmod +x "${yq_path}"

					if ! command -v "${yq_path}" >/dev/null; then

						die "Cannot get ${yq_path} executable"

									
										46

ci/lib.sh
									
											
				@@ -1,46 +0,0 @@

				#

				# Copyright (c) 2018 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				set -o nounset

				export tests_repo="${tests_repo:-github.com/kata-containers/tests}"

				export tests_repo_dir="$GOPATH/src/$tests_repo"

				export branch="${target_branch:-main}"

				# Clones the tests repository and checkout to the branch pointed out by

				# the global $branch variable.

				# If the clone exists and `CI` is exported then it does nothing. Otherwise

				# it will clone the repository or `git pull` the latest code.

				#

				clone_tests_repo()

				{

					if [ -d "$tests_repo_dir" ]; then

						[ -n "${CI:-}" ] && return

						pushd "${tests_repo_dir}"

						git checkout "${branch}"

						git pull

						popd

					else

						git clone -q "https://${tests_repo}" "$tests_repo_dir"

						pushd "${tests_repo_dir}"

						git checkout "${branch}"

						popd

					fi

				}

				run_static_checks()

				{

					clone_tests_repo

					# Make sure we have the targeting branch

					git remote set-branches --add origin "${branch}"

					git fetch -a

					bash "$tests_repo_dir/.ci/static-checks.sh" "github.com/kata-containers/kata-containers"

				}

				run_go_test()

				{

					clone_tests_repo

					bash "$tests_repo_dir/.ci/go-test.sh"

				}

									
										157

ci/openshift-ci/README.md
									
										Normal file
									
												
				@@ -0,0 +1,157 @@

				OpenShift CI

				============

				This directory contains scripts used by

				[the OpenShift CI](https://github.com/openshift/release/tree/master/ci-operator/config/kata-containers/kata-containers)

				pipelines to monitor selected functional tests on OpenShift.

				There are 2 pipelines; their history and logs can be accessed here:

				* [main - currently supported OCP](https://prow.ci.openshift.org/job-history/gs/origin-ci-test/logs/periodic-ci-kata-containers-kata-containers-main-e2e-tests)

				* [next - currently under development OCP](https://prow.ci.openshift.org/job-history/gs/origin-ci-test/logs/periodic-ci-kata-containers-kata-containers-main-next-e2e-tests)

				Running openshift-tests on OCP with kata-containers manually

				============================================================

				To run openshift-tests (or other suites) with kata-containers one can use

				the kata-webhook. To deploy everything you can mimic the CI pipeline by:

				```bash

				#!/bin/bash -e

				# Set up your kubectl and check that the cluster is accessible by

				kubectl get nodes

				# Deploy kata (set KATA_DEPLOY_IMAGE to override the default kata-deploy-ci:latest image)

				./test.sh

				# Deploy the webhook

				KATA_RUNTIME=kata-qemu cluster/deploy_webhook.sh

				```

				This should ensure kata-containers as well as kata-webhook are installed and

				working. Before running the openshift-tests it's (currently) recommended to

				ignore some security features by:

				```bash

				#!/bin/bash -e

				oc adm policy add-scc-to-group privileged system:authenticated system:serviceaccounts

				oc adm policy add-scc-to-group anyuid system:authenticated system:serviceaccounts

				oc label --overwrite ns default pod-security.kubernetes.io/enforce=privileged pod-security.kubernetes.io/warn=baseline pod-security.kubernetes.io/audit=baseline

				```

				Now you should be ready to run the openshift-tests. Our CI only uses a subset

				of tests, to get the current ``TEST_SKIPS`` see

				[the pipeline config](https://github.com/openshift/release/tree/master/ci-operator/config/kata-containers/kata-containers).

				The following steps require the [openshift tests](https://github.com/openshift/origin)

				to be cloned and built in the current directory:

				```bash

				#!/bin/bash -e

				# Define tests to be skipped (see the pipeline config for the current version)

				TEST_SKIPS="\[sig-node\] Security Context should support seccomp runtime/default\|\[sig-node\] Variable Expansion should allow substituting values in a volume subpath\|\[k8s.io\] Probing container should be restarted with a docker exec liveness probe with timeout\|\[sig-node\] Pods Extended Pod Container lifecycle evicted pods should be terminal\|\[sig-node\] PodOSRejection \[NodeConformance\] Kubelet should reject pod when the node OS doesn't match pod's OS\|\[sig-network\].*for evicted pods\|\[sig-network\].*HAProxy router should override the route\|\[sig-network\].*HAProxy router should serve a route\|\[sig-network\].*HAProxy router should serve the correct\|\[sig-network\].*HAProxy router should run\|\[sig-network\].*when FIPS.*the HAProxy router\|\[sig-network\].*bond\|\[sig-network\].*all sysctl on whitelist\|\[sig-network\].*sysctls should not affect\|\[sig-network\] pods should successfully create sandboxes by adding pod to network"

				# Get the list of tests to be executed

				TESTS="$(./openshift-tests run --dry-run --provider "${TEST_PROVIDER}" "${TEST_SUITE}")"

				# Store the list of tests in /tmp/tsts file

				echo "${TESTS}" | grep -v "$TEST_SKIPS" > /tmp/tsts

				# Remove previously-existing temporary files as well as previous results

				OUT=RESULTS/tmp

				rm -Rf /tmp/*test* /tmp/e2e-*

				rm -Rf "$OUT"

				mkdir -p $OUT

				# Run the tests ignoring the monitor health checks

				./openshift-tests run --provider azure -o "$OUT/job.log" --junit-dir "$OUT" --file /tmp/tsts --max-parallel-tests 5 --cluster-stability Disruptive --run '^\[sig-node\].*|^\[sig-network\]'

				```
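The skip list above is a single basic regular expression whose alternatives are joined with `\|`, and `grep -v` drops every test name matching any of them. A minimal sketch of that filtering follows; the shortened skip list and the test names are made up for illustration, and the `\|` alternation assumes GNU grep.

```bash
#!/bin/bash
# Hypothetical, shortened skip list; the real one lives in the pipeline config.
SKIPS="\[sig-network\].*bond\|\[sig-node\] Variable Expansion"

# Made-up test names standing in for the `openshift-tests run --dry-run` output.
CANDIDATES='"[sig-node] Pods should start"
"[sig-network] interface bond should work"
"[sig-node] Variable Expansion should allow substituting values"'

# Keep only the lines that match none of the skip patterns.
filter_tests() {
	grep -v "$1"
}

echo "${CANDIDATES}" | filter_tests "${SKIPS}"
```

Only the first candidate survives, because the other two match one of the alternatives.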

				[!NOTE]

				We ignore the cluster stability checks because our public cloud is

				not that stable and running with VMs instead of containers results in minor

				stability issues. Some of the old monitor stability tests do not respect

				the ``--cluster-stability`` setting; simply ignore those. If you get a

				message like ``invariant was violated`` or

				``error: failed due to a MonitorTest failure``, it usually indicates that

				only those kinds of tests failed while the real tests passed. See

				[wrapped-openshift-tests.sh](https://github.com/openshift/release/blob/master/ci-operator/config/kata-containers/kata-containers/wrapped-openshift-tests.sh)

				for details on how our pipeline deals with that.

				[!TIP]

				To compare multiple results locally, you can use the

				[junit2html](https://github.com/inorton/junit2html) tool.

				Best-effort kata-containers cleanup

				===================================

				If you need to clean up the cluster after testing, you can use the

				``cleanup.sh`` script from the current directory. It tries to delete all

				resources created by ``test.sh`` as well as ``cluster/deploy_webhook.sh``

				ignoring all failures. The primary purpose of this script is to allow

				soft-cleanup after deployment to test different versions without

				re-provisioning everything.

				[!WARNING]

				Do not rely on this script in production; return codes are not checked!
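To illustrate what "best effort" means here, the sketch below runs each step, notes a failure, and carries on regardless. It is not how ``cleanup.sh`` is written (that script simply does not check return codes); the `best_effort` helper and the demo commands are hypothetical.

```bash
#!/bin/bash
# Count failures instead of aborting on them.
failures=0

# Run a command; on failure, log it and keep going.
best_effort() {
	"$@" || { echo "ignored failure: $*" >&2; failures=$((failures + 1)); }
}

# Stand-ins for the real cleanup steps (for example `oc delete -f ...`).
best_effort false
best_effort true
best_effort ls /nonexistent-path-for-demo

echo "cleanup finished with ${failures} ignored failure(s)"
```

A wrapper like this makes the ignored errors visible, which is the main thing a production-grade cleanup would need on top of the soft cleanup described above.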

				Bisecting e2e test failures

				============================

				Let's say the OCP pipeline passed running with

				``quay.io/kata-containers/kata-deploy-ci:kata-containers-d7afd31fd40e37a675b25c53618904ab57e74ccd-amd64``

				but failed running with

				``quay.io/kata-containers/kata-deploy-ci:kata-containers-9f512c016e75599a4a921bd84ea47559fe610057-amd64``

				and you'd like to know which PR caused the regression. You can either run with

				all 60 tags in between, or you can use the [bisecter](https://github.com/ldoktor/bisecter)

				to reduce the number of steps needed.
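The saving comes from binary search: with an ordered list of tags where everything before the culprit passes and everything from it onwards fails, about log2(N) runs are enough. The sketch below only illustrates that idea and is not bisecter's implementation; `first_bad` and `fake_reproducer` are hypothetical names, and the list is assumed to end with a failing item.

```bash
#!/bin/bash
# Print the first "bad" item of an ordered list, assuming all items before it
# are good and the last item is bad.
first_bad() {
	local check="$1"; shift
	local items=("$@")
	local lo=0 hi=$(( ${#items[@]} - 1 ))
	while (( lo < hi )); do
		local mid=$(( (lo + hi) / 2 ))
		if "${check}" "${items[mid]}"; then
			lo=$(( mid + 1 ))   # mid passes, the culprit is later
		else
			hi=${mid}           # mid fails, the culprit is mid or earlier
		fi
	done
	echo "${items[lo]}"
}

# Hypothetical reproducer: "tags" numbered 7 and above fail.
fake_reproducer() { (( $1 < 7 )); }

# 60 candidates are resolved in 6 checks instead of 60.
first_bad fake_reproducer $(seq 1 60)
```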

				Before running the bisection you need a reproducer script. A sample one called

				``sample-test-reproducer.sh`` is provided in this directory, but you might

				want to copy and modify it, especially:

				* ``OCP_DIR`` - directory where your openshift/release is located (can be exported)

				* ``E2E_TEST`` - openshift-test(s) to be executed (can be exported)

				* behaviour of SETUP (returning 125 skips the current image tag, returning

				  >=128 interrupts the execution, everything else reports the tag as a failure)

				* what should be executed (perhaps running the setup is enough for you or

				  you might want to be looking for specific failures...)

				* use ``timeout`` to interrupt execution in case you know things should be faster
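The return-code convention from the list above can be sketched as a small classifier. Treating 0 as a pass is an assumption borrowed from `git bisect run`; `classify_exit_code` is a hypothetical helper and not part of ``sample-test-reproducer.sh``.

```bash
#!/bin/bash
# Map a reproducer exit code to the outcome described in the list above.
classify_exit_code() {
	local rc="$1"
	if [[ "${rc}" -eq 0 ]]; then
		echo "good"    # assumption: 0 means the tag passed
	elif [[ "${rc}" -eq 125 ]]; then
		echo "skip"    # this image tag cannot be tested
	elif [[ "${rc}" -ge 128 ]]; then
		echo "abort"   # interrupt the whole bisection
	else
		echo "bad"     # everything else reports the tag as a failure
	fi
}

for rc in 0 1 125 130; do
	echo "exit ${rc} -> $(classify_exit_code "${rc}")"
done
```

GNU `timeout` exits with 124 when the time limit is hit, so under this convention a timed-out step would count as a failure rather than a skip.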

				Executing that script with the GOOD commit should pass

				``./sample-test-reproducer.sh quay.io/kata-containers/kata-deploy-ci:kata-containers-d7afd31fd40e37a675b25c53618904ab57e74ccd-amd64``

				and fail when executed with the BAD commit

				``./sample-test-reproducer.sh quay.io/kata-containers/kata-deploy-ci:kata-containers-9f512c016e75599a4a921bd84ea47559fe610057-amd64``.

				To get the list of all tags in between those two PRs, you can use the

				``bisect-range.sh`` script:

				```bash

				./bisect-range.sh d7afd31fd40e37a675b25c53618904ab57e74ccd 9f512c016e75599a4a921bd84ea47559fe610057

				```

				[!NOTE]

				The tagged images are only built per PR, not for individual commits. See

				[kata-deploy-ci](https://quay.io/kata-containers/kata-deploy-ci) to see the

				available images.
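The image references used throughout this section follow the pattern `<repo>:kata-containers-<merge commit SHA>-<arch>`. A minimal sketch of building one reference and joining several into the comma-separated form follows; `image_for_commit` and `join_images` are hypothetical helpers, not functions from ``bisect-range.sh``.

```bash
#!/bin/bash
REPO="quay.io/kata-containers/kata-deploy-ci"
ARCH="amd64"

# Build the full image reference for a merge commit SHA.
image_for_commit() {
	echo "${REPO}:kata-containers-${1}-${ARCH}"
}

# Join all arguments with commas.
join_images() {
	local IFS=,
	echo "$*"
}

join_images \
	"$(image_for_commit d7afd31fd40e37a675b25c53618904ab57e74ccd)" \
	"$(image_for_commit 9f512c016e75599a4a921bd84ea47559fe610057)"
```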

				To find out which PR caused this regression, you can either manually try the

				individual commits or you can simply execute:

				```bash

				bisecter start "$(./bisect-range.sh d7afd31fd40 9f512c016)"

				OCP_DIR=/path/to/openshift/release bisecter run ./sample-test-reproducer.sh

				```

				[!NOTE]

				If you use ``KATA_WITH_SYSTEM_QEMU=yes`` you might want to deploy once with

				it and skip it for the cleanup. That way you might (in most cases) test

				all images with a single MCP update instead of per-image MCP update.

				[!TIP]

				You can check the bisection progress during/after execution by running

				``bisecter log`` from the current directory. Before starting a new

				bisection you need to execute ``bisecter reset``.

				Peer pods

				=========

				It's possible to run similar testing on peer-pods using cloud-api-adaptor.

				Our CI configuration to run inside Azure's OCP is in ``peer-pods-azure.sh``

				and can be used to replace the ``test.sh`` step in the snippets above.

									
										30

ci/openshift-ci/bisect-range.sh
									
										Executable file
									
												
				@@ -0,0 +1,30 @@

				#!/bin/bash

				# Copyright (c) 2024 Red Hat, Inc.

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				if [[ "$#" -gt 2 ]] || [[ "$#" -lt 1 ]] ; then

					echo "Usage: $0 GOOD [BAD]"

					echo "Prints list of available kata-deploy-ci tags between GOOD and BAD commits (by default BAD is the latest available tag)"

					exit 255

				fi

				GOOD="$1"

				[[ -n "$2" ]] && BAD="$2"

				ARCH=amd64

				REPO="quay.io/kata-containers/kata-deploy-ci"

				TAGS=$(skopeo list-tags "docker://${REPO}")

				# For testing

				#echo "$TAGS" > tags

				#TAGS=$(cat tags)

				# Only amd64

				TAGS=$(echo "${TAGS}" | jq '.Tags' | jq "map(select(endswith(\"${ARCH}\")))" | jq -r '.[]')

				# Sort by git history (oldest merge first)

				SORTED=""

				[[ -n "${BAD}" ]] && LOG_ARGS="${GOOD}~1..${BAD}" || LOG_ARGS="${GOOD}~1.."

				for TAG in $(git log --merges --pretty=format:%H --reverse "${LOG_ARGS}"); do

					[[ "${TAGS}" =~ ${TAG} ]] && SORTED+="

				kata-containers-${TAG}-${ARCH}"

				done

				# Comma separated tags with repo

				echo "${SORTED}" | tail -n +2 | sed -e "s@^@${REPO}:@" | paste -s -d, -

									
										61

ci/openshift-ci/cleanup.sh
									
										Executable file
									
												
				@@ -0,0 +1,61 @@

				#!/bin/bash

				#

				# Copyright (c) 2024 Red Hat, Inc.

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				# This script tries to remove most of the resources added by the `test.sh` script

				# from the cluster.

				scripts_dir=$(dirname "$0")

				deployments_dir=${scripts_dir}/cluster/deployments

				# shellcheck disable=SC1091 # import based on variable

				source "${scripts_dir}/lib.sh"

				# Set your katacontainers repo dir location

				[[ -z "${katacontainers_repo_dir}" ]] && echo "Please set katacontainers_repo_dir variable to your kata repo"

				# Set to 'yes' if you want to configure SELinux to permissive on the cluster

				# workers.

				#

				SELINUX_PERMISSIVE=${SELINUX_PERMISSIVE:-no}

				# Enable workaround for OCP 4.13 https://github.com/kata-containers/kata-containers/pull/9206

				#

				WORKAROUND_9206_CRIO=${WORKAROUND_9206_CRIO:-no}

				# Ignore errors as we want best-effort-approach here

				trap - ERR

				# Delete webhook resources

				oc delete -f "${scripts_dir}/../../tools/testing/kata-webhook/deploy"

				oc delete -f "${scripts_dir}/cluster/deployments/configmap_kata-webhook.yaml.in"

				# Delete potential smoke-test resources

				oc delete -f "${scripts_dir}/smoke/service.yaml"

				oc delete -f "${scripts_dir}/smoke/service_kubernetes.yaml"

				oc delete -f "${scripts_dir}/smoke/http-server.yaml"

				# Delete test.sh resources

				oc delete -f "${deployments_dir}/relabel_selinux.yaml"

				if [[ "${WORKAROUND_9206_CRIO}" == "yes" ]]; then

					oc delete -f "${deployments_dir}/workaround-9206-crio-ds.yaml"

					oc delete -f "${deployments_dir}/workaround-9206-crio.yaml"

				fi

				[[ ${SELINUX_PERMISSIVE} == "yes" ]] && oc delete -f "${deployments_dir}/machineconfig_selinux.yaml.in"

				# Delete kata-containers

				pushd "${katacontainers_repo_dir}/tools/packaging/kata-deploy" || { echo "Failed to push to ${katacontainers_repo_dir}/tools/packaging/kata-deploy"; exit 125; }

				oc delete -f kata-deploy/base/kata-deploy.yaml

				oc -n kube-system wait --timeout=10m --for=delete -l name=kata-deploy pod

				oc apply -f kata-cleanup/base/kata-cleanup.yaml

				echo "Wait for all related pods to be gone"

				( repeats=1; for _ in $(seq 1 600); do

				  oc get pods -l name="kubelet-kata-cleanup" --no-headers=true -n kube-system 2>&1 | grep "No resources found" -q && ((repeats++)) || repeats=1

				  [[ "${repeats}" -gt 5 ]] && echo kata-cleanup finished && exit 0

				  sleep 1

				done; exit 1) || { echo "There are still some kata-cleanup related pods after 600 iterations"; oc get all -n kube-system; exit 1; }

				oc delete -f kata-cleanup/base/kata-cleanup.yaml

				oc delete -f kata-rbac/base/kata-rbac.yaml

				oc delete -f runtimeclasses/kata-runtimeClasses.yaml

6

ci/openshift-ci/cluster/configs/selinux.conf Normal file


@@ -0,0 +1,6 @@
 # Copyright (c) 2020 Red Hat, Inc.
 #
 # SPDX-License-Identifier: Apache-2.0
 #
 SELINUX=permissive
 SELINUXTYPE=targeted

									
										34

ci/openshift-ci/cluster/deploy_webhook.sh
									
										Executable file
									
												
				@@ -0,0 +1,34 @@

				#!/bin/bash

				#

				# Copyright (c) 2021 Red Hat, Inc.

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				# This script builds the kata-webhook and deploys it in the test cluster.

				#

				# You should export the KATA_RUNTIME variable with the runtimeclass name

				# configured in your cluster in case it is not the default "kata-ci".

				#

				set -e

				set -o nounset

				set -o pipefail

				script_dir="$(realpath "$(dirname "$0")")"

				webhook_dir="${script_dir}/../../../tools/testing/kata-webhook"

				# shellcheck disable=SC1091 # import based on variable

				source "${script_dir}/../lib.sh"

				KATA_RUNTIME=${KATA_RUNTIME:-kata-ci}

				pushd "${webhook_dir}" >/dev/null

				# Build and deploy the webhook

				#

				info "Builds the kata-webhook"

				./create-certs.sh

				info "Override our KATA_RUNTIME ConfigMap"

				sed -i deploy/webhook.yaml -e "s/runtime_class: .*$/runtime_class: ${KATA_RUNTIME}/g"

				info "Deploys the kata-webhook"

				oc apply -f deploy/

				# Check the webhook was deployed and is working.

				RUNTIME_CLASS="${KATA_RUNTIME}" ./webhook-check.sh

				popd >/dev/null

									
										13

ci/openshift-ci/cluster/deployments/configmap_installer_kernel.yaml
									
										Normal file
									
												
				@@ -0,0 +1,13 @@

				# Copyright (c) 2021 Red Hat, Inc.

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				# Instruct the daemonset installer to configure Kata Containers to use the

				# host kernel.

				#

				apiVersion: v1

				kind: ConfigMap

				metadata:

				  name: ci.kata.installer.kernel

				data:

				  host_kernel: "yes"

Compare commits

10422 Commits 2.3.0-rc1 ... decouple-k

30 .github/actionlint.yaml vendored Normal file
40 .github/cargo-deny-composite-action/cargo-deny-generator.sh vendored Normal file
30 .github/cargo-deny-composite-action/cargo-deny-skeleton.yaml.in vendored Normal file
93 .github/dependabot.yml vendored Normal file
10 .github/workflows/PR-wip-checks.yaml vendored
30 .github/workflows/actionlint.yaml vendored Normal file
55 .github/workflows/add-issues-to-project.yaml vendored
391 .github/workflows/basic-ci-amd64.yaml vendored Normal file
108 .github/workflows/basic-ci-s390x.yaml vendored Normal file
134 .github/workflows/build-checks-preview-riscv64.yaml vendored Normal file
146 .github/workflows/build-checks.yaml vendored Normal file
462 .github/workflows/build-kata-static-tarball-amd64.yaml vendored Normal file
336 .github/workflows/build-kata-static-tarball-arm64.yaml vendored Normal file
271 .github/workflows/build-kata-static-tarball-ppc64le.yaml vendored Normal file
75 .github/workflows/build-kata-static-tarball-riscv64.yaml vendored Normal file
360 .github/workflows/build-kata-static-tarball-s390x.yaml vendored Normal file
75 .github/workflows/build-kubectl-image.yaml vendored Normal file
32 .github/workflows/cargo-deny-runner.yaml vendored Normal file
33 .github/workflows/ci-coco-stability.yaml vendored Normal file
35 .github/workflows/ci-devel.yaml vendored Normal file
34 .github/workflows/ci-nightly-riscv.yaml vendored Normal file
36 .github/workflows/ci-nightly-rust.yaml vendored Normal file
27 .github/workflows/ci-nightly-s390x.yaml vendored Normal file
34 .github/workflows/ci-nightly.yaml vendored Normal file
54 .github/workflows/ci-on-push.yaml vendored Normal file
128 .github/workflows/ci-weekly.yaml vendored Normal file
502 .github/workflows/ci.yaml vendored Normal file
38 .github/workflows/cleanup-resources.yaml vendored Normal file
100 .github/workflows/codeql.yml vendored Normal file
60 .github/workflows/commit-message-check.yaml vendored
43 .github/workflows/darwin-tests.yaml vendored Normal file
34 .github/workflows/docs-url-alive-check.yaml vendored Normal file
32 .github/workflows/docs.yaml vendored Normal file
55 .github/workflows/gatekeeper-skipper.yaml vendored Normal file
55 .github/workflows/gatekeeper.yaml vendored Normal file
53 .github/workflows/govulncheck.yaml vendored Normal file
68 .github/workflows/kata-deploy-push.yaml vendored
51 .github/workflows/kata-deploy-test.yaml vendored
295 .github/workflows/main.yaml vendored
78 .github/workflows/move-issues-to-in-progress.yaml vendored
35 .github/workflows/nydus-snapshotter-version-in-sync.yaml vendored Normal file
43 .github/workflows/osv-scanner.yaml vendored Normal file
207 .github/workflows/payload-after-push.yaml vendored Normal file
115 .github/workflows/publish-kata-deploy-payload.yaml vendored Normal file
82 .github/workflows/release-amd64.yaml vendored Normal file
82 .github/workflows/release-arm64.yaml vendored Normal file
79 .github/workflows/release-ppc64le.yaml vendored Normal file
83 .github/workflows/release-s390x.yaml vendored Normal file
431 .github/workflows/release.yaml vendored
51 .github/workflows/require-pr-porting-labels.yaml vendored
75 .github/workflows/run-cri-containerd-tests.yaml vendored Normal file
159 .github/workflows/run-k8s-tests-on-aks.yaml vendored Normal file
90 .github/workflows/run-k8s-tests-on-arm64.yaml vendored Normal file
130 .github/workflows/run-k8s-tests-on-nvidia-gpu.yaml vendored Normal file
81 .github/workflows/run-k8s-tests-on-ppc64le.yaml vendored Normal file
147 .github/workflows/run-k8s-tests-on-zvsi.yaml vendored Normal file
157 .github/workflows/run-kata-coco-stability-tests.yaml vendored Normal file
363 .github/workflows/run-kata-coco-tests.yaml vendored Normal file
119 .github/workflows/run-kata-deploy-tests-on-aks.yaml vendored Normal file
90 .github/workflows/run-kata-deploy-tests.yaml vendored Normal file
70 .github/workflows/run-kata-monitor-tests.yaml vendored Normal file
128 .github/workflows/run-metrics.yaml vendored Normal file
54 .github/workflows/run-runk-tests.yaml vendored Normal file
60 .github/workflows/scorecard.yaml vendored Normal file
32 .github/workflows/shellcheck.yaml vendored Normal file
35 .github/workflows/shellcheck_required.yaml vendored Normal file
39 .github/workflows/snap-release.yaml vendored
17 .github/workflows/snap.yaml vendored
20 .github/workflows/stale.yaml vendored Normal file
36 .github/workflows/static-checks-self-hosted.yaml vendored Normal file
268 .github/workflows/static-checks.yaml vendored
29 .github/workflows/zizmor.yaml vendored Normal file
3 .github/zizmor.yml vendored Normal file
12 .gitignore vendored
83 CODEOWNERS
2 CONTRIBUTING.md
6287 Cargo.lock generated Normal file
140 Cargo.toml Normal file

93

Glossary.md
