Commit Graph

16599 Commits

Author SHA1 Message Date
Wainer dos Santos Moschetta
806d63d1d8 tests/k8s: call teardown_common in k8s-credentials-secrets.bats
The teardown_common will print the description of the running pods, kill
them all and print the system's syslogs afterwards.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-07-12 10:13:51 +01:00
Wainer dos Santos Moschetta
c8f40fe12c tests/k8s: call teardown_common in k8s-sandbox-vcpus-allocation.bats
The teardown_common will print the description of the running pods, kill
them all and print the system's syslogs afterwards.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-07-12 10:13:51 +01:00
Fabiano Fidêncio
4a79c2520d
Merge pull request #11491 from Apokleos/default-blk-driver
runtime-rs: Change default block device driver from virtio-scsi to virtio-blk-*
2025-07-11 23:14:13 +02:00
alex.lyn
9cc14e4908 runtime-rs: Update block device driver docs within configuration
The previous description for the `block_device_driver` was inaccurate or
outdated. This commit updates the documentation to provide a more
precise explanation of its function.

Fixes #11488

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-11 17:40:58 +02:00
alex.lyn
92160c82ff runtime-rs: Change block device driver defualt with virtio-blk-*
When we run a kata pod with runtime-rs/qemu and with a default
configuration toml, it will fail with error "unsupported driver type
virtio-scsi".
As virtio-scsi within runtime-rs is not so popular, we set default block
device driver with `virtio-blk-*`.

Fixes #11488

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-11 17:40:58 +02:00
Ankita Pareek
5f08cc75b3 agent: update the processes hashmap to use exec_id as primary key
This patch changes the container process HashMap to use exec_id as the primary
key instead of PID, preventing exec_id collisions that could be exploited in
Confidential Computing scenarios where the host is less trusted than the guest.

Key changes:
- Changed `processes: HashMap<pid_t, Process>` to `HashMap<String, Process>`
- Added exec_id collision detection in `start()` method
- Updated process lookup operations to use exec_id directly
- Simplified `get_process()` with direct HashMap access

This prevents multiple exec operations from reusing the same exec_id, which
could be problematic in CoCo use cases where process isolation and unique
identification are critical for security.

Signed-off-by: Ankita Pareek <ankitapareek@microsoft.com>
2025-07-11 10:10:23 +00:00
Steve Horsman
878e50f978
Merge pull request #11554 from fidencio/topic/fix-version-file-on-release
gh: Fix released VERSION file
2025-07-11 09:20:06 +01:00
Fabiano Fidêncio
fb22e873cd gh: Fix released VERSION file
The `/opt/kata/VERSION` file, which is created using `git describe
--tags`, requires the newly released tag to be updated in order to be
accurate.

To do so, let's add a `fetch-tags: true` to the checkout action used
during the `create-kata-tarball` job.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-11 09:47:11 +02:00
Alex Lyn
87e41e2a09
Merge pull request #11549 from stevenhorsman/bump-remove_dir_all
runtime-rs: Switch tempdir to tempfile
2025-07-11 13:46:12 +08:00
Alex Lyn
f22272b8f7
Merge pull request #11540 from Apokleos/coldplug-vfio-clh
runtime-rs: Add vfio support with coldplug for cloud-hypervisor
2025-07-11 10:33:59 +08:00
RuoqingHe
7cd4e3278a
Merge pull request #11545 from RuoqingHe/remove-lockfile-for-libs
libs: Remove lockfile for libs
2025-07-10 21:56:10 +08:00
stevenhorsman
c740896b1c trace-forwarder: Bump chrono crate version
Bump chrono version to drop time@0.1.43 and remediate
vulnerability CVE-2020-26235

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-10 14:55:20 +01:00
stevenhorsman
3916507553 runtime-rs: Bump chrono crate version
Bump chrono version to drop time@0.1.45 and remediate
vulnerability CVE-2020-26235

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-10 13:47:05 +01:00
Wainer dos Santos Moschetta
3ab6a8462d ci/gatekeeper: make run-k8s-tests-coco-nontee job required
The CoCo non-TEE job (run-k8s-tests-coco-nontee) used to be required but
we had to withdraw it to fix a problem (#11156). Now the job is back
running and stable, so time to make it required again.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-07-10 12:19:19 +01:00
stevenhorsman
c5ceae887b runtime-rs: Switch tempdir to tempfile
tempdir hasn't been updated for seven years and pulls in
remove_dir_all@0.5.3 which has security advisory
GHSA-mc8h-8q98-g5hr, so replace this with using tempfile,
which the crate got merged into and we use elsewhere in the
project

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-10 12:16:35 +01:00
Ruoqing He
4039506740 libs: Ignore Cargo.lock in libs workspace
Ignore Cargo.lock in `libs` to prevent developers from accidentally
track lock files in `libs` workspace.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-07-10 09:31:45 +00:00
alex.lyn
3fbe493edc runtime-rs: Convert host devices within VmConfig for cloud-hypervisor
This PR adds support for adding a network device before starting the
cloud-hypervisor VM.
This commit will get the host devices from NamedHypervisorConfig and
assign it to VmConfig's devices which is for vfio devices when clh
starts launching.
And with this, it successfully finish the vfio devices conversion from
a generic Hypervisor config to a clh specific VmConfig.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-07-10 16:33:43 +08:00
alex.lyn
0b5b8f549d runtime-rs: Introduce a field host_devices within NamedHypervisorConfig
This commit introduce `host_devices` to help convert vfio devices from
a generic hypervisor config to a cloud-hypervisor specific VmConfig.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-07-10 16:33:41 +08:00
alex.lyn
d37183d754 runtime-rs: Add vfio support with coldplug for cloud-hypervisor
This PR adds support for adding a vfio device before starting the
cloud-hypervisor VM (or cold-plug vfio device).

This commit changes "pending_devices" for clh implementation via adding
DeviceType::Vfio() into pending_devices. And it will get shared host devices
after correctly handling vfio devices (Specially for primary device).

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-07-10 16:32:21 +08:00
Ruoqing He
ffa3a5a15e libs: Remove Cargo.lock
crates in `libs` workspace do not ship binaries, they are just libraries
for other workspace to reference, the `Cargo.lock` file hence would not
take effect. Removing Cargo.lock for `libs` workspace.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-07-10 03:14:55 +00:00
Fabiano Fidêncio
c68eb58f3f
Merge pull request #11529 from fidencio/topic/only-use-fixed-version-of-k0s-for-crio
tests: k0s: Always use latest version, apart from CRI-O tests
2025-07-09 18:47:18 +02:00
Hyounggyu Choi
09297b7955
Merge pull request #11537 from BbolroC/set-sharedfs-to-none-for-ibm-sel
runtime/runtime-rs: Set shared_fs to none for IBM SEL in config file
2025-07-09 18:30:08 +02:00
Hyounggyu Choi
bca31d5a4d runtime/runtime-rs: Set shared_fs to none for IBM SEL in config file
In line with configuration for other TEEs, shared_fs should
be set to none for IBM SEL. This commit updates the value for
runtime/runtime-rs.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-07-09 14:22:28 +02:00
Fabiano Fidêncio
5f17e61d11 tests: kata-deploy: Remove --wait from helm uninstall
As we're using a `kubectl wait --timeout ...` to check whether the
kata-deploy pod's been deleted or not, let's remove the `--wait` from
the `helm uninstall ...` call as k0s tests were failing because the
`kubectl wait --timeout...` was starting after the pod was deleted,
making the test fail.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-09 14:01:30 +02:00
Fabiano Fidêncio
842e17b756 tests: k0s: Always use latest version, apart from CRI-O tests
We've been pinning a specific version of k0s for CRI-O tests, which may
make sense for CRI-O, but doesn't make sense at all when it comes to
testing that we can install kata-deploy on latest k0s (and currently our
test for that is broken).

Let's bump to the latest, and from this point we start debugging,
instead of debugging on an ancient version of the project.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-09 13:27:18 +02:00
Steve Horsman
7bc25b0259
Merge pull request #11494 from katexochen/p/opa-1.6
versions: bump opa 1.5.1 -> 1.6.0
2025-07-09 11:45:54 +01:00
Steve Horsman
967f66f677
Merge pull request #11380 from arvindskumar99/sev-deprecation
Sev deprecation
2025-07-09 11:38:13 +01:00
stevenhorsman
f96b8fb690 kata-ctl: Update expected test failure message
Update expected error after url crate bump

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-09 11:34:27 +01:00
stevenhorsman
b7bf46fdfa versions: Bump idna crate to >= 1.0.4
Bump url, reqwests and idna crates in order to move away from
idna <1.0.3 and remediate CVE-2024-12224.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-09 11:34:27 +01:00
Xuewei Niu
b8838140d0
Merge pull request #11527 from StevenFryto/fix-runtime-rootless-bugs
runtime: Fix rootlessDir not correctly set in rootless VMM mode
2025-07-09 16:40:11 +08:00
Steve Horsman
990c4e68ee
Merge pull request #11523 from wainersm/ci_setup_kubectl
workflows: adopting azure/setup-kubectl
2025-07-09 09:09:38 +01:00
stevenfryto
3c7a670129 runtime: Fix rootlessDir not correctly set in rootless VMM mode
Previously, the rootlessDir variable in `src/runtime/virtcontainers/pkg/rootless.go` was initialized at
package load time using `os.Getenv("XDG_RUNTIME_DIR")`. However, in rootless
VMM mode, the correct value of $XDG_RUNTIME_DIR is set later during runtime
using os.Setenv(), so rootlessDir remained empty.

This patch defers the initialization of rootlessDir until the first call
to `GetRootlessDir()`, ensuring it always reflects the current environment
value of $XDG_RUNTIME_DIR.

Fixes: #11526

Signed-off-by: stevenfryto <sunzitai_1832@bupt.edu.cn>
2025-07-09 09:51:48 +08:00
Wainer dos Santos Moschetta
e4da3b84a3 workflows: adopting azure/setup-kubectl
There are workflows that rely on `az aks install-cli` to get kubectl
installed. There is a well-known problem on install-cli, related with
API usage rate limit, that has recently caused the command to fail
quite often.

This is replacing install-cli with the azure/setup-kubectl github
action which has no such as rate limit problem.

While here, removed the install_cli() function from gha-run-k8s-common.sh
so avoid developers using it by mistake in the future.

Fixes #11463
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-07-08 15:15:54 -03:00
Alex Lyn
294b2c1c10
Merge pull request #11528 from Apokleos/remote-initdata
runtime-rs: add initdata annotation for remote hypervisor
2025-07-08 09:13:13 +08:00
Arvind Kumar
afedad0965 kernel: Removing SEV kernel packages
Removing kernel config files realting
to SEV as part of the SEV deprecation
efforts.

Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
Signed-off-by: Arvind Kumar <arvinkum@amd.com>
2025-07-07 11:21:11 -05:00
Arvind Kumar
ecac3d2d28 runtime: Removing runtime logic for SEV
Removing runtime SEV functionality,
such as the kbs, ovmf, VMSA handling,
and SEV configs as part of deprecating
SEV from kata.

Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
Signed-off-by: Arvind Kumar <arvinkum@amd.com>
2025-07-07 11:17:32 -05:00
Arvind Kumar
8eebcef8fb tests: Removing testing framework for SEV
Removing files pertaining to SEV from
the CI framework.

Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
Signed-off-by: Arvind Kumar <arvinkum@amd.com>
2025-07-07 11:17:32 -05:00
Arvind Kumar
675ea86aba kata-deploy: Removing SEV from kata-deploy
Removing files related to SEV, responsible for
installing and configuring Kata containers.

Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
Signed-off-by: Arvind Kumar <arvinkum@amd.com>
2025-07-07 11:17:32 -05:00
Paul Meyer
ff7ac58579 versions: bump opa 1.5.1 -> 1.6.0
Bumping opa to latest release.

Signed-off-by: Paul Meyer <katexochen0@gmail.com>
2025-07-07 14:19:08 +02:00
alex.lyn
fcaade24f4 runtime-rs: add initdata annotation for remote hypervisor
Add init data annotation within preparing remote hypervisor annotations
when prepare vm, so that it can be passed within CreateVMRequest.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-07 12:46:05 +01:00
Fabiano Fidêncio
110f68a0f1
Merge pull request #11530 from fidencio/topic/tests-fix-runtime-class-check
tests: runtimeclasses: Adjust gpu runtimeclasses
2025-07-07 13:42:46 +02:00
Fabiano Fidêncio
2c2995b7b0 tests: runtimeclasses: Adjust gpu runtimeclasses
679cc9d47c was merged and bumped the
podoverhead for the gpu related runtimeclasses. However, the bump on the
`kata-runtimeClasses.yaml` as overlooked, making our tests fail due to
that discrepancy.

Let's just adjust the values here and move on.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-07 11:43:40 +02:00
Fabiano Fidêncio
ef545eed86
Merge pull request #11513 from lifupan/dragonball_6.12.x
tools: port the dragonball kernel patch to 6.12.x
2025-07-07 10:31:49 +02:00
Steve Horsman
d291e9bda0
Merge pull request #11336 from zvonkok/fix-podoverhead
gpu: Update runtimeClasses for correct podoverhead
2025-07-07 09:20:07 +01:00
Fabiano Fidêncio
a2faf93211 kernel: Bump to v6.12.36
As that's the latest releasesd LTS.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-06 23:48:20 +02:00
Fupan Li
fd21c9df59 tools: port the dragonball kernel patch to 6.12.x
Backport the dragonball's kernel patches to
6.12.x kernel version.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-07-06 23:48:20 +02:00
Zvonko Kaiser
679cc9d47c gpu: Update runtimeClasses for correct podoverhead
We cannot only rely only on default_cpu and default_memory in the
config, default is 1 and 2Gi but we need some overhead for QEMU and
the other related binaries running as the pod overhead. Especially
when QEMU is hot-plugging GPUs, CPUs, and memory it can consume more
memory.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-04 12:20:15 -04:00
Steve Horsman
1c718dbcdd
Merge pull request #11506 from stevenhorsman/remove-atty-dependency
Remove atty dependency
2025-07-04 10:46:28 +01:00
Alex Lyn
362ea54763
Merge pull request #11517 from zvonkok/fix-nvrc-build
gpu: NVRC static build
2025-07-04 13:51:03 +08:00
Alex Lyn
2e35a8067d
Merge pull request #11482 from Apokleos/fix-force-guestpull
runtime-rs:  refactor and fix the implementation of guest-pull
2025-07-04 11:29:33 +08:00