kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2025-06-25 15:02:45 +00:00

Author	SHA1	Message	Date
Hyounggyu Choi	2c2941122c	tests: Fail fast in assert_pod_fail() `assert_pod_fail()` currently calls `k8s_create_pod()` to ensure that a pod does not become ready within the default 120s. However, this delays the test's completion even if an error message is detected earlier in the journal. This commit removes the use of `k8s_create_pod()` and modifies `assert_pod_fail()` to fail as soon as the pod enters a failed state. All failing pods end up in one of the following states: - CrashLoopBackOff - ImagePullBackOff The function now polls the pod's state every 5 seconds to check for these conditions. If the pod enters a failed state, the function immediately returns 0. If the pod does not reach a failed state within 120 seconds, it returns 1. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-24 16:09:20 +02:00
Aurélien Bombo	e738054ddb	Merge pull request #10311 from pawelpros/pproskur/fixyq ci: don't require sudo for yq if already installed	2024-09-23 08:57:11 -07:00
Alex Lyn	6b94cc47a8	Merge pull request #10146 from Apokleos/intro-cdi Introduce cdi in runtime-rs	2024-09-23 21:45:42 +08:00
Alex Lyn	b8ba346e98	runtime-rs: Add test for container devices with CDI. Fixes #10145 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-09-23 17:20:22 +08:00
Steve Horsman	0e0cb24387	Merge pull request #10329 from Bickor/webhook-check tools.kata-webhook: Specify runtime class using configMap	2024-09-23 09:59:12 +01:00
Steve Horsman	6f0b3eb2f9	Merge pull request #10337 from stevenhorsman/update-release-process-post-3.9.0 doc: Update the release process	2024-09-23 09:55:57 +01:00
Hyounggyu Choi	8a893cd4ee	Merge pull request #10232 from BbolroC/fix-loop-device-for-exec_host tests: Fix loop device handling for exec_host()	2024-09-23 08:15:03 +02:00
Fupan Li	f1f5bef9ef	Merge pull request #10339 from lifupan/main_fix runtime-rs: fix the issue of using block_on	2024-09-23 09:28:40 +08:00
Fupan Li	fb27de3561	runtime-rs: fix the issue of using block_on Since block_on blocks the current thread, which prevents other async tasks from running on this worker thread, change it to use an async task instead. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-09-22 14:40:44 +08:00
Aurélien Bombo	79a3b4e2e5	Merge pull request #10335 from kata-containers/sprt/fix-kata-deploy-docs kata-deploy: clean up and fix docs for k0s	2024-09-20 13:33:14 -07:00
stevenhorsman	4f745f77cb	doc: Update the release process - Reflect the need to update the versions in the Helm Chart - Add the lock branch instruction - Add clarity about the permissions needed to complete tasks Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-20 19:04:33 +01:00
Aurélien Bombo	78c63c7951	kata-deploy: clean up and fix docs for k0s * Clarifies instructions for k0s. * Adds kata-deploy step for each cluster type. * Removes the old kata-deploy-stable step for vanilla k8s. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-09-20 11:59:40 -05:00
Hyounggyu Choi	2d6ac3d85d	tests: Re-enable guest-pull-image tests for qemu-coco-dev Now that the issue with handling loop devices has been resolved, this commit re-enables the guest-pull-image tests for `qemu-coco-dev`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Hyounggyu Choi	c6b86e88e4	tests: Increase timeouts for qemu-coco-dev in trusted image storage tests Timeouts occur (e.g. `create_container_timeout` and `wait_time`) when using qemu-coco-dev. This commit increases these timeouts for the trusted image storage test cases Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Hyounggyu Choi	9cff9271bc	tests: Run all commands in _loop_device() using exec_host() If the host running the tests is different from the host where the cluster is running, the _loop_device() functions do not work as expected because the device is created on the test host, while the cluster expects the device to be local. This commit ensures that all commands for the relevant functions are executed via exec_host() so that a device should be handled on a cluster node. Additionally, it modifies exec_host() to return the exit code of the last executed command because the existing logic with `kubectl debug` sometimes includes unexpected characters that are difficult to handle. `kubectl exec` appears to properly return the exit code for a given command to it. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Hyounggyu Choi	374b8d2534	tests: Create and delete node debugger pod only once Creating and deleting a node debugger pod for every `exec_host()` call is inefficient. This commit changes the test suite to create and delete the pod only once, globally. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Hyounggyu Choi	aedf14b244	tests: Mimic node debugger with full privileges This commit addresses an issue with handling loop devices via a node debugger due to restricted privileges. It runs a pod with full privileges, allowing it to mount the host root to `/host`, similar to the node debugger. This change enables us to run tests for trusted image storage using the `qemu-coco-dev` runtime class. Fixes: #10133 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Alex Lyn	63b25e8cb0	runtime-rs: Introduce cdi devices in container creation Fixes #10145 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-09-20 09:28:51 +08:00
Alex Lyn	03735d78ec	runtime-rs: add cdi devices definition and related methods Add cdi devices including ContainerDevice definition and annotation_container_device method to annotate vfio device in OCI Spec annotations which is inserted into Guest with its mapping of vendor-class and guest pci path. Fixes #10145 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-09-20 09:28:51 +08:00
Alex Lyn	020e3da9b9	runtime-rs: extend DeviceVendor with device class We need the vfio device's device, vendor and class properties, but we can only get device and vendor. Extend DeviceVendor with class. Fixes #10145 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-09-20 09:28:51 +08:00
Fabiano Fidêncio	77c844da12	Merge pull request #10239 from fidencio/topic/remove-acrn acrn: Drop support	2024-09-19 23:10:29 +02:00
GabyCT	6eef58dc3e	Merge pull request #10336 from GabyCT/topic/extendtimeout gha: Increase timeout to run k8s tests on TDX	2024-09-19 13:12:55 -06:00
Martin	b9d88f74ed	tools.kata-webhook: Specify runtime class using configMap The kata webhook requires a configmap to define what runtime class it should set for the newly created pods. Additionally, the configmap allows others to modify the default runtime class name we wish to set (in case the handler is kata but the name of the runtimeclass is different). Finally, this PR changes the webhook-check to compare the runtime of the newly created pod against the specific runtime class in the configmap; if said configmap doesn't exist, it will default to "kata". Signed-off-by: Martin <mheberling@microsoft.com>	2024-09-19 11:51:38 -07:00
Fabiano Fidêncio	51dade3382	docs: Fix spell checker tokio is not a valid word, it seems, so let's use `tokio`. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-19 20:25:21 +02:00
Gabriela Cervantes	49b3a0faa3	gha: Increase timeout to run k8s tests on TDX This PR increases the timeout to run k8s tests for Kata CoCo TDX to avoid the random failures of timeout. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-19 17:15:47 +00:00
Fabiano Fidêncio	31438dba79	docs: Fix qemu link Otherwise static checks will fail, as we woke up the dogs with changes on the same file. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-19 16:05:43 +02:00
Fabiano Fidêncio	fefcf7cfa4	acrn: Drop support As we don't have any CI, nor maintainer to keep ACRN code around, we better have it removed than give users the expectation that it should or would work at some point. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-19 16:05:43 +02:00
Fabiano Fidêncio	cdaaf708a1	Merge pull request #10334 from emanuellima1/bump-version release: Bump version to 3.9.0	2024-09-19 15:27:50 +02:00
Emanuel Lima	a6ee15c5c7	release: Bump VERSION to 3.9.0 Starting the v3.9.0 release Signed-off-by: Emanuel Lima <emlima@redhat.com>	2024-09-19 10:14:55 -03:00
Fabiano Fidêncio	e9593b53a4	Merge pull request #10234 from pmores/add-support-for-disabled-guest-selinux runtime-rs: add support for disabled guest selinux	2024-09-19 15:03:24 +02:00
Fabiano Fidêncio	4d11fecc2d	Merge pull request #10274 from ajaypvictor/remote_image-os_types runtime: Enable Image annotation for remote hypervisor	2024-09-19 13:39:20 +02:00
Fabiano Fidêncio	3d5f48e02e	Merge pull request #10283 from alexman-stripe/alexman-stripe/fix-kata-shim-not-reporting-inactive-file-cgroup-v2 shim: Fix memory usage reporting for cgroup v2	2024-09-19 12:50:36 +02:00
Pavel Mores	5e5eb9759f	runtime-rs: handle disabled guest selinux in virtiofsd This is just a port of functionality existing in the golang runtime. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-09-19 12:47:10 +02:00
Pavel Mores	8c92f3bfec	runtime-rs: enable/disable selinux in guest based on disable_guest_selinux This change technically affects the path for enabled guest selinux as well, however since this is not implemented in runtime-rs anyway nothing should break. When guest selinux support is added this change will come handy. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-09-19 12:47:10 +02:00
Pavel Mores	204ee21bc8	runtime-rs: handle disabled guest selinux in OCI spec If guest selinux is off the runtime has to ensure that container OCI spec contains no selinux labels for the container rootfs and process. Failure to do so causes kata agent to try and apply the labels which fails since selinux is not enabled in guest, which in turn causes container launch to fail. This is largely inspired by golang runtime (see `4fd4b02f2e/src/runtime/virtcontainers/kata_agent.go (L1005)`) with a slight deviation in ordering of checks. This change simply checks the disable_guest_selinux config setting and if it's true it clears both rootfs and process label if necessary. Golang runtime, on the other hand, seems to first check if process label is non-empty and only then it checks the config setting, meaning that if process label is empty the rootfs label is not reset even if it's non-empty. Frankly, this looks like a potential bug though probably unlikely to manifest since it can be assumed that the labels are either both empty, or both non-empty. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-09-19 12:47:10 +02:00
Pavel Mores	eb1227f47d	runtime-rs: parse the disable_guest_selinux config key In order to handle the setting we have to first parse it and make its value available to the rest of the program. The yes() function is added to comply with serde which seems to insist on default values being returned from functions. Long term, this is surely not the best place for this function to live, however given that this is currently the first and only place where it's used it seems appropriate to put it near its use. If it ends up being reused elsewhere a better place will surely emerge. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-09-19 12:47:10 +02:00
Steve Horsman	8789551fe6	Merge pull request #10333 from fidencio/topic/ci-bump-ubuntu-20.04-runners-to-22.04 ci: Bump ubuntu 20.04 runners to 22.04	2024-09-19 11:44:33 +01:00
Fabiano Fidêncio	35c7f8d1ba	ci: Bump ubuntu 20.04 runners to 22.04 Azure internal mirrors for Ubuntu 20.04 have gone awry, leading to a situation where dependencies cannot be installed (such as libdevmapper-dev), blocking then our CI. Let's bump the runners to 22.04 regardless, even knowing it'll cause an issue with the runk tests, as the agent check tests are considered more crucial to the project at this point. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-19 12:29:20 +02:00
Fabiano Fidêncio	eccdffebf7	Merge pull request #10243 from katexochen/nydus-overlayfs-path virtcontainers: allow specifying nydus-overlayfs binary by path	2024-09-19 11:35:45 +02:00
Ajay Victor	a19f2eacec	runtime: Enable ImageName annotation for remote hypervisor Enables ImageName to support multiple VM images in remote hypervisor scenario Fixes https://github.com/kata-containers/kata-containers/issues/10240 Signed-off-by: Ajay Victor <ajvictor@in.ibm.com>	2024-09-19 14:48:46 +05:30
Alex Man	27f8f69195	shim: Fix memory usage reporting for cgroup v2 kata-shim was not reporting `inactive_file` in memory stat. This memory is deducted by containerd when calculating the size of container working set, as it can be paged out by the operating system under memory pressure. Without reporting `inactive_file`, containerd will over report container memory usage. [Here](https://github.com/containerd/containerd/blob/v1.7.22/pkg/cri/server/container_stats_list_linux.go#L117) is where containerd deducts `inactive_file` from memory usage. Note that kata-shim correctly reports `total_inactive_file` for cgroup v1, but this was not implemented for cgroup v2. This commit: - Adds code in kata-shim to report "inactive_file" memory for cgroup v2 - Implements reporting of all available cgroup v2 memory stats to containerd - Uses defensive coding to avoid assuming existence of any memory.stat fields The list of available cgroup v2 memory stats defined by containerd can be found [here](https://pkg.go.dev/github.com/containerd/cgroups/v2/stats#MemoryStat). Fixes #10280 Signed-off-by: Alex Man <alexman@stripe.com>	2024-09-18 14:04:24 -07:00
Fabiano Fidêncio	1597f8ba00	Merge pull request #10279 from alexman-stripe/alexman-stripe/fix-cgroup-v2-wrong-cpu-usage-unit agent: Fix CPU usage reporting for cgroup v2 in kata-agent	2024-09-18 21:36:52 +02:00
Fabiano Fidêncio	593cbb8710	Merge pull request #10306 from microsoft/danmihai1/more-security-contexts genpolicy: get UID from PodSecurityContext	2024-09-18 21:33:39 +02:00
Aurélien Bombo	5402f2c637	Merge pull request #10308 from Sumynwa/sumsharma/add_setpolicy_agent_ctl agent-ctl: Add SetPolicy support	2024-09-18 10:09:07 -07:00
Pawel Proskurnicki	b63d49b34a	ci: don't require sudo for yq if already installed Installing yq shouldn't force the use of sudo when the correct version of yq is already installed. Signed-off-by: Pawel Proskurnicki <pawel.proskurnicki@intel.com>	2024-09-18 11:01:07 +02:00
Sumedh Alok Sharma	18c887f055	agent-ctl: Add SetPolicy support This patch adds support for calling the kata agent's SetPolicy API. It also adds tests for the SetPolicy API using agent-ctl. Fixes #9711 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-18 10:53:49 +05:30
GabyCT	28d430ec42	Merge pull request #10324 from GabyCT/topic/fixinlib ci: Fix indentation of install libseccomp script	2024-09-17 14:21:24 -06:00
Fabiano Fidêncio	da2377346d	Merge pull request #10323 from stevenhorsman/update-kubectl-release-url kata-deploy: Switch Kubernetes URL	2024-09-17 20:47:17 +02:00
Gabriela Cervantes	096f32cc52	ci: Fix indentation of install libseccomp script This PR fixes the indentation of the install libseccomp script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-17 16:38:53 +00:00
Aurélien Bombo	9d29ce460d	Merge pull request #10303 from Sumynwa/sumsharma/agent_policy_set_env agent: add support to provide default agent policy via env	2024-09-17 09:04:11 -07:00
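
The fail-fast behaviour described in commit 2c2941122c (poll every 5 seconds, return 0 on a failed state, return 1 after 120 seconds) can be sketched roughly as follows. This is a Python illustration only, not the repository's implementation, which lives in the shell test helpers. `get_pod_state` and the injectable `sleep` parameter are hypothetical names introduced here so the logic can be exercised without a cluster.

```python
import time
from typing import Callable

# The two failed states named in the commit message.
FAILED_STATES = ("CrashLoopBackOff", "ImagePullBackOff")


def assert_pod_fail(
    get_pod_state: Callable[[], str],  # hypothetical: returns the pod's current state
    timeout_s: int = 120,
    interval_s: int = 5,
    sleep: Callable[[float], None] = time.sleep,  # injectable for testing (assumption)
) -> int:
    """Return 0 as soon as the pod reports a failed state, 1 on timeout."""
    waited = 0
    while waited < timeout_s:
        if get_pod_state() in FAILED_STATES:
            return 0  # fail fast: no need to wait out the full timeout
        sleep(interval_s)
        waited += interval_s
    return 1  # the pod never reached a failed state within the timeout
```

In a real test, `get_pod_state` would presumably wrap a `kubectl` query; that part is left out because the exact command used by the test suite is not shown in the log.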


14618 Commits
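
Commit 27f8f69195 explains that containerd deducts `inactive_file` from memory usage when computing a container's working set, and that the shim should not assume any `memory.stat` field exists. A minimal Python sketch of that idea is shown below. The function names are hypothetical, the shim itself is not written in Python, and clamping the result at zero is an assumption made here rather than something stated in the log.

```python
from typing import Dict


def parse_memory_stat(text: str) -> Dict[str, int]:
    """Parse cgroup v2 memory.stat content ("key value" per line).

    Malformed lines are skipped, so callers never depend on a given
    field being present.
    """
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # ignore blank or unexpected lines
        key, value = parts
        try:
            stats[key] = int(value)
        except ValueError:
            continue  # ignore non-numeric values
    return stats


def working_set_bytes(usage: int, stats: Dict[str, int]) -> int:
    """Usage minus inactive_file, treating a missing field as 0.

    Assumption: the result is clamped at 0 when inactive_file >= usage.
    """
    inactive = stats.get("inactive_file", 0)
    return usage - inactive if inactive < usage else 0
```

If `inactive_file` is not reported at all, the sketch returns the full usage figure, which matches the over-reporting the commit message describes.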