kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2025-04-28 03:42:09 +00:00

Author	SHA1	Message	Date
stevenhorsman	23434791f2	workflows: Refactor publish workflows Replace the four different publish workflows with a single one that takes input parameters of the arch and runner, to reduce the amount of duplicated code and try to avoid the ``` too many workflows are referenced, total: 21, limit: 20 ``` error	2025-02-25 10:49:09 +00:00
Fabiano Fidêncio	7bd444fa52	ci: Run k8s tests on arm64 Let's take advantage of the current arm64 runners, and make sure we have those tests running there as well. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-02-24 18:43:20 +01:00
Aurélien Bombo	adca339c3c	ci: Fix GH throttling in run-nerdctl-tests Specify a GH API token to avoid the below throttling error: https://github.com/kata-containers/kata-containers/actions/runs/13450787436/job/37585810679?pr=10911#step:4:96 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-02-21 17:52:17 -06:00
Hyounggyu Choi	d973d41efb	GHA: Turn off MEASURED_ROOTFS in build-kata-static-tarball-s390x This is the first attempt to remove the following code: ``` if [ "${ARCH}" == "s390x" ]; then export MEASURED_ROOTFS=no fi ``` from install_shimv2() in kata-deploy-binaries.sh. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-02-19 18:19:19 +01:00
Zvonko Kaiser	ca4d227562	gpu: Add qemu-tdx-experimental build We need to introduce again the qemu-tdx build for the GPU Depends-on: #10867 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-19 14:48:56 +00:00
Zvonko Kaiser	1d9915147d	release: Remove artifacts for release We need to make sure the release does not have any residual binaries left for the release payload Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-17 20:16:48 +00:00
Adithya Krishnan Kannan	6cc5b79507	CI: Deprecate SEV Phase 1 of Issue #10840 AMD has deprecated SEV support on Kata Containers, and going forward, SNP will be the only AMD feature supported. As a first step in this deprecation process, we are removing the SEV CI workflow from the test suite to unblock the CI. Will be adding future commits to remove redundant SEV code paths. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>	2025-02-13 12:20:21 -06:00
Zvonko Kaiser	fbc8454d3d	Merge pull request #10866 from zvonkok/enable-cc-gpu-build gpu: enable confidential initrd build	2025-02-12 09:26:08 -05:00
Zvonko Kaiser	5431841a80	Merge pull request #10814 from kata-containers/shellcheck-gha gha: Add shellcheck	2025-02-11 18:30:41 -05:00
Zvonko Kaiser	b231a795d7	gha: Add shellcheck We need to start to fix our scripts. Let's run shellcheck and see what needs to be reworked. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-11 16:00:34 +00:00
Zvonko Kaiser	befb2a7c33	gpu: Confidential Initrd Start building the confidential initrd Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-11 15:41:36 +00:00
Fupan Li	a3fd3d90bc	ci: Add the sandbox api testcases A test case is added based on the integrated cri-containerd case. The difference between the cri-containerd integrated testcase and the sandbox api testcase is the "sandboxer" setting in the sandbox runtime handler. If the "sandboxer" is set to "" or "podsandbox", then containerd will use the legacy shimv2 api, and if the "sandboxer" is set to "shim", then it will use the sandbox api to launch the pod. In addition, add a containerd v2.0.0 version, because containerd officially supports the sandbox api from version 2.0.0. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fabiano Fidêncio	c9f5966f56	Merge pull request #10860 from kata-containers/topic/debug-ci workflows: build: Do not store unnecessary content on the tarball	2025-02-10 20:01:37 +01:00
Fabiano Fidêncio	ec290853e9	workflows: build: Do not store unnecessary content on the tarball Otherwise we may end up simply unpacking kata-containers specific binaries into the same location that system ones are needed, leading to a broken system (most likely what happened with the metrics CI, and also what's happening with the GHA runners). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-10 18:57:29 +01:00
Fabiano Fidêncio	23cb5bb6c2	ci: Only use the Ubuntu TDX machine in the CI We've been hitting issues with the CentOS 9 Stream machine, which Intel doesn't have cycles to debug. After raising this up in the Confidential Containers community meeting we got the green light from Red Hat (Ariel Adam) to just disable the CI based on CentOS 9 Stream for now. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-10 12:50:16 +01:00
Zvonko Kaiser	45bd451fa0	ci: add arm64 attestation Do the very same thing that we do on amd64 and add attestation Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-05 16:30:20 +00:00
Zvonko Kaiser	9a7dff9c40	gpu: Add arm64 targets We want to make sure we deliver arm64 GPU targets as well Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-05 16:30:20 +00:00
Zvonko Kaiser	968318180d	ci: Add extratarballs steps We introduced extratarballs with a make target. The CI currently only uploads tarballs that are listed in the matrix. The NV kernel builds a headers package which needs to be uploaded as well. The get-artifacts has a glob to download all artifacts hence we should be good. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-05 16:30:20 +00:00
Zvonko Kaiser	b04bdf54a5	gpu: Add rootfs target amd64/arm64 Adding the initrd build first to get the rootfs on amd64. With that we can start to add tests. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-05 16:30:20 +00:00
Steve Horsman	9060904c4f	Merge pull request #10826 from kata-containers/topic/crio-test-timeouts workflows: Add delete kata-deploy timeouts for crio tests	2025-02-04 13:09:49 +00:00
stevenhorsman	d9eb1b0e06	versions: Bump golang version Bump golang versions so we are more up-to-date and have the extra security fixes Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-03 15:28:53 +00:00
stevenhorsman	5203158195	workflows: Add delete kata-deploy timeouts for crio tests I've also seen cases (the qemu, crio, k0s tests) where Delete kata-deploy is still running for this test after 2 hours, and had to be manually cancelled, so let's try adding a 5m timeout to the kata-deploy delete to stop CI jobs hanging. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-03 11:45:43 +00:00
stevenhorsman	d625f20d18	workflows: Move arm static checks runner Now we have the build-assets running on the gh-hosted runners, try the same approach for the static-checks Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-23 14:23:09 +00:00
stevenhorsman	ab27e11d31	workflows: Switch to github-hosted arm runner Now that GitHub has hosted arm runners https://github.blog/changelog/2025-01-16-linux-arm64-hosted-runners-now-available-for-free-in-public-repositories-public-preview/ we should try and switch our arm64 builder jobs to run on these. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-22 16:27:17 +00:00
Ruoqing He	373a388844	ci: Retry on failure of Create AKS cluster The `Create AKS cluster` step in `run-k8s-tests-on-aks.yaml` is likely to fail since we are trying to issue `PUT` to `aks` at a relatively high frequency, while the `aks` end has its limit on `bucket-size` and `refill-rate`, documented here [1]. Use `nick-fields/retry@v3` to retry in 10 seconds after a request fails, based on observations that AKS requested 7- or 8-second delays before retry as part of its 429 response. [1] https://learn.microsoft.com/en-us/azure/aks/quotas-skus-regions#throttling-limits-on-aks-resource-provider-apis Fixes: #10772 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-22 13:24:51 +00:00
Aurélien Bombo	0d70dc31c1	ci: Unify on $GH_PR_NUMBER environment variable While working on #10559, I realized that some parts of the codebase use $GH_PR_NUMBER, while other parts use $PR_NUMBER. Notably, in that PR, since I used $GH_PR_NUMBER for CoCo non-TEE tests without realizing that TEE tests use $PR_NUMBER, the tests on that PR fail on TEEs: https://github.com/kata-containers/kata-containers/actions/runs/12818127344/job/35744760351?pr=10559#step:10:45 ... 44 error: error parsing STDIN: error converting YAML to JSON: yaml: line 90: mapping values are not allowed in this context ... 135 image: ghcr.io/kata-containers/csi-kata-directvolume: ... So let's unify on $GH_PR_NUMBER so that this issue doesn't repro in the future: I replaced all instances of PR_NUMBER with GH_PR_NUMBER. Note that since some test scripts also refer to that variable, the CI for this PR will fail (would have also happened with the converse substitution), hence I'm not adding the ok-to-test label and we should force-merge this after review. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-01-17 10:53:08 -06:00
stevenhorsman	9b6fce9e96	workflows: Add more ppc64le timeouts Unsurprisingly, now that we've got past the containerd test hangs on ppc64le, we are hitting others in the "Prepare the self-hosted runner" stage, so add timeouts to all of them to avoid CI blockages. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-20 17:31:24 +00:00
stevenhorsman	d9d8d53bea	workflows: Add timeout to some ppc64le steps In some runs e.g. https://github.com/kata-containers/kata-containers/actions/runs/12426384186/job/34697095588 and https://github.com/kata-containers/kata-containers/actions/runs/12422958889/job/34697016842 we've seen the Prepare the self-hosted runner and Install dependencies steps get stuck for 5+ hours. If they are working then it should take a few minutes, so let's add timeouts and not hold up the whole CI if they are stuck. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-20 16:37:36 +00:00
stevenhorsman	cf8b82794a	workflows: Only remove artifacts in release builds The agent-api tests require the agent to be deployed in the CI by the tarball, so in the short term let's only do this on the release stage, so that both kata-manager works with the release and the agent-api tests work with the other CI builds. In the longer term we need to re-evaluate what is in our tarballs (issue #10619), but want to unblock the tests in the short term. Fixes: #10630 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-12 17:38:27 +00:00
stevenhorsman	e1f6aca9de	workflows: Remove potential timing issues with artifacts With the code I originally did I think there is potentially a case where we can get a failure due to timing of steps. Before this change the `build-asset-shim-v2` job could start the `get-artifacts` step and concurrently `remove-rootfs-binary-artifacts` could run and delete the artifact during the download and result in the error. In this commit, I try to resolve this by making sure that the shim build waits for the artifact deletes to complete before starting. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-12 16:52:54 +00:00
stevenhorsman	b4b3471bcb	workflows: linting: Fix shellcheck SC1001 > This \/ will be a regular '/' in this context Remove ignored escape Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	491210ed22	workflows: linting: Fix shellcheck SC2006 > Use $(...) notation instead of legacy backticks `...` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	5d7c5bdfa4	workflows: linting: Fix shellcheck SC2015 > A && B || C is not if-then-else. C may run when A is true Refactor the echo so that we can't get into a situation where the retry of workspace delete happens if the original one was successful Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	c2ba15c111	workflows: linting: Fix shellcheck SC2206 > Quote to prevent word splitting/globbing Double quote variables expanded in an array Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	007514154c	workflows: linting: Fix shellcheck SC2068 > Double quote array expansions to avoid re-splitting elements Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	4ef05c6176	workflows: linting: Fix shellcheck SC2116 > Useless echo? Instead of 'cmd $(echo foo)', just use 'cmd foo' Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	f02d540799	workflows: Bump outdated action versions Bump some actions that are significantly out-of-date and out of sync with the versions used in other workflows Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	935327b5aa	workflows: linting: Fix shellcheck SC2046 > Quote this to prevent word splitting. Quote around subshell Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	d4bd314d52	workflows: linting: Fix incorrect properties These properties are currently invalid, so either fix, or remove them Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	9113606d45	workflows: linting: Fix shellcheck SC2086 > Double quote to prevent globbing and word splitting. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	42cd2ce6e4	workflows: Add actionlint workflows On PRs that update anything in the workflows directory, add an actionlint run to validate our workflow files for errors and hopefully catch issues earlier. Fixes: #9646 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 11:36:08 +00:00
Fabiano Fidêncio	300a827d03	release: helm: Add the chart as part of the release So users can simply download the chart and use it accordingly without the need to download the full repo. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-12-06 11:19:34 +01:00
stevenhorsman	14a3adf4d6	workflows: Fix remove artifact name filter - Fix copy-paste errors in artifact filters for arm64 and ppc64le - Remove the trailing wildcard filter that falsely ends up removing agent-ctl and replace with the tarball-suffix, which should exactly match the artifacts Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-04 13:34:42 +00:00
Aurélien Bombo	4aa7d4e358	ci: Require CSI driver for CoCo tests With the building/publishing step for the CSI driver validated, we can set that as a requirement for the CoCo tests. Depends on: #10561 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-12-03 14:43:36 -06:00
Aurélien Bombo	a23ceac913	ci: Fix Docker publishing for CSI driver, 2nd try Follow-up to #10609 as it seems GHA doesn't allow hard links: https://github.com/kata-containers/kata-containers/actions/runs/12144941404/job/33868901896?pr=10563#step:6:8 Note that I also updated the `needs` directive as we don't need the Kata payload container, just the tarball artifact. Part of: #10560 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-12-03 13:04:46 -06:00
Aurélien Bombo	85d3bcd713	ci: Fix Docker publishing for CSI driver The compilation succeeds, however Docker can't find the binary because we specify an absolute path. In Docker world, an absolute path is absolute to the Docker build context (here: src/tools/csi-kata-directvolume). To fix this, we link the binary into the build context, where the Dockerfile expects it. Failure mode: https://github.com/kata-containers/kata-containers/actions/runs/12068202642/job/33693101962?pr=10563#step:8:213 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-12-02 15:50:01 -06:00
Fabiano Fidêncio	92b8091f62	Revert "ci: unbreak: Reallow no-op builds" This reverts commit `559018554b`. As we've noticed that this is causing issues with initrd builds in the CI. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-28 12:02:40 +01:00
Aurélien Bombo	559018554b	ci: unbreak: Reallow no-op builds #9838 previously modified the static build so as not to repeatedly copy the same assets on each matrix iteration: https://github.com/kata-containers/kata-containers/pull/9838#issuecomment-2169299202 However, that implementation breaks specifying no-op/WIP build targets such as done in `e43c59a`. Such no-op builds have been a historical requirement of the project because of a GHA limitation. The breakage is due to no-op builds not generating a tar file corresponding to the asset: https://github.com/kata-containers/kata-containers/actions/runs/12059743390/job/33628926474?pr=10592 To address this breakage, we revert to the `cp -r` implementation and add the `--no-clobber` flag to still preserve the current behavior. Note that `-r` will also create the destination directory if it doesn't exist. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-27 18:40:29 -06:00
Aurélien Bombo	7f659f3d63	gha: Unbreak CI and work around workflow limit #10561 inadvertently broke the CI by going over the limit of 20 reusable workflows: https://github.com/kata-containers/kata-containers/actions/runs/12054648658/workflow This commit fixes that by inlining the job. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-27 12:23:15 -06:00
Aurélien Bombo	16a91fccbe	Merge pull request #10561 from sprt/csi-driver-ci coco: ci: Lay groundwork for compiling and publishing CSI driver image [1/x]	2024-11-27 10:26:45 -06:00


785 Commits