kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2025-06-22 13:38:26 +00:00

Author	SHA1	Message	Date
Dan Mihai	fdf3088be0	Merge pull request #10842 from microsoft/danmihai1/disable-job-policy-test tests: disable k8s-policy-job.bats on coco-dev	2025-02-06 09:09:49 -08:00
Hyounggyu Choi	1bdb34e880	tests: Skip trusted storage tests for IBM SE Let's skip all tests for trusted storage until #10838 is resolved. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-02-06 12:09:14 +01:00
Dan Mihai	47ce5dad9d	tests: disable k8s-policy-job.bats on coco-dev k8s-policy-job is modeled after the older k8s-job, and it appears that both of them fail occasionally on coco-dev. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-02-05 23:06:16 +00:00
Arvind Kumar	47534c1c3e	nydus: Skipping SNP and SEV from deploying and deleting Snapshotter Preparing to install nydus permanently on the AMD node, so disabling deploy and delete command for SNP and SEV. Signed-off-by: Arvind Kumar <arvinkum@amd.com>	2025-02-05 12:26:53 -06:00
Greg Kurz	0215d958da	Merge pull request #10805 from balintTobik/egrep_removal egrep/fgrep removal	2025-01-30 18:26:59 +01:00
Balint Tobik	1943a1c96d	tests: replace egrep with grep -E to avoid deprecation warning https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Balint Tobik <btobik@redhat.com>	2025-01-29 11:26:27 +01:00
Hyounggyu Choi	dde627cef4	test: Run full set of zcrypttest for VFIO-AP coldplug Previously, the test for VFIO-AP coldplug only checked whether a passthrough device was attached to the VM guest. This commit expands the test to include a full set of zcrypttest to verify that the device functions properly within a container. Additionally, since containerd has been upgraded to v1.7.25 on the test machine, it is no longer necessary to run the test via crictl. The commit removes all related codes/files. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-28 10:53:00 +01:00
Zvonko Kaiser	b4c710576e	Merge pull request #10782 from stevenhorsman/clh-metrics-write-update metrics: Increase minval range for blogbench test	2025-01-24 10:21:20 -05:00
Fabiano Fidêncio	b47cc6fffe	cri-containerd: Skip TestDeviceCgroup till it's adapted to cgroupsv2 As the devices controller works in a different way in cgroupsv2, the "/sys/fs/cgroup/devices/devices.list" file simply doesn't exist. For now, let's skip the test till the test maintainer decides to re-enable it for cgroupsv2. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 17:25:56 +01:00
Fabiano Fidêncio	0626d7182a	tests: k8s-cpu-ns: Adapt to cgroupsv2 The changes done are: * cpu/cpu.shares was replaced by cpu.weight * The weight, according to our reference[0], is calculated by: weight = (1 + ((request - 2) * 9999) / 262142) * cpu/cpu.cfs_quota_us & cpu/cpu.cfs_period_us were replaced by cpu.max, where quota and period are written together (in this order) [0]: https://github.com/containers/crun/blob/main/crun.1.md#cgroup-v2 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 17:25:56 +01:00
Fabiano Fidêncio	4307f0c998	Revert "ci: mariner: Ensure kernel_params can be set" This reverts commit `091ad2a1b2`, in order to ensure tests would be running with cgroupsv2 on the guest. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 17:25:56 +01:00
stevenhorsman	d031e479ab	metrics: Increase minval range for blogbench test In the last couple of days I've seen the blogbench metrics write latency test on clh fail a few times because the latency was too low, so adjust the minimum range to tolerate quicker finishes. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-23 15:58:31 +00:00
Fabiano Fidêncio	734ef71cf7	tests: k8s: confidential: Cleanup $HOME/.ssh/known_hosts I've noticed the following error when running the tests with SEV: ``` 2025-01-21T17:10:28.7999896Z # @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 2025-01-21T17:10:28.8000614Z # @ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @ 2025-01-21T17:10:28.8001217Z # @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 2025-01-21T17:10:28.8001857Z # IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY! 2025-01-21T17:10:28.8003009Z # Someone could be eavesdropping on you right now (man-in-the-middle attack)! 2025-01-21T17:10:28.8003348Z # It is also possible that a host key has just been changed. 2025-01-21T17:10:28.8004422Z # The fingerprint for the ED25519 key sent by the remote host is 2025-01-21T17:10:28.8005019Z # SHA256:x7wF8zI+LLyiwphzmUhqY12lrGY4gs5qNCD81f1Cn1E. 2025-01-21T17:10:28.8005459Z # Please contact your system administrator. 2025-01-21T17:10:28.8006734Z # Add correct host key in /home/kata/.ssh/known_hosts to get rid of this message. 2025-01-21T17:10:28.8007031Z # Offending ED25519 key in /home/kata/.ssh/known_hosts:178 2025-01-21T17:10:28.8007254Z # remove with: 2025-01-21T17:10:28.8008172Z # ssh-keygen -f "/home/kata/.ssh/known_hosts" -R "10.244.0.71" ``` And this was causing a failure to ssh into the confidential pod. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 12:04:13 +01:00
Fabiano Fidêncio	18137b1583	tests: k8s: confidential: Increase log_buf_len to 4M Relying on dmesg is really not ideal, as we may lose important info, mainly those which happen very early in the boot, depending on the size of kernel ring buffer. So, for this specific test, let's increase the kernel ring buffer, by default, to 4M. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 12:04:13 +01:00
Aurélien Bombo	0d70dc31c1	ci: Unify on $GH_PR_NUMBER environment variable While working on #10559, I realized that some parts of the codebase use $GH_PR_NUMBER, while other parts use $PR_NUMBER. Notably, in that PR, since I used $GH_PR_NUMBER for CoCo non-TEE tests without realizing that TEE tests use $PR_NUMBER, the tests on that PR fail on TEEs: https://github.com/kata-containers/kata-containers/actions/runs/12818127344/job/35744760351?pr=10559#step:10:45 ... 44 error: error parsing STDIN: error converting YAML to JSON: yaml: line 90: mapping values are not allowed in this context ... 135 image: ghcr.io/kata-containers/csi-kata-directvolume: ... So let's unify on $GH_PR_NUMBER so that this issue doesn't repro in the future: I replaced all instances of PR_NUMBER with GH_PR_NUMBER. Note that since some test scripts also refer to that variable, the CI for this PR will fail (would have also happened with the converse substitution), hence I'm not adding the ok-to-test label and we should force-merge this after review. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-01-17 10:53:08 -06:00
Hyounggyu Choi	f7816e9206	tests: Introduce retry_kubectl_apply() for trusted storage On s390x, some tests for trusted storage occasionally failed due to: ```bash etcdserver: request timed out ``` or ```bash Internal error occurred: resource quota evaluation timed out ``` These timeouts were not observed previously on k3s but occur sporadically on kubeadm. Importantly, they appear to be temporary and transient, which means they can be ignored in most cases. To address this, we introduced a new wrapper function, `retry_kubectl_apply()`, for `kubectl create`. This function retries applying a given manifest up to 5 times if it fails due to a timeout. However, it will still catch and handle any other errors during pod creation. Fixes: #10651 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-14 21:15:44 +01:00
Pradipta Banerjee	36580bb642	tests: Update sealed secret CI value to base64url The existing encoding was base64 and it fails due to `874948638a` Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>	2025-01-13 09:37:05 -05:00
Zvonko Kaiser	f08a9eac11	Merge pull request #10721 from stevenhorsman/more-metrics-latency-minimum-range-fixes metrics: Increase latency test range	2025-01-10 21:59:39 -05:00
Wainer Moschetta	5fae2a9f91	Merge pull request #9871 from wainersm/fix-print_cluster_name tests/gha-run-k8s-common: shorten AKS cluster name	2025-01-09 14:35:02 -03:00
stevenhorsman	aaae5b6d0f	metrics: clh: Increase network-iperf3 range We hit a failure with: ``` time="2025-01-09T09:55:58Z" level=warning msg="Failed Minval (0.017600 > 0.015000) for [network-iperf3]" ``` The range is very big, but in the last 3 test runs I reviewed we have got a minimum value of 0.015s and a max value of 0.052, so there is a ~350% difference possible so I think we need to have a wide range to make this stable. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-09 11:25:57 +00:00
stevenhorsman	e946d9d5d3	metrics: qemu: Increase latency test range After the kernel version bump, in the latest nightly run https://github.com/kata-containers/kata-containers/actions/runs/12681309963/job/35345228400 The sequential read throughput result was 79.7% of the expected (so failed) and the sequential write was 84% of the expected, so was fairly close, so increase their minimum ranges to make them more robust. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-09 11:25:50 +00:00
Wainer dos Santos Moschetta	badc208e9a	tests/gha-run-k8s-common: shorten AKS cluster name Because az client restricts the name to be less than 64 characters. In some cases (e.g. KATA_HYPERVISOR=qemu-runtime-rs) the generated name will exceed the limit. This changed the function to shorten the name: * SHA1 is computed from metadata then compound the cluster's name * metadata as plain-text are passed as --tags Fixes: #9850 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-01-08 16:39:07 -03:00
Fabiano Fidêncio	8f8988fcd1	Merge pull request #10714 from fidencio/topic/update-virtiofsd virtiofsd: Update to its v1.13.0 ( + one patch) release :-)	2025-01-08 17:59:29 +01:00
Fabiano Fidêncio	eb3fe0d27c	Merge pull request #10717 from fidencio/topic/re-enable-oom-test-for-mariner tests: Re-enable oom tests for mariner	2025-01-08 17:43:56 +01:00
stevenhorsman	dc069d83b5	metrics: Increase latency test range The bump to kernel 6.12 seems to have reduced the latency in the metrics test, so increase the ranges for the minimal value, to account for this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-08 15:11:49 +00:00
Fabiano Fidêncio	967d5afb42	Revert "tests: k8s: Skip one of the empty-dir tests" This reverts commit `9aea7456fb`. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-08 14:07:34 +01:00
Fabiano Fidêncio	53ac0f00c5	tests: Re-enable oom tests for mariner Since we bumped to the 6.12.x LTS kernel, we've also adjusted the aggressivity of the OOM test, which may be enough to allow us to re-enable it for mariner. Fixes: #8821 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-07 18:33:17 +01:00
Fabiano Fidêncio	f4a39e8c40	Merge pull request #10468 from fidencio/topic/early-tests-on-next-lts-kernel versions: Move kernel to the latest 6.12 release (the current LTS)	2025-01-07 18:02:04 +01:00
Fupan Li	b19db40343	CI: change the containerd tarball name to containerd Since https://github.com/containerd/containerd/pull/9096, containerd no longer ships the cri-containerd-*.tar.gz release bundles, so we'd better change the tarball name to "containerd". The containerd tarball contains the following files: bin/ bin/containerd-shim bin/ctr bin/containerd-shim-runc-v1 bin/containerd-stress bin/containerd bin/containerd-shim-runc-v2 so we should untar containerd into the /usr/local directory instead of "/" to stay aligned with cri-containerd. In addition, no containerd.service file, runc binary, or cni-plugin is included, so we should add a specific containerd.service file and install the runc binary and cni-plugin separately. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-01-07 17:39:05 +08:00
Fabiano Fidêncio	9aea7456fb	tests: k8s: Skip one of the empty-dir tests An issue has been created for this, and we should fix the issue before the next release. However, for now, let's unblock the kernel bump and have the test skipped. Reference: https://github.com/kata-containers/kata-containers/issues/10706 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-06 21:48:20 +01:00
Fabiano Fidêncio	44ff602c64	tests: k8s: Be more aggressive to get OOM Let's increase the amount of bytes allocated per VM worker, so we can hit the OOM sooner. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-06 21:48:20 +01:00
Fupan Li	2068801b80	Merge pull request #10626 from teawater/ma Add mem-agent to kata	2024-12-24 14:11:36 +08:00
Steve Horsman	99f239bc44	Merge pull request #10380 from stevenhorsman/required-tests-guidance doc: Add required jobs info	2024-12-20 16:24:42 +00:00
stevenhorsman	d1d4bc43a4	static-checks: Add words to dictionary devmapper and snapshotters are being marked as spelling errors, so add them to the kata dictionary Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-20 14:16:52 +00:00
stevenhorsman	dd02b6699e	tests: Fix qemu-coco-dev skip Fix the logic so that the test is skipped on qemu-coco-dev, rather than the opposite, and update the syntax to make it clearer, as it was written and reviewed incorrectly by three different people in its prior form. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-19 19:50:46 +00:00
Hyounggyu Choi	341e5ca58e	vfio-ap: Assign default string "0" for empty APID and APQI The current script logic assigns an empty string to APID and APQI when APQN consists entirely of zeros (e.g., "00.0000"). However, this behavior is incorrect, as "00" and "0000" are valid values and should be represented as "0". This commit ensures that the script assigns the default string “0” to APID and APQI if their computed values are empty. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-12-13 14:39:03 +01:00
Ryan Savino	7d45382f54	Revert "ci: Skip the failing tests in SNP" This reverts commit `2242aee099`.	2024-12-10 16:20:31 -06:00
Aurélien Bombo	037281d699	Merge pull request #10593 from microsoft/saulparedes/improve_namespace_validation policy: improve pod namespace validation	2024-12-09 11:55:09 -06:00
Hui Zhu	4407f6e098	mem-agent: Add to src mem-agent is a component designed for managing memory in Linux environments. Sub-feature memcg: Utilizes the MgLRU feature to monitor each cgroup's memory usage and periodically reclaim cold memory. Sub-feature compact: Periodically compacts memory to facilitate the kernel's free page reporting feature, enabling the release of more idle memory from guests. During memory reclamation and compaction, mem-agent monitors system pressure using Pressure Stall Information (PSI). If the system pressure becomes too high, memory reclamation or compaction will automatically stop. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:02 +08:00
Wainer Moschetta	a94982d8b8	Merge pull request #10617 from stevenhorsman/skip-k8s-job-test-on-non-tee tests: Skip k8s job test on qemu-coco-dev	2024-12-04 15:47:33 -03:00
Saul Paredes	84a411dac4	policy: improve pod namespace validation - Remove default_namespace from settings - Ensure container namespaces in a pod match each other in case no namespace is specified in the YAML Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-12-04 10:17:54 -08:00
Steve Horsman	c86f76d324	Merge pull request #10588 from stevenhorsman/metrics-clh-min-range-relaxation metrics: Increase minval range for failing tests	2024-12-04 16:10:26 +00:00
stevenhorsman	a8ccd9a2ac	tests: Skip k8s job test on qemu-coco-dev The test is unstable on this platform, so skip it for now to prevent the regular known failures from covering up other issues. See #10616 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-04 16:00:05 +00:00
Saul Paredes	711d12e5db	policy: support optional metadata uid field This prevents a deserialization error when uid is specified Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-12-02 11:24:58 -08:00
stevenhorsman	b87b4b6756	metrics: Increase ranges range for qemu failing tests We've also seen the qemu metrics tests failing due to the results being slightly outside the max range for network-iperf3 parallel and minimum for network-iperf3 jitter tests on PRs that have no code changes, so we've increased the bounds to avoid false negatives. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-29 10:52:16 +00:00
stevenhorsman	4011071526	metrics: Increase minval range for failing tests We've seen a couple of instances recently where the metrics tests are failing due to the results being below the minimum value by ~2%. For tests like latency I'm not sure why values being too low would be an issue, but I've updated the minpercent range of the failing tests to try and get them passing. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-29 10:50:02 +00:00
Aurélien Bombo	16a91fccbe	Merge pull request #10561 from sprt/csi-driver-ci coco: ci: Lay groundwork for compiling and publishing CSI driver image [1/x]	2024-11-27 10:26:45 -06:00
Aurélien Bombo	5e4990bcf5	coco: ci: Add no-op steps to deploy CSI driver This adds no-op steps that'll be used to deploy and clean up the CSI driver used for testing. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-21 16:08:06 -06:00
Adithya Krishnan Kannan	2242aee099	ci: Skip the failing tests in SNP Per [Issue#10549](https://github.com/kata-containers/kata-containers/issues/10549), the following tests are failing on SNP. 1. k8s-guest-pull-image-encrypted.bats 2. k8s-guest-pull-image-authenticated.bats 3. k8s-guest-pull-image-signature.bats 4. k8s-confidential-attestation.bats Per @fidencio 's comment on [PR#10558](https://github.com/kata-containers/kata-containers/pull/10558), I am skipping the same. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>	2024-11-19 10:41:43 -06:00
Fabiano Fidêncio	9b1a5f2ac2	tests: Add a way to run only tests which rely on attestation We're doing this as, at Intel, we have two different kind of machines we can plug into our CI. Without going much into details, only one of those two kinds of machines will work for the attestation tests we perform with ITA, thus in order to speed up the CI and improve test coverage (OS wise), we're going to run different tests in different machines. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-14 15:51:57 +01:00
GabyCT	06fe459e52	Merge pull request #10508 from GabyCT/topic/installartsta gha: Get artifacts when installing kata tools in stability workflow	2024-11-11 15:59:06 -06:00
Fabiano Fidêncio	2281342fb8	Merge pull request #10513 from fidencio/topic/ci-adjust-proxy-nightmare-for-tdx ci: tdx: kbs: Ensure https_proxy is taken in consideration	2024-11-10 00:17:10 +01:00
Saul Paredes	461efc0dd5	tests: remove manifest v1 test This test was meant to show support for pulling images with v1 manifest schema versions. The nginxhttps image has been modified in https://hub.docker.com/r/ymqytw/nginxhttps/tags such that we are no longer able to pull it: $ docker pull ymqytw/nginxhttps:1.5 Error response from daemon: missing signature key We may remove this test since schema version 1 manifests are deprecated per https://docs.docker.com/engine/deprecated/#pushing-and-pulling-with-image-manifest-v2-schema-1 : "These legacy formats should no longer be used, and users are recommended to update images to use current formats, or to upgrade to more current images". This schema version was used by old docker versions. Further OCI spec https://github.com/opencontainers/image-spec/blob/main/manifest.md#image-manifest-property-descriptions only supports schema version 2. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-11-08 13:38:51 -08:00
Fabiano Fidêncio	baf88bb72d	ci: tdx: kbs: Ensure https_proxy is taken in consideration Trustee's deployment must set the correct https_proxy as env var on the container that will talk to the ITA / ITTS server, otherwise the kbs service won't be able to start, causing then issues in our CI. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: Krzysztof Sandowicz <krzysztof.sandowicz@intel.com>	2024-11-08 16:06:16 +01:00
Steve Horsman	1f728eb906	Merge pull request #10498 from stevenhorsman/update-create-container-timeout-log tests: k8s: Update image pull timeout error	2024-11-08 10:47:39 +00:00
Gabriela Cervantes	4274198664	gha: Get artifacts when installing kata tools in stability workflow This PR adds the get artifacts which are needed when installing kata tools in stability workflow to avoid failures saying that artifacts are missing. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-11-07 16:20:41 +00:00
GabyCT	47cea6f3c6	Merge pull request #10493 from GabyCT/topic/katatoolsta gha: Add install kata tools as part of the stability workflow	2024-11-06 14:16:48 -06:00
Gabriela Cervantes	13e27331ef	gha: Add install kata tools as part of the stability workflow This PR adds the install kata tools step as part of the k8s stability workflow, to avoid failures saying that certain kata components are not installed. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-11-06 20:07:06 +00:00
Fabiano Fidêncio	71c4c2a514	Merge pull request #10486 from kata-containers/topic/enable-AUTO_GENERATE_POLICY-for-qemu-coco-dev workflows: Use AUTO_GENERATE_POLICY for qemu-coco-dev	2024-11-06 21:04:45 +01:00
stevenhorsman	85554257f8	tests: k8s: Update image pull timeout error Currently the error we are checking for is `CreateContainerRequest timed out`, but this message doesn't always seem to be printed to our pod log. Try using a more general message that should be present more reliably. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-06 17:00:26 +00:00
Julien Ropé	da5e0c3f53	ci: skip nginx connectivity test with crio We have an error with service name resolution with this test when using crio. This error could not be reproduced outside of the CI for now. Skipping it to keep the CI job running until we find a solution. See: #10414 Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-11-06 12:07:02 +01:00
Julien Ropé	6d0cb1e9a8	ci: export CONTAINER_RUNTIME to the test scripts This variable will allow tests to adapt their behaviour to the runtime (containerd/crio). Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-11-06 11:29:11 +01:00
Fabiano Fidêncio	72979d7f30	workflows: Use AUTO_GENERATE_POLICY for qemu-coco-dev By the moment we're testing it also with qemu-coco-dev, it becomes easier for a developer without access to TEE to also test it locally. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-06 10:47:08 +01:00
Dan Mihai	03bf4433d7	Merge pull request #10459 from stevenhorsman/update-bats tests: k8s: Update bats	2024-11-04 12:26:58 -08:00
Aurélien Bombo	f639d3e87c	Merge pull request #10395 from Sumynwa/sumsharma/create_container agent-ctl: Add support to test kata-agent's container creation APIs.	2024-11-04 14:09:12 -06:00
Gabriela Cervantes	fd4d0dd1ce	gha: Fix source for gha stability run script This PR fixes the source to avoid duplication specially in the common.sh script and avoid failures saying that certain script is not in the directory. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-11-04 16:16:13 +00:00
stevenhorsman	175ebfec7c	Revert "k8s:kbs: Add trap statement to clean up tmp files" This reverts commit `973b8a1d8f`. As @danmihai1 points out https://github.com/bats-core/bats-core/issues/364 states that using traps in bats is error prone, so this could be the cause of the confidential test instability we've been seeing, like it was in the static checks, so let's try and revert this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-04 09:59:37 +00:00
stevenhorsman	75cb1f46b8	tests/k8s: Add skip if setup_common fails At @danmihai1's suggestion add a die message in case the call to setup_common fails, so we can see it in the test output. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-04 09:59:33 +00:00
stevenhorsman	3f5bf9828b	tests: k8s: Update bats We've seen some issues with tests not being run in some of the Coco CI jobs (Issue #10451) and in the environments that are more stable we noticed that they had a newer version of bats installed. Try updating the version to 1.10+ and print out the version for debug purposes Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-04 09:59:33 +00:00
Sumedh Alok Sharma	4b7aba5c57	agent-ctl: Add support to test kata-agent's container creation APIs. This commit introduces changes to enable testing kata-agent's container APIs of CreateContainer/StartContainer/RemoveContainer. The changeset includes: - using confidential-containers image-rs crate to pull/unpack/mount a container image. Currently supports only unauthenticated registry pull - re-factor api handlers to reduce cmdline complexity and handle request generation logic in tool - introduce an OCI config template for container creation - add test case Fixes #9707 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-11-01 22:18:54 +05:30
Gabriela Cervantes	c4089df9d2	gha: Add missing steps in Kata stability workflow This PR adds missing steps in the gha run script for the kata stability workflow. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-30 19:13:15 +00:00
Hyounggyu Choi	dca69296ae	Merge pull request #10476 from BbolroC/switch-to-kubeadm-s390x gha: Switch KUBERNETES from k3s to kubeadm on s390x	2024-10-30 09:52:06 +01:00
GabyCT	8539cd361a	Merge pull request #10462 from GabyCT/topic/increstress tests: Increase time to run stressng k8s tests	2024-10-29 11:08:47 -06:00
Hyounggyu Choi	238f67005f	tests: Add `kubeadm` option for KUBERNETES in gha-run.sh When creating a k8s cluster via kubeadm, the devmapper setup for containerd requires a different configuration. This commit introduces a new `kubeadm` option for the KUBERNETES variable and adjusts the path to the containerd config file for devmapper setup. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-10-29 14:19:42 +01:00
stevenhorsman	b1cffb4b09	Revert "tests: Add trap statement in kata doc script" This reverts commit `093a6fd542`. as it is breaking the static checks Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-10-29 09:57:18 +00:00
Fabiano Fidêncio	b70d7c1aac	tests: Enable measured rootfs tests for qemu-coco-dev Then it's on par with what's being tested with TEEs using a rootfs image. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:54 +01:00
Fabiano Fidêncio	7d202fc173	tests: Re-enable measured_rootfs test for TDX As we're now building everything needed to test TDX with measured rootfs support, let's bring this test back in (for TDX only, at least for now). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	eb07a809ce	tests: Add a helper script to use prebuild components This is a helper script that does basically what's already being done by the s390x CI, which is: * Move a folder with the components that were stored / downloaded during the GHA execution to the expected `build` location * Get rid of the dependencies for a specific asset, as the dependencies are already pulled in from previous GHA steps For now this script is only being added but not yet executed anywhere, and that will come as the next step in this series. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:52 +01:00
Gabriela Cervantes	a3ef8c0a16	tests: Increase time to run stressng k8s tests This PR increases the time to run the stressng k8s tests for the CoCo stability CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-24 16:34:17 +00:00
Gabriela Cervantes	093a6fd542	tests: Add trap statement in kata doc script This PR adds the trap statement into the kata doc script to clean up properly the temporary files. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-23 15:56:58 +00:00
alex.lyn	b25538f670	ci: Introduce CI to validate pod hostname Fixes #10422 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-10-21 16:32:56 +01:00
GabyCT	b00203ba9b	Merge pull request #10428 from GabyCT/topic/archk8sc gha: Use a arch_to_golang variable to have uniformity	2024-10-17 11:00:59 -06:00
Gabriela Cervantes	f0e0c74fd4	gha: Use a arch_to_golang variable to have uniformity This PR replaces the arch uname -m to use the arch_to_golang variable in the script to have a better uniformity across the script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-15 20:03:09 +00:00
Dan Mihai	ece0f9690e	tests: k8s-inotify: longer pod termination timeout inotify-configmap-pod.yaml is using: "inotifywait --timeout 120", so wait for up to 180 seconds for the pod termination to be reported. Hopefully, some of the sporadic errors from #10413 will be avoided this way: not ok 1 configmap update works, and preserves symlinks waitForProcess "${wait_time}" "$sleep_time" "${command}" failed Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-15 16:01:25 +00:00
Dan Mihai	ccfb7faa1b	tests: k8s-inotify.bats: don't leak configmap Delete the configmap if the test failed, not just on the successful path. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-15 16:01:25 +00:00
Leonard Cohnen	c06bf2e3bb	genpolicy: read binaryData value as String While Kubernetes defines `binaryData` as `[]byte`, when defined in a YAML file the raw bytes are base64 encoded. Therefore, we need to read the YAML value as `String` and not as `Vec<u8>`. Fixes: #10410 Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2024-10-14 20:03:11 +02:00
Fabiano Fidêncio	cf5d3ed0d4	kbs: ita: Ensure the proper image / image_tag is used for ITA When dealing with a specific release, it was easier to just do some adjustments on the image that has to be used for ITA without actually adding a new entry in the versions.yaml. However, it's been proven to be more complicated than that when it comes to dealing with staged images, and we better explicitly add (and update) those versions altogether to avoid CI issues. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-10 10:01:33 +02:00
Fabiano Fidêncio	3f7ce1d620	Merge pull request #10401 from stevenhorsman/kbs-deploy-overlays-update Kbs deploy overlays update	2024-10-10 09:50:19 +02:00
Fabiano Fidêncio	091ad2a1b2	ci: mariner: Ensure kernel_params can be set The reason we're doing this is because mariner image uses, by default, cgroups default-hierarchy as `unified` (aka, cgroupsv2). In order to keep the same initrd behaviour for mariner, let's enforce that `SYSTEMD_CGROUP_ENABLE_LEGACY_FORCE=1 systemd.legacy_systemd_cgroup_controller=yes systemd.unified_cgroup_hierarchy=0` is passed to the kernel cmdline, at least for now. Other tests that are setting `kernel_params` are not running on mariner, then we're safe taking this path as it's done as part of this PR. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-09 18:23:35 +02:00
Fabiano Fidêncio	3bbf3c81c2	ci: mariner: Use the image instead of the initrd As an image has been added for mariner as part of the commit `63c1f81c2`, let's start using it in the CI, instead of using the initrd. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-09 18:23:32 +02:00
stevenhorsman	8763880e93	tests/k8s: kbs: Update overlays logic In https://github.com/confidential-containers/trustee/pull/521 the overlays logic was modified to add non-SE s390x support and simplify non-ibm-se platforms. We need to update the logic in `kbs_k8s_deploy` to match and can remove the dummying of `IBM_SE_CREDS_DIR` for non-SE now Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-10-09 09:39:41 +01:00
ChengyuZhu6	a94024aedc	tests: add test for sealed file secrets add a test for sealed file secrets. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-10-08 16:01:48 +08:00
Dan Mihai	6d5fc898b8	tests: k8s: AUTO_GENERATE_POLICY=yes for local testing The behavior of Kata CI doesn't change. For local testing using kubernetes/gha-run.sh and AUTO_GENERATE_POLICY=yes: 1. Before these changes users were forced to use: - SEV, SNP, or TDX guests, or - KATA_HOST_OS=cbl-mariner 2. After these changes users can also use other platforms that are configured with "shared_fs = virtio-fs" - e.g., - KATA_HOST_OS=ubuntu + KATA_HYPERVISOR=qemu Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-04 18:26:00 +00:00
Dan Mihai	5aaef8e6eb	Merge pull request #10376 from microsoft/danmihai1/auto-generate-just-for-ci gha: enable AUTO_GENERATE_POLICY where needed	2024-10-04 10:52:31 -07:00
Dan Mihai	1a4928e710	gha: enable AUTO_GENERATE_POLICY where needed The behavior of Kata CI doesn't change. For local testing using kubernetes/gha-run.sh: 1. Before these changes: - AUTO_GENERATE_POLICY=yes was always used by the users of SEV, SNP, TDX, or KATA_HOST_OS=cbl-mariner. 2. After these changes: - Users of SEV, SNP, TDX, or KATA_HOST_OS=cbl-mariner must specify AUTO_GENERATE_POLICY=yes if they want to auto-generate policy. - These users have the option to test just using hard-coded policies (e.g., using the default policy built into the Guest rootfs) by using AUTO_GENERATE_POLICY=no. AUTO_GENERATE_POLICY=no is the default value of this env variable. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-02 23:20:33 +00:00
Gabriela Cervantes	973b8a1d8f	k8s:kbs: Add trap statement to clean up tmp files This PR adds the trap statement in the confidential kbs script to clean up temporary files and ensure we are not leaving them behind. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-02 19:59:08 +00:00
Steve Horsman	8412c09143	Merge pull request #10371 from fidencio/topic/k8s-tdx-re-enable-empty-dir-tests k8s: tests: Re-enable empty-dirs tests for TDX / coco-qemu-dev	2024-10-02 18:41:19 +01:00
Dan Mihai	9a8341f431	Merge pull request #10370 from microsoft/danmihai1/k8s-policy-rc tests: k8s-policy-rc: remove default UID from YAML	2024-10-02 09:32:17 -07:00
GabyCT	a1d380305c	Merge pull request #10369 from GabyCT/topic/egrepfastf metrics: Update fast footprint script to use grep	2024-10-02 10:10:12 -06:00
Fabiano Fidêncio	b3ed7830e4	k8s: tests: Re-enable empty-dirs tests for TDX / coco-qemu-dev The test is disabled for qemu-coco-dev / qemu-tdx, but it doesn't seem to actually be failing on those. Plus, it's passing on SEV / SNP, which means that we most likely missed re-enabling this one in the past. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-01 20:51:01 +02:00
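
Note on commit 0626d7182a (tests: k8s-cpu-ns: Adapt to cgroupsv2): the message gives the cpu.shares to cpu.weight conversion as weight = (1 + ((request - 2) * 9999) / 262142) and says that quota and period are written together, in that order, to cpu.max. The Python sketch below works through that arithmetic. It is not the code from the test script (which is not shown here). The function names are hypothetical, and integer division and the 2..262144 input range are assumptions.

```python
def shares_to_weight(shares: int) -> int:
    """Convert a cgroup v1 cpu.shares value to a cgroup v2 cpu.weight.

    Formula from the commit message:
        weight = 1 + ((shares - 2) * 9999) / 262142
    Assumption: integer division, input range 2..262144.
    """
    if not 2 <= shares <= 262144:
        raise ValueError(f"cpu.shares out of range: {shares}")
    return 1 + ((shares - 2) * 9999) // 262142


def cpu_max_line(quota_us, period_us: int) -> str:
    """Build the single-line cpu.max value: quota first, then period.

    A quota of None is written as "max" (no limit).
    """
    quota = "max" if quota_us is None else str(quota_us)
    return f"{quota} {period_us}"


if __name__ == "__main__":
    for shares in (2, 1024, 262144):
        print(shares, "->", shares_to_weight(shares))
    print(cpu_max_line(50000, 100000))
```

Under these assumptions the ends of the v1 range map onto the ends of the v2 range (2 to 1, 262144 to 10000).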


1453 Commits
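
Note on commit f7816e9206 (tests: Introduce retry_kubectl_apply() for trusted storage): the message describes a wrapper that retries on transient timeouts up to 5 times and still surfaces any other error. The real helper is a shell function whose body is not shown here, so the Python sketch below only illustrates the retry pattern. The name `retry_on_timeout`, the callable interface, and reading "up to 5 times" as five attempts in total are assumptions. The two error strings are the ones quoted in the commit message.

```python
import time
from typing import Callable, Tuple

# Transient error messages quoted in the commit message.
TRANSIENT_ERRORS = (
    "etcdserver: request timed out",
    "resource quota evaluation timed out",
)


def retry_on_timeout(
    apply: Callable[[], Tuple[int, str]],
    max_tries: int = 5,      # assumption: five attempts in total
    delay_s: float = 0.0,    # hypothetical pause between attempts
) -> Tuple[int, str]:
    """Call `apply` until it succeeds, fails with a non-transient
    error, or `max_tries` attempts have been made.

    `apply` returns (exit_code, error_output).
    """
    code, err = 1, "apply was never called"
    for attempt in range(1, max_tries + 1):
        code, err = apply()
        if code == 0:
            return code, err
        transient = any(msg in err for msg in TRANSIENT_ERRORS)
        if not transient:
            # Any other error is reported immediately, not retried.
            return code, err
        if attempt < max_tries:
            time.sleep(delay_s)
    return code, err
```

Passing the operation in as a callable keeps the sketch independent of any particular kubectl invocation.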