Commit Graph

701 Commits

Author SHA1 Message Date
stevenhorsman
f2a2117252 tests: k8s: Retry output of kubectl exec in k8s-cpu-ns
We are seeing failures in this test, where the output of
the kubectl exec command seems to be blank, so try
retrying the exec like #11024

Fixes: #11133
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-04-30 15:01:08 +01:00
Aurélien Bombo
46af7cf817
Merge pull request #11077 from microsoft/cameronbaird/address-gid-mismatch
genpolicy: Align GID behavior with CRI and enable GID policy checks.
2025-04-29 22:23:23 +01:00
Aurélien Bombo
19371e2d3b
Merge pull request #11164 from wainersm/fix_kbs_on_aks
tests/k8s: fix kbs installation on Azure AKS
2025-04-29 18:25:14 +01:00
stevenhorsman
52b2662b75 tests: confidential: Add KBS logging
To help with debugging, add logging of the KBS,
like the container system logs, if the confidential test fails

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-04-29 09:48:18 +01:00
Cameron Baird
70ef0376fb genpolicy: Introduce special handling for clusters using nydus
Nydus+guest_pull has specific behavior where it improperly handles image layers on
the host, causing the CRI to not find /etc/passwd and /etc/group files
on container images which have them. This unfortunately causes different
outcomes w.r.t. GID used which we are trying to enforce with policy.

This behavior is observed/explained in https://github.com/kata-containers/kata-containers/issues/11162

Handle this exception with a config.settings.cluster_config.guest_pull
field. When this is true, simply ignore the /etc/* files in the
container image as they will not be parsed by the CRI.

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2025-04-28 20:18:42 +00:00
Cameron Baird
fc75aee13a ci: Add CI tests for runAsGroup, GID policy
Introduce tests to check for policy correctness on a redis deployment
with 1. a pod-level securityContext 2. a container-level securityContext
which shadows the pod-level securityContext 3. a pod-level
securityContext which selects an existing user (nobody), causing a new GID to be selected.

Redis is an interesting container image to test with because it includes
a /etc/passwd file with existing user/group configuration of 1000:1000 baked in.

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2025-04-28 16:28:31 +00:00
Wainer dos Santos Moschetta
a66aac0d77 tests/k8s: optimize nginx ingress for AKS small VM
We use an AKS managed ingress controller which keeps two nginx pod
replicas, each requesting 500m of CPU. On small VMs like the ones used in
CI for running the CoCo non-TEE tests, this leaves only a small amount of CPU
for the tests. In fact, one of these pod replicas won't even get
started. So let's patch the ingress controller to have only one replica
of nginx.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-04-28 12:08:31 -03:00
Wainer dos Santos Moschetta
14e74b8fc9 tests/k8s: fix kbs installation on Azure AKS
The Azure AKS addon-http-application-routing add-on is deprecated and
cannot be enabled on new clusters which has caused some CI jobs to fail.

Migrated our code to use approuting instead. Unlike
addon-http-application-routing, this add-on doesn't
configure a managed cluster DNS zone, but the created ingress has a
public IP. To avoid having to deal with DNS setup, we will be using that
address from now on. Thus, some functions no longer used are deleted.

Fixes #11156
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-04-28 12:08:31 -03:00
Dan Mihai
706c2e2d68
Merge pull request #11184 from microsoft/danmihai1/retry-genpolicy
ci: retry genpolicy execution
2025-04-24 08:01:22 -07:00
Dan Mihai
517d6201f5 ci: retry genpolicy execution
genpolicy is sending more HTTPS requests than other components during
CI so it's more likely to be affected by transient network errors
similar to:

ConnectError(
  "dns error",
  Custom {
     kind: Uncategorized,
     error: "failed to lookup address information: Try again",
  },
)

Note that genpolicy is not the only component hitting network errors
during CI. Recent example from a different component:

"Message:  failed to create containerd task: failed to create shim task:
 failed to async pull blob stream HTTP status server error (502 Bad Gateway)"

This CI change might help just with the genpolicy errors.

Fixes: #11182

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-04-23 21:38:12 +00:00
Xynnn007
b1c72c7094 test: add integration test for initdata
This test exercises initdata with the following logic:
1. Enable image signature verification via kernel commandline
2. Set Trustee address via initdata
3. Pull an image from a banned registry
4. Check that the pull fails with the log `image security validation
failed`, which shows that initdata works.

Note that if initdata does not work, the pod still fails to launch. But
the error information is `[CDH] [ERROR]: Get Resource failed` which
internally means that the KBS URL has not been set correctly.

This test now only runs on qemu-coco-dev+x86_64 and qemu-tdx

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
2025-04-23 15:55:04 +08:00
RuoqingHe
10ceeb0930
Merge pull request #11104 from fidencio/topic/kata-deploy-create-runtimeclasses-by-default
kata-deploy: Create runtimeclasses by default
2025-04-01 10:55:44 +08:00
Zvonko Kaiser
e5c4cfb8a1
Merge pull request #11081 from BbolroC/unsealed-secret-fix
tests: Enable sealed secrets for all TEEs
2025-03-31 11:19:52 -04:00
Fabiano Fidêncio
28be53ac92 kata-deploy: Create runtimeclasses by default
Let's make the life of the users easier and create the runtimeclasses
for them by default.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-03-31 11:29:44 +01:00
Fabiano Fidêncio
41b536d487
Merge pull request #11059 from microsoft/danmihai1/tests-common
tests: k8s: clean-up shellcheck warnings in tests_common.sh
2025-03-27 09:51:49 +01:00
Hyounggyu Choi
0aa76f7206 tests: Enable sealed secrets for TEEs
Fixes: #11011

This commit allows all TEEs to run the sealed secret test.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-03-26 17:50:41 +01:00
Hyounggyu Choi
8088064b8b tests: Set default policy before running sealed secrets tests
The test `Cannot get CDH resource when deny-all policy is set`
completes with a KBS policy set to deny-all. This affects the
future TEE test (e.g. k8s-sealed-secrets.bats) which makes a
request against KBS.
This commit introduces kbs_set_default_policy() and puts it to
the setup() in k8s-sealed-secrets.bats.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-03-26 17:44:38 +01:00
Fabiano Fidêncio
f7976a40e4 tests: Create a helm_helper() common function
Let's use what we have in the k8s functional tests to create a common
function to deploy kata containers using our helm charts.  This will
help us immensely in the kata-deploy testing side in the near future.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-03-26 13:30:11 +01:00
Fabiano Fidêncio
eb884d33a8 tests: k8s: Export all the default env vars on gha-run.sh
This is not strictly needed, but it does help a lot when setting up a
cluster manually, while still relying on those scripts.

While here, let's also ensure the assignment is between quotes, to make
shellchecker happier.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-03-26 13:23:16 +01:00
Dan Mihai
835c6814d7 tests: k8s/tests_common: avoid using regex
More straightforward implementation of hard_coded_policy_tests_enabled,
that avoids ShellCheck warning:

warning: Remove quotes from right-hand side of =~ to match as a regex rather than literally. [SC2076]

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-20 22:23:19 +00:00
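Illustration for 835c6814d7: a minimal sketch of one way to test list membership without `=~`, using an explicit comparison loop. The helper name `is_in_list` and the sample values are hypothetical; this is not the code from the commit.

```bash
# Hypothetical helper: succeed if the first argument equals any of the
# remaining arguments. A plain string comparison avoids the quoted
# right-hand side of =~ that SC2076 warns about.
is_in_list() {
    local needle="$1"
    shift
    local item
    for item in "$@"; do
        if [ "${item}" = "${needle}" ]; then
            return 0
        fi
    done
    return 1
}
```

A call such as `is_in_list "qemu-tdx" "qemu-coco-dev" "qemu-tdx"` succeeds, while `is_in_list "qemu" "qemu-coco-dev" "qemu-tdx"` fails because only exact matches count.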
Dan Mihai
d83b8349a2 tests: policy: avoid using caller's variable
Fix unintended use of caller's variable. Use the corresponding function
parameter instead. ShellCheck:

warning: policy_settings_dir is referenced but not assigned. [SC2154]

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-20 19:02:29 +00:00
Dan Mihai
59a70a2b28 tests: k8s/tests_common: avoid masking return values
Avoid masking command return values by declaring and only then assigning.

ShellCheck:

warning: Declare and assign separately to avoid masking return values. [SC2155]

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-20 19:02:29 +00:00
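Illustration for 59a70a2b28: a small sketch of the masking that SC2155 describes. `get_value`, `combined` and `separate` are hypothetical names made up for this example.

```bash
# Hypothetical command that prints a value but reports failure.
get_value() {
    echo "partial"
    return 3
}

# Declaring and assigning in one step: the status that is checked belongs
# to `local`, which succeeds, so the failure of get_value is masked.
combined() {
    local status=0
    # shellcheck disable=SC2155
    local value="$(get_value)" || status=$?
    echo "value=${value} status=${status}"
}

# Declaring first and assigning afterwards keeps the real exit status.
separate() {
    local status=0
    local value
    value="$(get_value)" || status=$?
    echo "value=${value} status=${status}"
}
```

Here `combined` prints `value=partial status=0`, while `separate` prints `value=partial status=3`.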
Dan Mihai
b895e3b3e5 tests: k8s/tests_common.sh: add variable assignments
Pick up the values exported by other scripts. ShellCheck:

warning: AUTO_GENERATE_POLICY is referenced but not assigned. [SC2154]

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-20 19:02:29 +00:00
Dan Mihai
0f4de1c94a tests: tests_common: remove useless assignment
ShellCheck:

warning: This assignment is only seen by the forked process. [SC2097]
warning: This expansion will not see the mentioned assignment. [SC2098]

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-20 19:02:29 +00:00
Dan Mihai
9c0d069ac7 tests: tests_common: prevent globbing and word splitting
ShellCheck:

note: Double quote to prevent globbing and word splitting. [SC2086]

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-20 19:02:28 +00:00
Dan Mihai
15961b03f7 tests: k8s/tests_common.sh: -n instead of ! -z
ShellCheck:

note: Use -n instead of ! -z. [SC2236]

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-20 19:02:28 +00:00
Dan Mihai
4589dc96ef tests: k8s/tests_common.sh: add double quoting
ShellCheck:

note: Prefer double quoting even when variables don't contain special characters. [SC2248]

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-20 19:02:28 +00:00
Dan Mihai
cc5f8d31d2 tests: k8s/tests_common.sh: add braces
ShellCheck: add braces around variable references:

note: Prefer putting braces around variable references even when not strictly required. [SC2250]

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-20 19:02:28 +00:00
Dan Mihai
0d3f9fcee1 tests: tests_common: export variables used externally
ShellCheck: export variables used outside of tests_common.sh - e.g.,

warning: timeout appears unused. Verify use (or export if used externally). [SC2034]

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-20 19:02:28 +00:00
Dan Mihai
5df43ffc7c tests: k8s/tests_common.sh: Prefer [[ ]] over [ ]
Replace [ ] with [[ ]] as advised by shellcheck:

note: Prefer [[ ]] over [ ] for tests in Bash/Ksh. [SC2292]

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-20 19:02:28 +00:00
Dan Mihai
dab981b0bc tests: k8s: retry "kubectl exec" on empty output
Retry "kubectl exec" a few times if it unexpectedly produced an empty
output string.

This is an attempt to work around test failures similar to:

https://github.com/kata-containers/kata-containers/actions/runs/13840930994/job/38730153687?pr=10983

not ok 1 Environment variables
(from function `grep_pod_exec_output' in file tests_common.sh, line 394,
 in test file k8s-env.bats, line 36)
`grep_pod_exec_output "${pod_name}" "HOST_IP=\([0-9]\+\(\.\|$\)\)\{4\}" "${exec_command[@]}"' failed

That test obtained correct output from "sh -c printenv" one time, but the
second execution of the same command returned an empty output string.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-14 17:03:03 +00:00
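Illustration for dab981b0bc: a minimal sketch of retrying a command while its output is empty. The helper name `retry_on_empty_output`, the attempt count argument and the `RETRY_DELAY` variable are assumptions for this example and are not taken from tests_common.sh.

```bash
# Hypothetical helper: run a command and retry while it prints nothing.
# Usage: retry_on_empty_output <max_attempts> <command> [args...]
# RETRY_DELAY (seconds between attempts, default 1) is an assumed knob.
retry_on_empty_output() {
    local max_attempts="$1"
    shift
    local attempt=1
    local output=""
    while [ "${attempt}" -le "${max_attempts}" ]; do
        # A failing command is reported to the caller immediately.
        output="$("$@")" || return "$?"
        if [ -n "${output}" ]; then
            printf '%s\n' "${output}"
            return 0
        fi
        echo "attempt ${attempt}: empty output, retrying" >&2
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-1}"
    done
    return 1
}
```

Under these assumptions, a test could call `retry_on_empty_output 3 kubectl exec "${pod_name}" -- sh -c printenv` and grep the printed result.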
Dan Mihai
0e26dd4ce8 tests: k8s-policy-pod: safer host path volume source
Test using the host path /tmp/k8s-policy-pod-test instead of
/var/lib/kubelet/pods.

/var/lib/kubelet/pods might happen to contain files that CopyFileRequest
would try to send to the Guest before CreateContainerRequest. Such
CopyFileRequest was an unintended side effect of this test.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-13 18:56:57 +00:00
Dan Mihai
4f41989a6a
Merge pull request #11009 from mythi/e2e-skip-flaky-tests
tests: k8s: skip trusted storage tests for qemu-tdx
2025-03-11 12:13:35 -07:00
Dan Mihai
e40251d9f8
Merge pull request #11006 from ryansavino/fix-confidential-ssh-dockerfile
tests: fix confidential ssh Dockerfile
2025-03-11 11:22:23 -07:00
Mikko Ylinen
71531a82f4 tests: k8s: skip trusted storage tests for qemu-tdx
follow other TEEs to skip trusted storage tests due to #10838.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2025-03-11 15:14:03 +02:00
Ryan Savino
1dbe3fb8bc tests: fix confidential ssh Dockerfile
Need to set correct permissions for ssh directories and files

Fixes: #11005

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2025-03-10 18:31:05 -05:00
Dan Mihai
e8405590c1 ci: temporarily avoid using the Mariner Host image
Disable the Mariner host during CI, while investigating test failures
with new Cloud Hypervisor v43.0.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-10 20:15:09 +00:00
Dan Mihai
509e6da965 tests: k8s-env.bats: log exec output
Log the "kubectl exec" output, just in case it helps investigate sporadic
test errors like:

https://github.com/kata-containers/kata-containers/actions/runs/13724022494/job/38387329321?pr=10973

not ok 1 Environment variables
(in test file k8s-env.bats, line 37)
 `grep "HOST_IP=\([0-9]\+\(\.\|$\)\)\{4\}"' failed

It appears that the first exec from this test case produced the expected
output:

MY_POD_NAME=test-env

but the second exec produced something else - that will be logged after
this change.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-07 19:37:20 +00:00
Dan Mihai
95d47e4d05 tests: k8s-configmap.bats: log exec output
Log the "kubectl exec" output, just in case it helps investigate sporadic
test errors like:

https://github.com/kata-containers/kata-containers/actions/runs/13724022494/job/38387329268?pr=10973

not ok 1 ConfigMap for a pod
(in test file k8s-configmap.bats, line 44)
`kubectl exec $pod_name -- "${exec_command[@]}" | grep "KUBE_CONFIG_2=value-2"' failed

It appears that the first exec from this test case produced the expected
output:

KUBE_CONFIG_1=value-1

but the second exec produced something else - that will be logged after
this change.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-07 19:35:45 +00:00
Dan Mihai
caee12c796 tests: k8s: add function to log exec output
grep_pod_exec_output invokes "kubectl exec", logs its output, and checks
that a grep pattern is present in the output.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-07 19:34:57 +00:00
Fabiano Fidêncio
545780a83a shellcheck: tests: k8s: Fix gha-run.sh warnings
As we'll touch this file during this series, let's already make sure we
solve all the needed warnings.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-03-05 19:44:27 +01:00
Zvonko Kaiser
4bb0eb4590
Merge pull request #10954 from kata-containers/topic/metrics-kata-deploy
Rework and fix metrics issues
2025-03-04 20:22:53 -05:00
stevenhorsman
02a2f6a9c1 tests: Sanitize K8S_TEST_ENTRY
Now we've added the double quotes around
`${K8S_TEST_UNION[@]}`, so platforms are
failing with:
```
Error: Test file "/home/ubuntu/runner/_layout/_work/kata-containers/kata-containers/tests/integration/kubernetes/k8s-nginx-connectivity.bats
" does not exist
```
due to the line continuation, so sanitise the value
to try and fix this.

Co-authored-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-03-04 09:39:10 +00:00
stevenhorsman
c5ff513e0b shellcheck: Fix shellcheck SC2068
> Double quote array expansions to avoid re-splitting elements

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-03-04 09:35:46 +00:00
stevenhorsman
6f918d71f5 workflows: Update metrics jobs
Currently the run-metrics job runs a manual install
and does this in a separate job before the metrics
tests run. This doesn't make sense as if we have multiple
CI runs in parallel (like we often do), there is a high chance
that the setup for another PR runs between the metrics
setup and the runs, meaning it's not testing the correct
version of code. We want to remove this from happening,
so install (and delete to cleanup) kata as part of the metrics
test jobs.

Also switch to kata-deploy rather than manual install for
simplicity and in order to test what we recommend to users.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-03-01 17:50:05 +00:00
Zvonko Kaiser
33460386b9
Merge pull request #10803 from ryansavino/update-confidential-initrd-22.04
versions: update confidential initrd to 22.04
2025-02-27 09:29:36 -05:00
Ryan Savino
ceafa82f2e tests: skip trusted storage tests for qemu-snp
skip tests for trusted storage until #10838 is resolved.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2025-02-26 14:23:57 -06:00
Fabiano Fidêncio
a6186b6244 ci: k8s: arm: Skip "Check the number vcpus are ..." test
See https://github.com/kata-containers/kata-containers/issues/10928

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-02-24 18:43:24 +01:00
Fabiano Fidêncio
1798804c32 ci: k8s: arm: Skip "Pod quota" test
See https://github.com/kata-containers/kata-containers/issues/10927

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-02-24 18:43:24 +01:00
Fabiano Fidêncio
053827cacc ci: k8s: arm: Skip "Running within memory constraints" test
See https://github.com/kata-containers/kata-containers/issues/10926

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-02-24 18:43:24 +01:00
Aurélien Bombo
1f8c15fa48 Revert "tests: Skip k8s job test on qemu-coco-dev"
This reverts commit a8ccd9a2ac.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-02-21 17:52:17 -06:00
Aurélien Bombo
7542dbffb8 Revert "tests: disable k8s-policy-job.bats on coco-dev"
This reverts commit 47ce5dad9d.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-02-21 17:52:17 -06:00
Aurélien Bombo
601c403603
Merge pull request #10818 from burgerdev/plumbing
agent: clear log pipes if denied by policy
2025-02-19 16:28:58 -06:00
Aurélien Bombo
cb3467535c tests: Add policy test for ReadStreamRequest
This test verifies that, when ReadStreamRequest is blocked by the
policy, the logs are empty and the container does not deadlock.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-02-19 14:03:41 -06:00
Fabiano Fidêncio
64ceb0832a
Merge pull request #10851 from fidencio/topic/bump-image-rs-to-bring-in-ttrpc-0.8.4
agent: Bump image-rs to 514c561d93
2025-02-14 18:21:56 +01:00
stevenhorsman
56fb2a9482 tests: Skip block volume test on fc, stratovirt
The block volume test has failed on 10/10 nightlies
and all the PRs I've seen, so skip it until it can be assessed.

See #10873

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-13 11:50:35 +00:00
stevenhorsman
2d266df846 test: Update expected error in signed image tests
We are seeing a different error in the new version of image-rs,
so update our tests to match.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-13 11:44:51 +00:00
Wainer Moschetta
62e239ceaa
Merge pull request #10810 from arvindskumar99/nydus_perm_install
Skipping SNP and SEV from deploying and deleting Snapshotter
2025-02-12 14:38:56 -03:00
Dan Mihai
fdf3088be0
Merge pull request #10842 from microsoft/danmihai1/disable-job-policy-test
tests: disable k8s-policy-job.bats on coco-dev
2025-02-06 09:09:49 -08:00
Hyounggyu Choi
1bdb34e880 tests: Skip trusted storage tests for IBM SE
Let's skip all tests for trusted storage until #10838 is resolved.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-02-06 12:09:14 +01:00
Dan Mihai
47ce5dad9d tests: disable k8s-policy-job.bats on coco-dev
k8s-policy-job is modeled after the older k8s-job, and it appears
that both of them fail occasionally on coco-dev.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-02-05 23:06:16 +00:00
Arvind Kumar
47534c1c3e nydus: Skipping SNP and SEV from deploying and deleting Snapshotter
Preparing to install nydus permanently on the AMD node,
so disabling deploy and delete command for SNP and SEV.

Signed-off-by: Arvind Kumar <arvinkum@amd.com>
2025-02-05 12:26:53 -06:00
Fabiano Fidêncio
0626d7182a tests: k8s-cpu-ns: Adapt to cgroupsv2
The changes done are:
* cpu/cpu.shares was replaced by cpu.weight
  * The weight, according to our reference[0], is calculated by:
    weight = (1 + ((request - 2) * 9999) / 262142)

* cpu/cpu.cfs_quota_us & cpu/cpu.cfs_period_us were replaced by cpu.max,
  where quota and period are written together (in this order)

[0]: https://github.com/containers/crun/blob/main/crun.1.md#cgroup-v2

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-23 17:25:56 +01:00
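Illustration for 0626d7182a: a sketch of the two conversions listed in the commit message, using integer arithmetic. The function names are hypothetical; `shares` stands for the cgroup v1 cpu.shares value (called "request" in the formula above), and mapping a negative v1 quota to `max` is an assumption based on the cgroup v2 cpu.max file format.

```bash
# Convert a cgroup v1 cpu.shares value to a cgroup v2 cpu.weight value,
# following the formula quoted in the commit message (integer division).
cpu_shares_to_weight() {
    local shares="$1"
    echo $((1 + ((shares - 2) * 9999) / 262142))
}

# Build the content expected in cpu.max: quota first, then period.
# A negative v1 quota (no limit) is written as "max" in cgroup v2.
cpu_max_value() {
    local quota_us="$1"
    local period_us="$2"
    if [ "${quota_us}" -lt 0 ]; then
        quota_us="max"
    fi
    echo "${quota_us} ${period_us}"
}
```

For example, `cpu_shares_to_weight 1024` prints `39`, and `cpu_max_value 50000 100000` prints `50000 100000`.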
Fabiano Fidêncio
4307f0c998 Revert "ci: mariner: Ensure kernel_params can be set"
This reverts commit 091ad2a1b2, in order
to ensure tests would be running with cgroupsv2 on the guest.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-23 17:25:56 +01:00
Fabiano Fidêncio
734ef71cf7 tests: k8s: confidential: Cleanup $HOME/.ssh/known_hosts
I've noticed the following error when running the tests with SEV:
```
2025-01-21T17:10:28.7999896Z # @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
2025-01-21T17:10:28.8000614Z # @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
2025-01-21T17:10:28.8001217Z # @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
2025-01-21T17:10:28.8001857Z # IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
2025-01-21T17:10:28.8003009Z # Someone could be eavesdropping on you right now (man-in-the-middle attack)!
2025-01-21T17:10:28.8003348Z # It is also possible that a host key has just been changed.
2025-01-21T17:10:28.8004422Z # The fingerprint for the ED25519 key sent by the remote host is
2025-01-21T17:10:28.8005019Z # SHA256:x7wF8zI+LLyiwphzmUhqY12lrGY4gs5qNCD81f1Cn1E.
2025-01-21T17:10:28.8005459Z # Please contact your system administrator.
2025-01-21T17:10:28.8006734Z # Add correct host key in /home/kata/.ssh/known_hosts to get rid of this message.
2025-01-21T17:10:28.8007031Z # Offending ED25519 key in /home/kata/.ssh/known_hosts:178
2025-01-21T17:10:28.8007254Z #   remove with:
2025-01-21T17:10:28.8008172Z #   ssh-keygen -f "/home/kata/.ssh/known_hosts" -R "10.244.0.71"
```

And this was causing a failure to ssh into the confidential pod.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-23 12:04:13 +01:00
Fabiano Fidêncio
18137b1583 tests: k8s: confidential: Increase log_buf_len to 4M
Relying on dmesg is really not ideal, as we may lose important info,
mainly those which happen very early in the boot, depending on the size
of kernel ring buffer.

So, for this specific test, let's increase the kernel ring buffer, by
default, to 4M.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-23 12:04:13 +01:00
Aurélien Bombo
0d70dc31c1 ci: Unify on $GH_PR_NUMBER environment variable
While working on #10559, I realized that some parts of the codebase use
$GH_PR_NUMBER, while other parts use $PR_NUMBER.

Notably, in that PR, since I used $GH_PR_NUMBER for CoCo non-TEE tests
without realizing that TEE tests use $PR_NUMBER, the tests on that PR
fail on TEEs:

https://github.com/kata-containers/kata-containers/actions/runs/12818127344/job/35744760351?pr=10559#step:10:45

  ...
  44      error: error parsing STDIN: error converting YAML to JSON: yaml: line 90: mapping values are not allowed in this context
  ...
  135               image: ghcr.io/kata-containers/csi-kata-directvolume:
  ...

So let's unify on $GH_PR_NUMBER so that this issue doesn't repro in the
future: I replaced all instances of PR_NUMBER with GH_PR_NUMBER.

Note that since some test scripts also refer to that variable, the CI
for this PR will fail (would have also happened with the converse
substitution), hence I'm not adding the ok-to-test label and we should
force-merge this after review.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-01-17 10:53:08 -06:00
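Illustration for 0d70dc31c1: the YAML error above comes from an image reference that ends in a bare `:` when the variable is empty. A minimal sketch of a guard that fails early instead, assuming the PR number is used directly as the image tag; the helper name `csi_image_ref` is hypothetical.

```bash
# Hypothetical helper: build the image reference, refusing an empty or
# unset GH_PR_NUMBER rather than emitting a reference that ends in ":".
csi_image_ref() {
    if [ -z "${GH_PR_NUMBER:-}" ]; then
        echo "GH_PR_NUMBER is not set" >&2
        return 1
    fi
    echo "ghcr.io/kata-containers/csi-kata-directvolume:${GH_PR_NUMBER}"
}
```

With `GH_PR_NUMBER=10559` the function prints `ghcr.io/kata-containers/csi-kata-directvolume:10559`; with the variable unset it returns a non-zero status before any manifest is rendered.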
Hyounggyu Choi
f7816e9206 tests: Introduce retry_kubectl_apply() for trusted storage
On s390x, some tests for trusted storage occasionally failed due to:

```bash
etcdserver: request timed out
```

or

```bash
Internal error occurred: resource quota evaluation timed out
```

These timeouts were not observed previously on k3s but occur
sporadically on kubeadm. Importantly, they appear to be temporary
and transient, which means they can be ignored in most cases.

To address this, we introduced a new wrapper function, `retry_kubectl_apply()`,
for `kubectl create`. This function retries applying a given manifest up to 5
times if it fails due to a timeout. However, it will still catch and handle
any other errors during pod creation.

Fixes: #10651

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-01-14 21:15:44 +01:00
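Illustration for f7816e9206: a minimal sketch of the behaviour described above, retrying up to 5 times only when the failure looks like a timeout. This is not the `retry_kubectl_apply()` from the commit; the name `retry_on_timeout`, the `timed out` match and the `RETRY_DELAY` variable are assumptions for this example.

```bash
# Hypothetical wrapper: run a command, retrying up to 5 times when its
# output contains "timed out". Any other failure is returned at once.
# RETRY_DELAY (seconds between attempts, default 1) is an assumed knob.
retry_on_timeout() {
    local max_attempts=5
    local attempt=1
    local output=""
    while [ "${attempt}" -le "${max_attempts}" ]; do
        if output="$("$@" 2>&1)"; then
            printf '%s\n' "${output}"
            return 0
        fi
        case "${output}" in
            *"timed out"*)
                echo "attempt ${attempt} timed out, retrying" >&2
                sleep "${RETRY_DELAY:-1}"
                ;;
            *)
                # Not a timeout: surface the error without retrying.
                printf '%s\n' "${output}" >&2
                return 1
                ;;
        esac
        attempt=$((attempt + 1))
    done
    printf '%s\n' "${output}" >&2
    return 1
}
```

A caller could wrap the creation step as `retry_on_timeout kubectl create -f "${manifest}"`, where `manifest` is a placeholder variable.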
Pradipta Banerjee
36580bb642 tests: Update sealed secret CI value to base64url
The existing encoding was base64 and it fails due to
874948638a

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
2025-01-13 09:37:05 -05:00
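Illustration for 36580bb642: a sketch of how a base64 string differs from its base64url form (RFC 4648): `+` becomes `-` and `/` becomes `_`. Stripping the `=` padding is an assumption here, since the commit message does not say whether the sealed secret format keeps it. The function name is hypothetical.

```bash
# Hypothetical helper: convert a base64 string to base64url by swapping
# the two alphabet characters that differ, then dropping "=" padding
# (the padding removal is an assumption, see above).
b64_to_b64url() {
    printf '%s' "$1" | tr '/+' '_-' | tr -d '='
}
```

For example, `b64_to_b64url '+/8='` prints `-_8`.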
Fabiano Fidêncio
8f8988fcd1
Merge pull request #10714 from fidencio/topic/update-virtiofsd
virtiofsd: Update to its v1.13.0 ( + one patch) release :-)
2025-01-08 17:59:29 +01:00
Fabiano Fidêncio
967d5afb42 Revert "tests: k8s: Skip one of the empty-dir tests"
This reverts commit 9aea7456fb.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-08 14:07:34 +01:00
Fabiano Fidêncio
53ac0f00c5 tests: Re-enable oom tests for mariner
Since we bumped to the 6.12.x LTS kernel, we've also adjusted the
aggressivity of the OOM test, which may be enough to allow us to
re-enable it for mariner.

Fixes: #8821

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-07 18:33:17 +01:00
Fabiano Fidêncio
9aea7456fb tests: k8s: Skip one of the empty-dir tests
An issue has been created for this, and we should fix the issue before
the next release.  However, for now, let's unblock the kernel bump and
have the test skipped.

Reference: https://github.com/kata-containers/kata-containers/issues/10706

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-06 21:48:20 +01:00
Fabiano Fidêncio
44ff602c64 tests: k8s: Be more aggressive to get OOM
Let's increase the amount of bytes allocated per VM worker, so we can
hit the OOM sooner.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-06 21:48:20 +01:00
stevenhorsman
dd02b6699e tests: Fix qemu-coco-dev skip
Fix the logic to make the test skipped on qemu-coco-dev,
rather than the opposite, and update the syntax to make it
clearer, as it was incorrectly written and reviewed by three
different people in its prior form.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-19 19:50:46 +00:00
Ryan Savino
7d45382f54 Revert "ci: Skip the failing tests in SNP"
This reverts commit 2242aee099.
2024-12-10 16:20:31 -06:00
Aurélien Bombo
037281d699
Merge pull request #10593 from microsoft/saulparedes/improve_namespace_validation
policy: improve pod namespace validation
2024-12-09 11:55:09 -06:00
Saul Paredes
84a411dac4 policy: improve pod namespace validation
- Remove default_namespace from settings
- Ensure container namespaces in a pod match each other in case no namespace is specified in the YAML

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-12-04 10:17:54 -08:00
stevenhorsman
a8ccd9a2ac tests: Skip k8s job test on qemu-coco-dev
The test is unstable on this platform, so skip it for now to prevent
the regular known failures covering up other issues. See #10616

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-04 16:00:05 +00:00
Saul Paredes
711d12e5db policy: support optional metadata uid field
This prevents a deserialization error when uid is specified

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-12-02 11:24:58 -08:00
Aurélien Bombo
16a91fccbe
Merge pull request #10561 from sprt/csi-driver-ci
coco: ci: Lay groundwork for compiling and publishing CSI driver image [1/x]
2024-11-27 10:26:45 -06:00
Aurélien Bombo
5e4990bcf5 coco: ci: Add no-op steps to deploy CSI driver
This adds no-op steps that'll be used to deploy and clean up the CSI driver
used for testing.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-21 16:08:06 -06:00
Adithya Krishnan Kannan
2242aee099 ci: Skip the failing tests in SNP
Per [Issue#10549](https://github.com/kata-containers/kata-containers/issues/10549),
the following tests are failing on SNP.
1. k8s-guest-pull-image-encrypted.bats
2. k8s-guest-pull-image-authenticated.bats
3. k8s-guest-pull-image-signature.bats
4. k8s-confidential-attestation.bats

Per @fidencio 's comment on
[PR#10558](https://github.com/kata-containers/kata-containers/pull/10558),
I am skipping the same.

Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
2024-11-19 10:41:43 -06:00
Fabiano Fidêncio
9b1a5f2ac2 tests: Add a way to run only tests which rely on attestation
We're doing this as, at Intel, we have two different kinds of machines we
can plug into our CI.  Without going into much detail, only one of
those two kinds of machines will work for the attestation tests we
perform with ITA, thus in order to speed up the CI and improve test
coverage (OS wise), we're going to run different tests in different
machines.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-14 15:51:57 +01:00
Fabiano Fidêncio
2281342fb8
Merge pull request #10513 from fidencio/topic/ci-adjust-proxy-nightmare-for-tdx
ci: tdx: kbs: Ensure https_proxy is taken in consideration
2024-11-10 00:17:10 +01:00
Saul Paredes
461efc0dd5 tests: remove manifest v1 test
This test was meant to show support for pulling images with v1 manifest schema versions.

The nginxhttps image has been modified in https://hub.docker.com/r/ymqytw/nginxhttps/tags such that we are no longer able to pull it:

$ docker pull ymqytw/nginxhttps:1.5
Error response from daemon: missing signature key

We may remove this test since schema version 1 manifests are deprecated per
https://docs.docker.com/engine/deprecated/#pushing-and-pulling-with-image-manifest-v2-schema-1 :
"These legacy formats should no longer be used, and users are recommended to update images to use current formats, or to upgrade to more
current images". This schema version was used by old docker versions. Furthermore, the OCI spec
https://github.com/opencontainers/image-spec/blob/main/manifest.md#image-manifest-property-descriptions only supports schema version 2.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-11-08 13:38:51 -08:00
Fabiano Fidêncio
baf88bb72d ci: tdx: kbs: Ensure https_proxy is taken in consideration
Trustee's deployment must set the correct https_proxy as env var on the
container that will talk to the ITA / ITTS server, otherwise the kbs
service won't be able to start, which then causes issues in our CI.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: Krzysztof Sandowicz <krzysztof.sandowicz@intel.com>
2024-11-08 16:06:16 +01:00
Steve Horsman
1f728eb906
Merge pull request #10498 from stevenhorsman/update-create-container-timeout-log
tests: k8s: Update image pull timeout error
2024-11-08 10:47:39 +00:00
Fabiano Fidêncio
71c4c2a514
Merge pull request #10486 from kata-containers/topic/enable-AUTO_GENERATE_POLICY-for-qemu-coco-dev
workflows: Use AUTO_GENERATE_POLICY for qemu-coco-dev
2024-11-06 21:04:45 +01:00
stevenhorsman
85554257f8 tests: k8s: Update image pull timeout error
Currently the error we are checking for is
`CreateContainerRequest timed out`, but this message
doesn't always seem to be printed to our pod log.
Try using a more general message that should be present
more reliably.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-06 17:00:26 +00:00
Julien Ropé
da5e0c3f53 ci: skip nginx connectivity test with crio
We have an error with service name resolution with this test when using crio.
This error could not be reproduced outside of the CI for now.
Skipping it to keep the CI job running until we find a solution.

See: #10414

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-11-06 12:07:02 +01:00
Julien Ropé
6d0cb1e9a8 ci: export CONTAINER_RUNTIME to the test scripts
This variable will allow tests to adapt their behaviour to the runtime (containerd/crio).

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-11-06 11:29:11 +01:00
Fabiano Fidêncio
72979d7f30 workflows: Use AUTO_GENERATE_POLICY for qemu-coco-dev
Once we also test it with qemu-coco-dev, it becomes
easier for a developer without access to a TEE to test it locally as well.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-06 10:47:08 +01:00
stevenhorsman
175ebfec7c Revert "k8s:kbs: Add trap statement to clean up tmp files"
This reverts commit 973b8a1d8f.

As @danmihai1 points out https://github.com/bats-core/bats-core/issues/364
states that using traps in bats is error prone, so this could be the cause
of the confidential test instability we've been seeing, like it was
in the static checks, so let's try and revert this.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-04 09:59:37 +00:00
stevenhorsman
75cb1f46b8 tests/k8s: Add skip if setup_common fails
At @danmihai1's suggestion, add a die message in case
the call to setup_common fails, so we can see it in the test
output.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-04 09:59:33 +00:00
stevenhorsman
3f5bf9828b tests: k8s: Update bats
We've seen some issues with tests not being run in
some of the Coco CI jobs (Issue #10451) and in the
environments that are more stable we noticed that
they had a newer version of bats installed.

Try updating the version to 1.10+ and print out
the version for debug purposes

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-04 09:59:33 +00:00
Hyounggyu Choi
238f67005f tests: Add kubeadm option for KUBERNETES in gha-run.sh
When creating a k8s cluster via kubeadm, the devmapper setup
for containerd requires a different configuration.
This commit introduces a new `kubeadm` option for the KUBERNETES
variable and adjusts the path to the containerd config file for
devmapper setup.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-10-29 14:19:42 +01:00
Fabiano Fidêncio
b70d7c1aac
tests: Enable measured rootfs tests for qemu-coco-dev
Then it's on par with what's being tested with TEEs using a rootfs
image.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:54 +01:00
Fabiano Fidêncio
7d202fc173
tests: Re-enable measured_rootfs test for TDX
As we're now building everything needed to test TDX with measured rootfs
support, let's bring this test back in (for TDX only, at least for now).

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
alex.lyn
b25538f670 ci: Introduce CI to validate pod hostname
Fixes #10422

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-10-21 16:32:56 +01:00