We are seeing failures in this test, where the output of
the kubectl exec command seems to be blank, so try
retrying the exec like #11024Fixes: #11133
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
For help with debugging add, logging of the KBS,
like the container system logs if the confidential test fails
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Nydus+guest_pull has specific behavior where it improperly handles image layers on
the host, causing the CRI to not find /etc/passwd and /etc/group files
on container images which have them. The unfortunately causes different
outcomes w.r.t. GID used which we are trying to enforce with policy.
This behavior is observed/explained in https://github.com/kata-containers/kata-containers/issues/11162
Handle this exception with a config.settings.cluster_config.guest_pull
field. When this is true, simply ignore the /etc/* files in the
container image as they will not be parsed by the CRI.
Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
Introduce tests to check for policy correctness on a redis deployment
with 1. a pod-level securityContext 2. a container-level securityContext
which shadows the pod-level securityContext 3. a pod-level
securityContext which selects an existing user (nobody), causing a new GID to be selected.
Redis is an interesting container image to test with because it includes
a /etc/passwd file with existing user/group configuration of 1000:1000 baked in.
Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
It's used an AKS managed ingress controller which keeps two nginx pod
replicas where both request 500m of CPU. On small VMs like we've used on
CI for running the CoCo non-TEE tests, it left only a few amount of CPU
for the tests. Actually, one of these pod replicas won't even get
started. So let's patch the ingress controller to have only one replica
of nginx.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
The Azure AKS addon-http-application-routing add-on is deprecated and
cannot be enabled on new clusters which has caused some CI jobs to fail.
Migrated our code to use approuting instead. Unlike
addon-http-application-routing, this add-on doesn't
configure a managed cluster DNS zone, but the created ingress has a
public IP. To avoid having to deal with DNS setup, we will be using that
address from now on. Thus, some functions no longer used are deleted.
Fixes#11156
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
genpolicy is sending more HTTPS requests than other components during
CI so it's more likely to be affected by transient network errors
similar to:
ConnectError(
"dns error",
Custom {
kind: Uncategorized,
error: "failed to lookup address information: Try again",
},
)
Note that genpolicy is not the only component hitting network errors
during CI. Recent example from a different component:
"Message: failed to create containerd task: failed to create shim task:
failed to async pull blob stream HTTP status server error (502 Bad Gateway)"
This CI change might help just with the genpolicy errors.
Fixes: #11182
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
If the correct version of go is already installed then
install_go.sh runs `exit`. When calling this as source from
cri-containerd/gha-run.sh it means all dependencies after
are skipped, so remove this.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
This test we will test initdata in the following logic
1. Enable image signature verification via kernel commandline
2. Set Trustee address via initdata
3. Pull an image from a banned registry
4. Check if the pulling fails with log `image security validation
failed` the initdata works.
Note that if initdata does not work, the pod still fails to launch. But
the error information is `[CDH] [ERROR]: Get Resource failed` which
internally means that the KBS URL has not been set correctly.
This test now only runs on qemu-coco-dev+x86_64 and qemu-tdx
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
The test `Cannot get CDH resource when deny-all policy is set`
completes with a KBS policy set to deny-all. This affects the
future TEE test (e.g. k8s-sealed-secrets.bats) which makes a
request against KBS.
This commit introduces kbs_set_default_policy() and puts it to
the setup() in k8s-sealed-secrets.bats.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Let's use what we have in the k8s functional tests to create a common
function to deploy kata containers using our helm charts. This will
help us immensely in the kata-deploy testing side in the near future.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
This is not strictly needed, but it does help a lot when setting up a
cluster manually, while still relying on those scripts.
While here, let's also ensure the assignment is between quotes, to make
shellchecker happier.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
More straightforward implementation of hard_coded_policy_tests_enabled,
that avoids ShellCheck warning:
warning: Remove quotes from right-hand side of =~ to match as a regex rather than literally. [SC2076]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Fix unintended use of caller's variable. Use the corresponding function
parameter instead. ShellCheck:
warning: policy_settings_dir is referenced but not assigned. [SC2154]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Avoid masking command return values by declaring and only then assigning.
ShellCheck:
warning: Declare and assign separately to avoid masking return values. [SC2155]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Pick the the values exported by other scripts. ShellCheck:
warning: AUTO_GENERATE_POLICY is referenced but not assigned. [SC2154]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
ShellCheck:
warning: This assignment is only seen by the forked process. [SC2097]
warning: This expansion will not see the mentioned assignment. [SC2098]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
ShellCheck: add braces around variable references:
note: Prefer putting braces around variable references even when not strictly required. [SC2250]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
ShellCheck: export variables used outside of tests_common.sh - e.g.,
warning: timeout appears unused. Verify use (or export if used externally). [SC2034]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Replace [ ] with [[ ]] as advised by shellcheck:
note: Prefer [[ ]] over [ ] for tests in Bash/Ksh. [SC2292]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Retry "kubectl exec" a few times if it unexpectedly produced an empty
output string.
This is an attempt to work around test failures similar to:
https://github.com/kata-containers/kata-containers/actions/runs/13840930994/job/38730153687?pr=10983
not ok 1 Environment variables
(from function `grep_pod_exec_output' in file tests_common.sh, line 394,
in test file k8s-env.bats, line 36)
`grep_pod_exec_output "${pod_name}" "HOST_IP=\([0-9]\+\(\.\|$\)\)\{4\}" "${exec_command[@]}"' failed
That test obtained correct ouput from "sh -c printenv" one time, but the
second execution of the same command returned an empty output string.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Test using the host path /tmp/k8s-policy-pod-test instead of
/var/lib/kubelet/pods.
/var/lib/kubelet/pods might happen to contain files that CopyFileRequest
would try to send to the Guest before CreateContainerRequest. Such
CopyFileRequest was an unintended side effect of this test.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Log the "kubectl exec" ouput, just in case it helps investigate sporadic
test errors like:
https://github.com/kata-containers/kata-containers/actions/runs/13724022494/job/38387329321?pr=10973
not ok 1 Environment variables
(in test file k8s-env.bats, line 37)
`grep "HOST_IP=\([0-9]\+\(\.\|$\)\)\{4\}"' failed
It appears that the first exec from this test case produced the expected
output:
MY_POD_NAME=test-env
but the second exec produced something else - that will be logged after
this change.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Log the "kubectl exec" ouput, just in case it helps investigate sporadic
test errors like:
https://github.com/kata-containers/kata-containers/actions/runs/13724022494/job/38387329268?pr=10973
not ok 1 ConfigMap for a pod
(in test file k8s-configmap.bats, line 44)
`kubectl exec $pod_name -- "${exec_command[@]}" | grep "KUBE_CONFIG_2=value-2"' failed
It appears that the first exec from this test case produced the expected
output:
KUBE_CONFIG_1=value-1
but the second exec produced something else - that will be logged after
this change.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
grep_pod_exec_output invokes "kubectl exec", logs its output, and checks
that a grep pattern is present in the output.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
As we'll touch this file during this series, let's already make sure we
solve all the needed warnings.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
As we're testing against the LTS and the Active versions of
containers, let's upgrade the lts version from 1.6 to 1.7 and
active version from 1.7 to 2.0 to cover the sandboxapi tests.
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
Now we've added the double quotes around
`${K8S_TEST_UNION[@]}`, so platforms are
failing with:
```
Error: Test file "/home/ubuntu/runner/_layout/_work/kata-containers/kata-containers/tests/integration/kubernetes/k8s-nginx-connectivity.bats
" does not exist
```
due to the line continuation, so sanitise the value
to try and fix this.
Co-authored-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
> In functions, use return instead of break.
> rationale: break or continue are used to abort or
continue a loop, and are not the right way to exit
a function. Use return instead.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Currently the run-metrics job runs a manual install
and does this in a separate job before the metrics
tests run. This doesn't make sense as if we have multiple
CI runs in parallel (like we often do), there is a high chance
that the setup for another PR runs between the metrics
setup and the runs, meaning it's not testing the correct
version of code. We want to remove this from happening,
so install (and delete to cleanup) kata as part of the metrics
test jobs.
Also switch to kata-deploy rather than manual install for
simplicity and in order to test what we recommend to users.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>