_print_instance_type() returns the instance type of the AKS nodes, based
on the host type. Tests are grouped per host type in "small" and "normal"
sets based on the CPU requirements: "small" tests require few CPUs and
"normal" more.
There is an 3rd case: "all" host type maps to the union of "small"
and "normal" tests, which should be handled by _print_instance_type()
properly. In this case, it should return the largest instance type
possible because "normal" tests will be executed too.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
It's used an AKS managed ingress controller which keeps two nginx pod
replicas where both request 500m of CPU. On small VMs like we've used on
CI for running the CoCo non-TEE tests, it left only a few amount of CPU
for the tests. Actually, one of these pod replicas won't even get
started. So let's patch the ingress controller to have only one replica
of nginx.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
The Azure AKS addon-http-application-routing add-on is deprecated and
cannot be enabled on new clusters which has caused some CI jobs to fail.
Migrated our code to use approuting instead. Unlike
addon-http-application-routing, this add-on doesn't
configure a managed cluster DNS zone, but the created ingress has a
public IP. To avoid having to deal with DNS setup, we will be using that
address from now on. Thus, some functions no longer used are deleted.
Fixes#11156
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
The AKS CLI recently introduced a regression that prevents using
aks-preview extensions (Azure/azure-cli#31345), and hence create
CI clusters.
To address this, we temporarily hardcode the last known good version of
aks-preview.
Note that I removed the comment about this being a Mariner requirement,
as aks-preview is also a requirement of AKS App Routing, which will
be introduced soon in #11164.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
genpolicy is sending more HTTPS requests than other components during
CI so it's more likely to be affected by transient network errors
similar to:
ConnectError(
"dns error",
Custom {
kind: Uncategorized,
error: "failed to lookup address information: Try again",
},
)
Note that genpolicy is not the only component hitting network errors
during CI. Recent example from a different component:
"Message: failed to create containerd task: failed to create shim task:
failed to async pull blob stream HTTP status server error (502 Bad Gateway)"
This CI change might help just with the genpolicy errors.
Fixes: #11182
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
If the correct version of go is already installed then
install_go.sh runs `exit`. When calling this as source from
cri-containerd/gha-run.sh it means all dependencies after
are skipped, so remove this.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Bump golang version to the latest minor 1.23.x release
now that 1.24 has been released and 1.22.x is no longer
stable and receiving security fixes
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
This test we will test initdata in the following logic
1. Enable image signature verification via kernel commandline
2. Set Trustee address via initdata
3. Pull an image from a banned registry
4. Check if the pulling fails with log `image security validation
failed` the initdata works.
Note that if initdata does not work, the pod still fails to launch. But
the error information is `[CDH] [ERROR]: Get Resource failed` which
internally means that the KBS URL has not been set correctly.
This test now only runs on qemu-coco-dev+x86_64 and qemu-tdx
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Add how-to-use-memory-agent.md (How to use mem-agent to decrease the
memory usage of Kata container) to docs to show how to use mem-agent.
Fixes: #11013
Signed-off-by: Hui Zhu <teawater@gmail.com>
We get the following error while writing containerd config
if a base dir `/etc/containerd` does not exist like:
```
sudo tee /etc/containerd/config.toml << EOF
...
EOF
tee: /etc/containerd/config.toml: No such file or directory
```
The commit makes sure a base directory for containerd before
writing config and drops the config file deletion because a
default behaviour of `tee` is overwriting.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
The test `Cannot get CDH resource when deny-all policy is set`
completes with a KBS policy set to deny-all. This affects the
future TEE test (e.g. k8s-sealed-secrets.bats) which makes a
request against KBS.
This commit introduces kbs_set_default_policy() and puts it to
the setup() in k8s-sealed-secrets.bats.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
With this we switch to fully testing with helm, instead of testimg with
the kustomizations (which will soon be removed).
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Let's use what we have in the k8s functional tests to create a common
function to deploy kata containers using our helm charts. This will
help us immensely in the kata-deploy testing side in the near future.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
This is not strictly needed, but it does help a lot when setting up a
cluster manually, while still relying on those scripts.
While here, let's also ensure the assignment is between quotes, to make
shellchecker happier.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
More straightforward implementation of hard_coded_policy_tests_enabled,
that avoids ShellCheck warning:
warning: Remove quotes from right-hand side of =~ to match as a regex rather than literally. [SC2076]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Fix unintended use of caller's variable. Use the corresponding function
parameter instead. ShellCheck:
warning: policy_settings_dir is referenced but not assigned. [SC2154]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Avoid masking command return values by declaring and only then assigning.
ShellCheck:
warning: Declare and assign separately to avoid masking return values. [SC2155]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Pick the the values exported by other scripts. ShellCheck:
warning: AUTO_GENERATE_POLICY is referenced but not assigned. [SC2154]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
ShellCheck:
warning: This assignment is only seen by the forked process. [SC2097]
warning: This expansion will not see the mentioned assignment. [SC2098]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
ShellCheck: add braces around variable references:
note: Prefer putting braces around variable references even when not strictly required. [SC2250]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
ShellCheck: export variables used outside of tests_common.sh - e.g.,
warning: timeout appears unused. Verify use (or export if used externally). [SC2034]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Replace [ ] with [[ ]] as advised by shellcheck:
note: Prefer [[ ]] over [ ] for tests in Bash/Ksh. [SC2292]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Same rationale as for runtime. With tests, the blackfriday replacement was
actually meaningful, so I refactored some imports.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Retry "kubectl exec" a few times if it unexpectedly produced an empty
output string.
This is an attempt to work around test failures similar to:
https://github.com/kata-containers/kata-containers/actions/runs/13840930994/job/38730153687?pr=10983
not ok 1 Environment variables
(from function `grep_pod_exec_output' in file tests_common.sh, line 394,
in test file k8s-env.bats, line 36)
`grep_pod_exec_output "${pod_name}" "HOST_IP=\([0-9]\+\(\.\|$\)\)\{4\}" "${exec_command[@]}"' failed
That test obtained correct ouput from "sh -c printenv" one time, but the
second execution of the same command returned an empty output string.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
In ef0e8669fb we
had been seeing some significantly lower minvalues in
the jitter.Result test, so I lowered the mid-value rather
than having a very high minpercent, but it appears that the
variability of this result is very high, so we are still getting
the occasional high value, so reset the midval and just
have a bigger ranges on both sides, to try and keep the test
stable.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The kubectl wait has a built in timeout of 30s, so
wrapping it in waitForProcess, means we have
180/2 * 30 delay, which is much longer than intended,
so just set the timeout directly.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Test using the host path /tmp/k8s-policy-pod-test instead of
/var/lib/kubelet/pods.
/var/lib/kubelet/pods might happen to contain files that CopyFileRequest
would try to send to the Guest before CreateContainerRequest. Such
CopyFileRequest was an unintended side effect of this test.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Log the "kubectl exec" ouput, just in case it helps investigate sporadic
test errors like:
https://github.com/kata-containers/kata-containers/actions/runs/13724022494/job/38387329321?pr=10973
not ok 1 Environment variables
(in test file k8s-env.bats, line 37)
`grep "HOST_IP=\([0-9]\+\(\.\|$\)\)\{4\}"' failed
It appears that the first exec from this test case produced the expected
output:
MY_POD_NAME=test-env
but the second exec produced something else - that will be logged after
this change.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Log the "kubectl exec" ouput, just in case it helps investigate sporadic
test errors like:
https://github.com/kata-containers/kata-containers/actions/runs/13724022494/job/38387329268?pr=10973
not ok 1 ConfigMap for a pod
(in test file k8s-configmap.bats, line 44)
`kubectl exec $pod_name -- "${exec_command[@]}" | grep "KUBE_CONFIG_2=value-2"' failed
It appears that the first exec from this test case produced the expected
output:
KUBE_CONFIG_1=value-1
but the second exec produced something else - that will be logged after
this change.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
grep_pod_exec_output invokes "kubectl exec", logs its output, and checks
that a grep pattern is present in the output.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
./tests/git-helper.sh:20:5: note: Prefer [[ ]] over [ ] for tests in Bash/Ksh. [SC2292]
./tests/git-helper.sh:22:26: note: Double quote to prevent globbing and word splitting. [SC2086]
./tests/git-helper.sh:23:7: note: Prefer [[ ]] over [ ] for tests in Bash/Ksh. [SC2292]
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
As we'll touch this file during this series, let's already make sure we
solve all the needed warnings.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
He were fixing the few warnings we found in the files present in the
functional tests for kata-deploy.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
As we're testing against the LTS and the Active versions of
containers, let's upgrade the lts version from 1.6 to 1.7 and
active version from 1.7 to 2.0 to cover the sandboxapi tests.
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
When `KATA_HYPERVISOR` is set to `qemu-se-runtime-rs`,
a configuration file is properly referenced and a runtime class
should be created via kata-deploy.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Now we've added the double quotes around
`${K8S_TEST_UNION[@]}`, so platforms are
failing with:
```
Error: Test file "/home/ubuntu/runner/_layout/_work/kata-containers/kata-containers/tests/integration/kubernetes/k8s-nginx-connectivity.bats
" does not exist
```
due to the line continuation, so sanitise the value
to try and fix this.
Co-authored-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
This was checking that a literal string was non-zero.
I'm assume it instead wanted to check if the file exists
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
> In functions, use return instead of break.
> rationale: break or continue are used to abort or
continue a loop, and are not the right way to exit
a function. Use return instead.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
> Argument mixes string and array. Use * or separate argument.
- Swap echos for printfs and improve formatting
- Replace $@ with $*
- Split arrays into separate arguments
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
On qemu the run seems to error after ~4-7 runs, so try
a cut down version of repetitions to see if this helps us
get results in a stable way.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
We have a new metrics machine and environment
and the iperf jitter result failed as it finished too quickly,
so increase the minpercent to try and get it stable
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
We have a new metrics machine and environment
and the fio write.bw and iperf3 parallel.Results
tests failed for clh, as below
the minimum range, so increase the
minpercent to try and get it stable
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
We have a new metrics machine and environment
and the boot time test failed for clh, so increase the
maxpercent to try and get it stable
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The iperf deployment is quite a lot out of date
and uses `master` for it's affinity and toleration,
so update this to control-plane, so it can run on
newer Kubernetes clusters
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The new metrics runner seems slower, so we are
seeing errors like:
The iperf3 tests are failing with:
```
pod rejected: RuntimeClass "kata" not found
```
so give more time for it to succeed
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Move `kill_kata_components` from common.bash
into the metrics code base as the only user of it
- Increase the timeout on the start of containerd as
the last 10 nightlies metric tests have failed with:
```
223478 Killed sudo timeout -s SIGKILL "${TIMEOUT}" systemctl start containerd
```
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- As the metrics tests are largely independent
then allow subsequent tests to run even if previous
ones failed. The results might not be perfect if
clean-up is required, but we can work on that later.
- Move the test results check out of the latency
test that seems arbitrary and into it's own job step
- Add timeouts to steps that might fail/hang if there
are containerd/K8s issues
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Currently the run-metrics job runs a manual install
and does this in a separate job before the metrics
tests run. This doesn't make sense as if we have multiple
CI runs in parallel (like we often do), there is a high chance
that the setup for another PR runs between the metrics
setup and the runs, meaning it's not testing the correct
version of code. We want to remove this from happening,
so install (and delete to cleanup) kata as part of the metrics
test jobs.
Also switch to kata-deploy rather than manual install for
simplicity and in order to test what we recommend to users.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Update the code to install the version of k0s
that we have in our versions.yaml, rather than
just installing the latest, to help our CI being
less stable and prone to breaking due to things
we don't control.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Architectures here with `musl` available are minority, which is more
suitable for enumeration.
With this change, we are implicitly choosing gnu target for `ppc64le`,
`riscv64` and `s390x`.
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
While installing Rust and Golang in our CI workflow, `arch_to_golang`
and `arch_to_rust` are needed for inferring the correct arch string for
riscv64 architecture.
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
This test verifies that, when ReadStreamRequest is blocked by the
policy, the logs are empty and the container does not deadlock.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
On Ubuntu 24.04, due to the /usr merge, system-provided unit files
now reside in `/usr/lib/systemd/system/` instead of `/lib/systemd/system/`.
For example, the command below now returns a different path:
```
$ systemctl show containerd.service -p FragmentPath
/usr/lib/systemd/system/containerd.service
```
Previously, on Ubuntu 22.04 and earlier, it returned:
```
/lib/systemd/system/containerd.service
```
The current pattern `if [[ $unit_file == /lib* ]]` fails to match the new path.
To ensure compatibility across versions, we update the pattern to match both
`/lib` and `/usr/lib` like:
```
if [[ $unit_file =~ ^/(usr/)?lib/ ]]
```
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Add entries for dbs_* crates' README.md to pass `kata-spell-check.sh`
spell checking.
Changed British terms to American terms in README of `dbs_pci` to pass
`hunspell` check.
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
We are running `header_check` for non-text files like binary files,
symbolic link files, image files (pictures) and etc., which does not
make sense.
Filter out non-text files and run `header_check` only for text files
changed.
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
`kata-dictionary.dic` changes after running `kata-spell-check.sh
make-dict`. This is due to someone forgot to first update entries in
data and run `make-dict`, but directly updated `kata-dictionary.dic`
instead.
Add mssing entries to data and re-run `make-dict` to generate correct
`kata-dictionary.dic`.
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
The block volume test has failed on 10/10 nightlies
and all the PRs I've seen, so skip it until it can be assessed.
See #10873
Signed-off-by: stevenhorsman <steven@uk.ibm.com>