The k8s test suite halts on the first failure, i.e., failing-fast. This
isn't the behavior that we used to see when running tests on Jenkins and it
seems that running the entire test suite is still the most productive way. So
this disable fail-fast by default.
However, if you still wish to run on fail-fast mode then just export
K8S_TEST_FAIL_FAST=yes in your environment.
Fixes: #9697
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
This test has failed in confidential runtime jobs. Skip it
until we don't have a fix.
Fixes: #9663
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
One of our machines is running CentOS 9 Stream, and we could easily
verify that we can build and install the kbs client there, thus we're
expanding the installation script to also support CentOS 9 Stream.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
`externals.coco-kbs.toolchain` is not defined, get the rust_version from
`externals.coco-trustee.toolchain` instead.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR enables the installation and unistallation of the kbs client
as well as general coco components needed for the TDX GHA CI.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds the use of custom Intel DCAP configuration when
deploying the KBS.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This test doesn't fail with the guest image pulling, but it for sure
should. :-)
We can see in the bats logs, something like:
```
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 31s default-scheduler Successfully assigned kata-containers-k8s-tests/liveness-exec to 984fee00bd70.jf.intel.com
Normal Pulled 23s kubelet Successfully pulled image "quay.io/prometheus/busybox:latest" in 345ms (345ms including waiting)
Normal Started 21s kubelet Started container liveness
Warning Unhealthy 7s (x3 over 13s) kubelet Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
Normal Killing 7s kubelet Container liveness failed liveness probe, will be restarted
Normal Pulled 7s kubelet Successfully pulled image "quay.io/prometheus/busybox:latest" in 389ms (389ms including waiting)
Warning Failed 5s kubelet Error: failed to create containerd task: failed to create shim task: the file /bin/sh was not found: unknown
Normal Pulling 5s (x3 over 23s) kubelet Pulling image "quay.io/prometheus/busybox:latest"
Normal Pulled 4s kubelet Successfully pulled image "quay.io/prometheus/busybox:latest" in 342ms (342ms including waiting)
Normal Created 4s (x3 over 23s) kubelet Created container liveness
Warning Failed 3s kubelet Error: failed to create containerd task: failed to create shim task: failed to mount /run/kata-containers/f0ec86fb156a578964007f7773a3ccbdaf60023106634fe030f039e2e154cd11/rootfs to /run/kata-containers/liveness/rootfs, with error: ENOENT: No such file or directory: unknown
Warning BackOff 1s (x3 over 3s) kubelet Back-off restarting failed container liveness in pod liveness-exec_kata-containers-k8s-tests(b1a980bf-a5b3-479d-97c2-ebdb45773eff)
```
Let's skip it for now as we have an issue opened to track it down:
https://github.com/kata-containers/kata-containers/issues/9665
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Similarly to firecracker, which doesn't have support for virtio-fs /
virtio-9p, TDX used with `shared_fs=none` will face the very same
limitations.
The tests affected are:
* k8s-credentials-secrets.bats
* k8s-file-volume.bats
* k8s-inotify.bats
* k8s-nested-configmap-secret.bats
* k8s-projected-volume.bats
* k8s-volume.bats
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make sure the logs will print the correct annotation and its
value, instead of always mentioning "kernel" and "initrd".
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're doing that as all tests are going to be running with
`shared_fs=none`, meaning that we don't need any specific test for this
case anymore.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function will set the needed annotation for enforcing that the
image pull will be handled by the snapshotter set for the runtime
handler, instead of using the default one.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Given that we think the containerd -> snapshotter image cache
problems have been resolved by bumping to nydus-snapshotter v0.3.13
we can try removing the skips to test this out
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- We previously have an expectation for the pause rootfs
to be pull on the host when we did a guest pull. We weren't
really clear why, but it is plausible related to the issues we had
with containerd and nydus caching. Now that is fixed we can begin
to address this with setting shared_fs=none, but let's start with
updating the rootfs host check to be not higher than expected
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
This test is called from `tests/integration/run_kuberentes_tests.sh`,
which already ensures that yq is installed.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
On 1423420, I've mistakenly disabled the tests entirely, for both
non-TEEs and TEEs.
This happened as I didn't realise that `confidential_setup` would take
non-TEEs into consideration. :-/
Now, let me follow-up on that and make sure that the tests will be
running on non-TEEs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function is a helper to check whether the KATA_HYPERVISOR being
used is a confidential hardware (TEE) or not, and we can use it to
skip or only run tests on those platforms when needed.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's rename it to `is_confidential_runtime_class`, and adapt all the
places where it's called.
The new name provides a better description, leading to a better
understanding of what the function really does.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's skip those tests on TEEs as we've been facing a reasonable amount
of issues, most likely on the containerd side, related to pulling the
image on the guest.
Once we're able to fix the issues on containerd, we can get back and
re-enable those by reverting this commit.
The decision of disabling the tests for TEEs is because the machines may
end up in a state where human intervention is necessary to get them back
to a functional state, and that's really not optimal for our CI.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR adds a k8s negative policy test to the confidential attestation
bats test.
Fixes#9437
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
The CH v39 upgrade in #9575 is currently blocked because of a bug in the
Mariner host kernel. To address this, we temporarily tweak the Mariner
CI to use an Ubuntu host and the Kata guest kernel, while retaining the
Mariner initrd. This is tracked in #9594.
Importantly, this allows us to preserve CI for genpolicy. We had to
tweak the default rules.rego however, as the OCI version is now
different in the Ubuntu host. This is tracked in #9593.
This change has been tested together with CH v39 in #9588.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
With the addition of the 'qemu-coco-dev' runtimeClass we no longer need
to run CoCo tests on non-TEE environments with 'qemu'. As a result the
tests also no longer need to set the "io.katacontainers.config.hypervisor.image"
annotation to pods.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Avoid auto-generating Policy on platforms that haven't been tested
yet with auto-generated Policy.
Support for auto-generated Policy on these additional platforms is
coming up in future PRs, so the tests being fixed here were
prematurely enabled.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Remove k8s-policy-set-keys.bats in preparation for using the regorus
crate instead of the OPA daemon for evaluating the Agent Policy. This
test depended on sending HTTP requests to OPA.
Fixes: #9388
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Auto-generate the policy and then simulate attacks from the K8s
control plane by modifying the test yaml files. The policy then
detects and blocks those changes.
These test cases are using K8s Pods. Additional policy failures
are injected during CI using other types of K8s resources - e.g.,
using Jobs and Replication Controllers - from separate PRs.
Fixes: #9491
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Auto-generate the policy and then simulate attacks from the K8s
control plane by modifying the test yaml files. The policy then
detects and blocks those changes.
These test cases are using K8s Replication Controllers. Additional
policy failures will be injected using other types of K8s resources
- e.g., using Pods and/or Jobs - in separate PRs.
Fixes: #9463
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
When running on non-TEE environments (e.g. KATA_HYPERVISOR=qemu) the tests should
be stressing the CoCo image (/opt/kata/share/kata-containers/kata-containers-confidential.img)
although currently the default image/initrd is built to be able to do guest-pull as well.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Enabled guest-pull tests on non-TEE environment. It know requires the SNAPSHOTTER environment
variable to avoid it running on jobs where nydus-snapshotter is not installed
Fixes: #9410
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>