Commit Graph

674 Commits

Author SHA1 Message Date
Hyounggyu Choi
2c2941122c tests: Fail fast in assert_pod_fail()
`assert_pod_fail()` currently calls `k8s_create_pod()` to ensure that a pod
does not become ready within the default 120s. However, this delays the test's
completion even if an error message is detected earlier in the journal.

This commit removes the use of `k8s_create_pod()` and modifies `assert_pod_fail()`
to fail as soon as the pod enters a failed state.

All failing pods end up in one of the following states:

- CrashLoopBackOff
- ImagePullBackOff

The function now polls the pod's state every 5 seconds to check for these conditions.
If the pod enters a failed state, the function immediately returns 0. If the pod
does not reach a failed state within 120 seconds, it returns 1.
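
The polling described above can be sketched roughly as follows. This is not the actual `assert_pod_fail()` implementation: `wait_pod_failed` and `pod_waiting_reason` are hypothetical names, and the jsonpath query assumes a single-container pod.

```shell
# Hypothetical helper: prints the waiting reason of the pod's first container.
# The jsonpath below assumes a single-container pod.
pod_waiting_reason() {
    kubectl get pod "$1" \
        -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}' 2>/dev/null
}

# Returns 0 as soon as the pod reports a failed state, 1 on timeout.
wait_pod_failed() {
    local pod_name="$1"
    local timeout="${2:-120}"   # seconds, matching the 120s mentioned above
    local interval="${3:-5}"    # seconds between polls (must be > 0)
    local elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        case "$(pod_waiting_reason "$pod_name")" in
            CrashLoopBackOff|ImagePullBackOff) return 0 ;;
        esac
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 1
}
```

A caller could then write `wait_pod_failed "$pod_name" || fail "pod did not fail in time"`, so the test finishes as soon as the failure is visible rather than after the full timeout.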

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-24 16:09:20 +02:00
Hyounggyu Choi
2d6ac3d85d tests: Re-enable guest-pull-image tests for qemu-coco-dev
Now that the issue with handling loop devices has been resolved,
this commit re-enables the guest-pull-image tests for `qemu-coco-dev`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
Hyounggyu Choi
c6b86e88e4 tests: Increase timeouts for qemu-coco-dev in trusted image storage tests
Timeouts occur (e.g. `create_container_timeout` and `wait_time`)
when using qemu-coco-dev.
This commit increases these timeouts for the trusted image storage
test cases.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
Hyounggyu Choi
9cff9271bc tests: Run all commands in *_loop_device() using exec_host()
If the host running the tests is different from the host where the cluster is running,
the *_loop_device() functions do not work as expected because the device is created
on the test host, while the cluster expects the device to be local.

This commit ensures that all commands for the relevant functions are executed via exec_host()
so that the device is handled on a cluster node.

Additionally, it modifies exec_host() to return the exit code of the last executed command
because the existing logic with `kubectl debug` sometimes includes unexpected characters
that are difficult to handle. `kubectl exec` appears to properly return the exit code of
the command passed to it.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
Hyounggyu Choi
374b8d2534 tests: Create and delete node debugger pod only once
Creating and deleting a node debugger pod for every `exec_host()`
call is inefficient.
This commit changes the test suite to create and delete the pod
only once, globally.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
Hyounggyu Choi
aedf14b244 tests: Mimic node debugger with full privileges
This commit addresses an issue with handling loop devices
via a node debugger due to restricted privileges.
It runs a pod with full privileges, allowing it to mount
the host root to `/host`, similar to the node debugger.
This change enables us to run tests for trusted image storage
using the `qemu-coco-dev` runtime class.

Fixes: #10133

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
Fabiano Fidêncio
593cbb8710 Merge pull request #10306 from microsoft/danmihai1/more-security-contexts
genpolicy: get UID from PodSecurityContext
2024-09-18 21:33:39 +02:00
stevenhorsman
aa9f21bd19 test: Add support for s390x in cosign testing
We've added s390x test container images, so add support
for using them based on the arch the test is running on.

Fixes: #10302

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-09-16 09:20:57 +01:00
stevenhorsman
3087ce17a6 tests: combined pod yaml creation for CoCo tests
This commit brings together some common parts of the image pulling test series, such as
encrypted image pulling, pulling images from an authenticated registry, and
image verification. This helps reduce the cost of maintenance.

Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-09-16 09:20:57 +01:00
Xynnn007
c80c8d84c3 test: add cosign signature verification tests
Close #8120

**Case 1**
Creating a pod from an unsigned image on an insecureAcceptAnything
registry works.

Image: quay.io/prometheus/busybox:latest
Policy rule:
```
"default": [
    {
        "type": "insecureAcceptAnything"
    }
]
```

**Case 2**
Creating a pod from an unsigned image on a 'restricted registry' is
rejected.

Image: ghcr.io/confidential-containers/test-container-image-rs:unsigned
Policy rule:
```
"quay.io/confidential-containers/test-container-image-rs": [
    {
        "type": "sigstoreSigned",
        "keyPath": "kbs:///default/cosign-public-key/test"
    }
]
```

**Case 3**
Creating a pod from a signed image on a 'restricted registry' is
successful.

Image: ghcr.io/confidential-containers/test-container-image-rs:cosign-signed
Policy rule:
```
"ghcr.io/confidential-containers/test-container-image-rs": [
    {
        "type": "sigstoreSigned",
        "keyPath": "kbs:///default/cosign-public-key/test"
    }
]
```

**Case 4**
Creating a pod from a signed image on a 'restricted registry', but with
the wrong key, is rejected.

Image:
ghcr.io/confidential-containers/test-container-image-rs:cosign-signed-key2

Policy:
```
"ghcr.io/confidential-containers/test-container-image-rs": [
    {
        "type": "sigstoreSigned",
        "keyPath": "kbs:///default/cosign-public-key/test"
    }
]
```

**Case 5**
Creating a pod from an unsigned image on a 'restricted registry' works
if enable_signature_verification is false.

Image: ghcr.io/kata-containers/confidential-containers:unsigned

image security enable: false

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-09-16 09:20:57 +01:00
Dan Mihai
5777869cf4 tests: k8s-policy-rc: add unexpected UID test
Change pod runAsUser value of a Replication Controller after generating
the RC's policy, and verify that the RC pods get rejected due to this
change.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-13 22:05:31 +00:00
Dan Mihai
6773f14667 tests: k8s-policy-job: add unexpected UID test
Change pod runAsUser value of a Job after generating the Job's policy,
and verify that the Job gets rejected due to this change.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-13 22:05:31 +00:00
Dan Mihai
124f01beb3 tests: k8s-policy-deployment: add bad UID test
Change pod runAsUser value of a Deployment after generating the
Deployment's policy, and verify that the Deployment fails due to
this change.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-13 22:05:31 +00:00
Dan Mihai
d3127af9c5 tests: k8s-inotify: pod termination polling
Poll/wait for pod termination instead of sleeping 2 minutes. This
change typically saves ~90 seconds in my test cluster.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-13 17:12:55 +00:00
Hyounggyu Choi
4c933a5611 tests: Introduce retry mechanism for helm install
Kata-deploy often fails due to a transiently unreachable k8s cluster
for the qemu-coco-dev test on s390x.
(e.g. https://github.com/kata-containers/kata-containers/actions/runs/10831142906/job/30058527098?pr=10009)
This commit introduces a retry mechanism to mitigate these failures by
retrying the command two more times with a 10-second interval as a workaround.
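
A retry wrapper of the kind described above could look like the sketch below. `retry_cmd` is a hypothetical name, not the helper used in the repository, and the helm invocation in the trailing comment is only illustrative.

```shell
# Runs "$@" up to max_tries times, sleeping `interval` seconds between tries.
# Returns 0 on the first success, 1 if every attempt fails.
retry_cmd() {
    local max_tries="$1"
    local interval="$2"
    shift 2
    local attempt=1
    while true; do
        "$@" && return 0
        if [ "$attempt" -ge "$max_tries" ]; then
            return 1
        fi
        echo "attempt ${attempt}/${max_tries} failed, retrying in ${interval}s" >&2
        sleep "$interval"
        attempt=$((attempt + 1))
    done
}

# Example (not executed here): one initial try plus two retries, 10s apart.
# retry_cmd 3 10 helm install kata-deploy <chart> --wait
```

Passing the attempt count and interval as arguments keeps the workaround easy to tune or remove once the underlying cluster reachability problem is understood.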

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-13 14:03:44 +02:00
Dan Mihai
0c5ac042e7 tests: k8s-policy-pod: add workaround for #10297
If the CI platform being tested doesn't yet support the prometheus
container image:
- Use busybox instead of prometheus.
- Skip the test cases that depend on the prometheus image.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-12 18:26:38 +00:00
Dan Mihai
94d95fc055 tests: k8s-policy-pod: test container UID changes
Add test cases for changing container UID after generating the policy.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-11 22:38:20 +00:00
Dan Mihai
db1ca4b665 tests: k8s-policy-pod: remove UID workaround
Remove the workaround for #9928, now that genpolicy is able to
convert user names from container images into the corresponding
UIDs from these images.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-11 22:38:20 +00:00
Dan Mihai
eb7f747df1 genpolicy: enable create container UID verification
Disabling the UID Policy rule was a workaround for #9928. Re-enable
that rule here and add a new test/CI temporary workaround for this
issue. This new test workaround will be removed after fixing #9928.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-11 22:38:20 +00:00
Dan Mihai
71ede4ea3f tests: k8s-policy-pod: use prometheus container
Change quay.io/prometheus/busybox to quay.io/prometheus/prometheus in
this test. The prometheus image will be helpful for testing the future
fix for #9928 because it specifies user = "nobody".

Also, change:

sh -c "ls -l /"

to:

echo -n "readinessProbe with space characters"

as the test readinessProbe command line. Both include a command line
argument containing space characters, but "sh -c" behaves differently
when using the prometheus container image (causes the readinessProbe
to time out, etc.).

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-11 22:38:20 +00:00
Fabiano Fidêncio
97ecdabde9 Merge pull request #10294 from fidencio/topic/bring-ita-support
Bump guest-components / trustee to a version that supports ITA
2024-09-11 19:45:48 +02:00
Fabiano Fidêncio
1178fe20e9 tests: Adapt error parser for failed image decryption
With an older version of image-rs, we were getting the following error:
```
       Message:   failed to create containerd task: failed to create shim task: failed to handle layer: failed to get decrypt key no suitable key found for decrypting layer key:
```

However, with the version of image-rs we are bumping to, the error comes
as:
```
       Message:   failed to create containerd task: failed to create shim task: failed to handle layer: failed to get decrypt key

 Caused by:
     no suitable key found for decrypting layer key:
      keyprovider: failed to unwrap key by ttrpc
```

Due to this change, I'm splitting the check in two different ones.
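
As a minimal sketch of what checking the two fragments separately might look like, assuming the pod's event text has already been captured in a variable (`has_decrypt_key_error` is a hypothetical name, not the function in the test suite):

```shell
# Returns 0 only if both fragments appear somewhere in the given text,
# regardless of whether they are on the same line or split across lines.
has_decrypt_key_error() {
    local text="$1"
    printf '%s\n' "$text" | grep -qF "failed to get decrypt key" || return 1
    printf '%s\n' "$text" | grep -qF "no suitable key found for decrypting layer key" || return 1
}
```

Because each fragment is matched on its own, the check accepts both the older single-line message and the newer multi-line one.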

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-11 17:07:56 +02:00
Fabiano Fidêncio
3946aa7283 ci: tdx: Adapt how we get the host IP
In the process of switching the TDX CI machine we've noticed that
`hostname -i` in one of the machines returns a single IP address,
while in another machine it returns a full list of IPs.

As we're only interested in the first one, let's adapt the code to
always return the first one.
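
One way to pick the first address regardless of how many are printed is sketched below. `first_field` is a hypothetical helper; the actual change in the CI scripts may use a different approach.

```shell
# Prints the first whitespace-separated field of the first line of its input.
first_field() {
    awk 'NR == 1 { print $1 }'
}

# Example (not executed here):
# host_ip="$(hostname -i | first_field)"
```

The same pipeline yields the same result whether `hostname -i` prints one address or several.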

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-11 09:31:43 +02:00
Dan Mihai
7ab95b56f1 Merge pull request #10251 from microsoft/saulparedes/support_readonly_hostpath
genpolicy: support readonly hostpath
2024-09-05 09:27:15 -07:00
Fabiano Fidêncio
70491ff29f Merge pull request #10244 from BbolroC/turn-on-kbs-qemu-coco-dev-s390x
gha: Turn on KBS for qemu-coco-dev on s390x
2024-09-05 13:02:42 +02:00
Saul Paredes
24c2d13fd3 genpolicy: support readonly emptyDir mount
Set emptyDir access based on volume mount readOnly value

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-09-04 15:05:44 -07:00
Saul Paredes
36a4104753 genpolicy: support readonly hostpath
Set hostpath access based on volume mount readOnly value

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-09-04 14:55:22 -07:00
Fabiano Fidêncio
a773797594 ci: Pass --debug to helm
Just to make our lives a little bit easier.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-03 23:08:22 +02:00
Wainer dos Santos Moschetta
3b23d62635 tests/k8s: fix wait for pods on deploy-kata action
In commit 51690bc157 we switched the installation from kubectl to helm
and used its `--wait` expecting the execution would continue when all
kata-deploy Pods were Ready. It turns out that there is a limitation on
helm install that won't wait properly when the daemonset is made of a
single replica and maxUnavailable=1. In order to fix that issue, let's
revert the changes partially to keep using kubectl and waitForProcess
to hold the execution while Pods aren't Running.

Fixes #10168
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-09-03 23:08:22 +02:00
Fabiano Fidêncio
40f8aae6db Reapply "ci: make cleanup_kata_deploy really simple"
This reverts commit 21f9f01e1d, as the
patches for helm are coming as part of this series.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-03 23:08:22 +02:00
Fabiano Fidêncio
cfe6e4ae71 Reapply "ci: Use helm to deploy kata-deploy" (partially)
This reverts commit 36f4038a89, as the
patches for helm are coming as part of this series.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-03 23:08:22 +02:00
Fabiano Fidêncio
424347bf0e Reapply "kata-deploy: Add Helm Chart" (partially)
This reverts commit b18c3dfce3, as the
patches for helm are coming as part of this series.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-03 23:08:22 +02:00
Hyounggyu Choi
b0a912b8b4 tests: Enable KBS deployment for qemu-coco-dev on s390x
To deploy KBS on s390x, the environment variable `IBM_SE_CREDS_DIR`
must be exported, and the corresponding directory must be created.

This commit enables KBS deployment for `qemu-coco-dev`, in addition
to the existing `qemu-se` support on the platform.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-03 15:51:18 +02:00
GabyCT
dd9f41547c Merge pull request #10160 from microsoft/saulparedes/support_priority_class
genpolicy: add priorityClassName as a field in PodSpec interface
2024-08-28 14:36:20 -06:00
Archana Choudhary
ae2cdedba8 genpolicy: add priorityClassName as a field in PodSpec interface
This allows generation of policy for pods specifying priority classes.

Signed-off-by: Archana Choudhary <archana1@microsoft.com>
2024-08-27 19:54:02 -07:00
Aurélien Bombo
a3dba3e82b ci: reinstate Mariner host
GH-9592 addressed a bug in a previous version of the AKS Mariner host
kernel that blocked the CH v39 upgrade. This bug has now been fixed so
we undo that PR.

Note we also specify a different OCI version for Mariner as it differs
from Ubuntu's.

Fixes: #9594

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-08-26 21:07:25 +00:00
Hyounggyu Choi
4cd83d2b98 Merge pull request #10202 from BbolroC/fix-k8s-tests-s390x
tests: Fix k8s test issues on s390x
2024-08-23 09:51:11 +02:00
Archana Shinde
b0be03a93f Revert "tests: add image check before running coco tests"
This reverts commit 41b7577f08.

We were seeing a lot of issues in the TDX CI of the nature:

"Error: failed to create containerd container: create instance
470: object with key "470" already exists: unknown"

With the TDX CI, we moved to having the nydus snapshotter pre-installed.
Essentially the `deploy-snapshotter` step was performed once before any
actual CI runs.
We were seeing failures related to the error message above.

On reverting this change, we are no longer seeing errors related to
"key exists" with the TDX CI passing now.

The change reverted here is related to downloading incomplete images, but this
seems to be messing up TDX CI.
It is possible to pass --snapshotter to `ctr image check` but that does
not seem to have any effect on the data set returned.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-08-22 18:05:42 -07:00
Hyounggyu Choi
274de8c6af tests: Introduce wait_time to k8s_create_pod()
In certain environments (e.g., those with lower performance), `k8s_create_pod()`
may require additional wait time, especially when dealing with large images.
Since `k8s_wait_pod_be_ready()` — which is called by `k8s_create_pod()` — already
accepts `wait_time` as a second argument, it makes sense to introduce `wait_time`
to `k8s_create_pod()` and propagate it to the callee.

This commit adds `wait_time` to `k8s_create_pod()` as the 2nd (optional) argument.
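
The optional argument can be illustrated with the sketch below. It is a simplified stand-in for the real `k8s_create_pod()`: the 120-second default is taken from the default mentioned elsewhere in this log, and looking up the pod name with `kubectl get -f` is an assumption made to keep the example short.

```shell
# Simplified sketch: k8s_create_pod <config_file> [wait_time]
k8s_create_pod() {
    local config_file="$1"
    local wait_time="${2:-120}"   # optional second argument, default assumed to be 120s
    local pod_name

    kubectl apply -f "$config_file" || return 1
    # Assumption: the manifest describes a single pod, so its name can be read back.
    pod_name="$(kubectl get -f "$config_file" -o jsonpath='{.metadata.name}')" || return 1
    # Propagate wait_time to the callee, which already accepts it as its second argument.
    k8s_wait_pod_be_ready "$pod_name" "$wait_time"
}
```

Existing callers that pass only the config file keep the default, while slower environments can call, for example, `k8s_create_pod "$pod_config" 300`.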

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-22 17:46:53 +02:00
Hyounggyu Choi
5d7397cc69 tests: Load confidential_kbs.sh in k8s-guest-pull-image.bats
Some of the tests call set_metadata_annotation() for updating the kernel
parameters. For `kata-qemu-se`, repack_secure_image() is called which is
defined in `lib_se.sh` and sourced by `confidential_kbs.sh`.

This commit ensures that the function call chain for the relevant
`KATA_HYPERVISOR` is properly handled.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-22 17:33:38 +02:00
Dan Mihai
8ccc8a8d0b Merge pull request #9911 from microsoft/saulparedes/mounts
genpolicy: deny UpdateEphemeralMountsRequest
2024-08-21 10:12:28 -07:00
Dan Mihai
6654491cc3 genpolicy: deny UpdateEphemeralMountsRequest
Deny UpdateEphemeralMountsRequest by default, because paths to
critical Guest components can be redirected using such a request.

Signed-off-by: Dan Mihai <Daniel.Mihai@microsoft.com>
2024-08-20 18:28:17 -07:00
Fabiano Fidêncio
b18c3dfce3 Revert "kata-deploy: Add Helm Chart" (partially)
This partially reverts commit 94b3348d3c,
as there's more work needed in order to have this one done in a robust
way, and we are taking the safer path of reverting for now, and adding
it back as soon as the release is cut.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-08-21 00:09:11 +02:00
Fabiano Fidêncio
36f4038a89 Revert "ci: Use helm to deploy kata-deploy" (partially)
This partially reverts commit 51690bc157,
as there's more work needed in order to have this one done in a robust
way, and we are taking the safer path of reverting for now, and adding
it back as soon as the release is cut.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-08-21 00:09:11 +02:00
Fabiano Fidêncio
21f9f01e1d Revert "ci: make cleanup_kata_deploy really simple"
This reverts commit 1221ab73f9, as there's
more work needed in order to have this one done in a robust way, and we
are taking the safer path of reverting for now, and adding it back as
soon as the release is cut.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-08-21 00:09:11 +02:00
Fabiano Fidêncio
aeb6f54979 Merge pull request #10180 from fidencio/topic/ci-ensure-the-key-was-created-on-kbs
ci: Ensure the KBS resources are created
2024-08-20 09:07:56 +02:00
Fabiano Fidêncio
40d385d401 Merge pull request #10188 from wainersm/kbs_key
tests/k8s: check and save kbs.key
2024-08-19 23:29:10 +02:00
Fabiano Fidêncio
c0d7222194 ci: Ensure the KBS resources are created
Otherwise we may have tests failing due to the resource not being
created yet.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-19 23:27:06 +02:00
Wainer dos Santos Moschetta
e014eee4e8 tests/k8s: check and save kbs.key
The deploy-kbs.sh script generates the kbs.key that's used to install
KBS. This same file is used later by kbs-client to authenticate. This commit ensures
that the file was created and fails otherwise.

Another problem solved here is that on bare-metal machines the key doesn't survive
a reboot as it is created in a temporary directory (/tmp/trustee). So let's save
the file to a non-temporary location.
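
The check-and-save step might be sketched as follows. `save_kbs_key` is a hypothetical name, and the destination directory is left to the caller because the commit message does not state where the key is stored.

```shell
# Fails if the key is missing; otherwise copies it to a persistent directory.
save_kbs_key() {
    local src="$1"
    local dest_dir="$2"
    if [ ! -f "$src" ]; then
        echo "ERROR: KBS key not found at ${src}" >&2
        return 1
    fi
    mkdir -p "$dest_dir" && cp -f "$src" "${dest_dir}/kbs.key"
}
```

Failing early when the key is absent surfaces a broken KBS deployment at setup time instead of later, when kbs-client tries to authenticate.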

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-08-19 16:03:03 -03:00
Fabiano Fidêncio
0831081399 ci: k8s: Replace nginx alpine images
The previous ones are gone, so let's switch to our own multi-arch image
for the tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-17 12:19:33 +02:00