Compare commits

..

578 Commits
3.5.0 ... 3.7.0

Author SHA1 Message Date
GabyCT
6aff5f300a Merge pull request #10021 from GabyCT/topic/fixarchdoc
docs: Update devmapper docs
2024-07-17 14:56:40 -06:00
Steve Horsman
e5d5284761 Merge pull request #10026 from wainersm/release_370
release: Bump VERSION to 3.7.0
2024-07-17 18:43:51 +01:00
Wainer dos Santos Moschetta
6f7ab31860 release: Bump VERSION to 3.7.0
On preparation for the 3.7.0 release, bumped the version in VERSION file.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-07-17 14:19:44 -03:00
Dan Mihai
f31c1b121e Merge pull request #9812 from microsoft/saulparedes/test_policy_on_tdx
gha: enable policy testing on TDX
2024-07-17 08:47:44 -07:00
Dan Mihai
449103c7bf Merge pull request #10020 from microsoft/danmihai1/pod-security-context
tests: fix ps command in k8s-security-context
2024-07-17 08:12:57 -07:00
Fabiano Fidêncio
b7051890af Merge pull request #9722 from zvonkok/busybox-build
deploy: Add busybox target
2024-07-17 13:47:15 +02:00
Steve Horsman
5ce2c1010a Merge pull request #9904 from stevenhorsman/registry-authentication
Support for registry authentication in guest pull
2024-07-17 10:48:38 +01:00
Fupan Li
65f2bfb8c4 Merge pull request #9967 from zvonkok/kernel-dragonball-6.1.x
dragonball: kernel dragonball 6.1.x
2024-07-17 14:38:06 +08:00
Dan Mihai
0e86a96157 tests: fix ps command in k8s-security-context
1. Use a container image that supports "ps --user 1000 -f".
2. Execute that command using:

sh -c "ps --user 1000 -f"

instead of passing additional arguments to sh:

sh -c ps --user 1000 -f

Fixes: #10019

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-17 01:33:31 +00:00
stevenhorsman
567b4d5788 test/k8s: Fix up node logging typo
We had a typo in the attestation tests that we've copied around a
lot and Wainer spotted it in the authenticated registry tests, so let's fix it up now

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-16 21:39:31 -03:00
stevenhorsman
0015c8ef51 tests: Add guest-pull auth registry tests
Add three new test cases for guest pull from an authenticated registry for
the following scenarios:

_**Scenario**: Creating a container from an authenticated image, with correct credentials via KBC works_
**Given** An authenticated container registry *quay.io/kata-containers/confidential-containers-auth*
  **And** a version of kata deployed with a guest image that has an agent with `guest_pull`
feature enabled and nydus-snapshotter installed and configured for
[guest-pulling](https://github.com/containerd/nydus-snapshotter/blob/main/misc/snapshotter/config-coco-guest-pulling.toml)
  **And** a KBS set up to have the correct auth.json for
registry *quay.io/kata-containers/confidential-containers-auth* embedded in the `"Credential"` section of `its resources file`
**When** I create a pod from the container image *quay.io/kata-containers/confidential-containers-auth:test*
**Then** The pull image works and the pod can start

_**Scenario**: Creating a container from an authenticated image, with incorrect credentials via KBC fails_
**Given**  An authenticated container registry *quay.io/kata-containers/confidential-containers-auth*
  **And** a version of kata deployed with a guest image that has an agent with `guest_pull`
feature enabled and nydus-snapshotter installed and configured for
[guest-pulling](https://github.com/containerd/nydus-snapshotter/blob/main/misc/snapshotter/config-coco-guest-pulling.toml)
  **And** An installed kata CC with the sample_kbs set up to have the auth.json for registry
*quay.io/kata-containers/confidential-containers-auth* embedded in the `"Credential"` resource, but with a dummy user name and password
**When** I create a pod from the container image *quay.io/kata-containers/confidential-containers-auth:test*
**Then** The pull image fails with a message that reflects that the authorisation failed

_**Scenario**: Creating a container from an authenticated image, with no credentials fails_
**Given**  An authenticated container registry *quay.io/kata-containers/confidential-containers-auth*
  **And** a version of kata deployed with a guest image that has an agent with `guest_pull`
feature enabled and nydus-snapshotter installed and configured for
[guest-pulling](https://github.com/containerd/nydus-snapshotter/blob/main/misc/snapshotter/config-coco-guest-pulling.toml)
  **And** An installed kata CC with no credentials section
**When** I create a pod from the container image *quay.io/kata-containers/confidential-containers-auth:test*
**Then** The pull image fails with a message that reflects that the authorisation failed

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-16 21:39:31 -03:00
stevenhorsman
eb07f5ef5e agent: doc: Fix ordering of options
- Fix the config options to be back in alphabetical order to be
easier to find

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-16 21:39:31 -03:00
stevenhorsman
7cc81ce867 agent: image: Set image-rs auth config
If the agent-config has a value for `image_registry_auth`,
Then pass this to the image-rs client and enable auth mode too

Fixes: #8122

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-16 21:39:31 -03:00
stevenhorsman
265322990a agent: config: Add config option to provide auth for guest-pull
Add optional config for agent.image_registry_auth, to specify
the uri of credentials to be used when pulling images in the guest
from an authenticated registry

Fixes: #8122

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-16 21:39:31 -03:00
Steve Horsman
064b45a2fa Merge pull request #10016 from wainersm/ibm-se-auth-reg
workflows: setup environment to run auth registry tests on s390x
2024-07-16 22:24:39 +01:00
Gabriela Cervantes
d2866081d2 docs: Update devmapper docs
This PR updates the devmapper docs by updating the url link
for the current containerd devmapper information.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-16 21:07:51 +00:00
GabyCT
2206e2dd5c Merge pull request #10013 from GabyCT/topic/updatecontdoc
docs: Update cri installion guide url in containerd documentation
2024-07-16 14:32:59 -06:00
Wainer dos Santos Moschetta
66c600f8d8 gha: delint the s390x workflow
Made run-k8s-tests-on-zvsi.yaml free of warnings by removing:

SC2086:info:1:1: Double quote to prevent globbing and word splitting ...
SC2086:info:2:1: Double quote to prevent globbing and word splitting ...

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-07-16 15:20:46 -03:00
Wainer dos Santos Moschetta
a98985fab8 gha: export user/password for auth registry tests on s390x
Counterpart of commit d8961cbd4a for run-k8s-tests-on-zvsi workflow

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-07-16 15:18:40 -03:00
Saul Paredes
af49252c69 gha: enable policy testing on TDX
Enable policy testing on TDX

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-15 14:09:49 -07:00
Saul Paredes
0b3d193730 genpolicy: Support cpath for mount sources
Add setting to allow specifying the cpath for a mount source.

cpath is the root path for most files used by a container. For example,
the container rootfs and various files copied from the Host to the
Guest when shared_fs=none are hosted under cpath.

mount_source_cpath is the root of the paths used a storage mount
sources. Depending on Kata settings, mount_source_cpath might have the
same value as cpath - but on TDX for example these two paths are
different: TDX uses "/run/kata-containers" as cpath,
but "/run/kata-containers/shared/containers" as mount_source_cpath.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-15 14:09:49 -07:00
Gabriela Cervantes
e4045ff29a docs: Update runtime v2 containerd url information
This PR updates the runtime v2 containerd url information at containerd
documentation.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-15 20:36:17 +00:00
Dan Mihai
bcaf7fc3b4 Merge pull request #10008 from microsoft/danmihai1/runAsUser
genpolicy: add support for runAsUser fields
2024-07-15 12:08:50 -07:00
Gabriela Cervantes
9f738f0d05 docs: Update cri installion guide url in containerd documentation
This PR updates the cri installation guide url link in the containerd
documentation guide as the previous url link does not exists.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-15 16:58:18 +00:00
Dan Mihai
648265d80e Merge pull request #9998 from microsoft/danmihai1/GENPOLICY_PULL_METHOD
tests: k8s: GENPOLICY_PULL_METHOD clean-up
2024-07-15 09:32:29 -07:00
Steve Horsman
02b9fd6e95 Merge pull request #9382 from Xynnn007/feat-encrypt-image
Merge to main: supporting pull encrypted images
2024-07-15 15:58:42 +01:00
stevenhorsman
b060fb5b31 tests/k8s: Skip measured rootfs test
The only kernel built for measured rootfs was the kernel-tdx-experimental,
so this test only ran in the qemu-tdx job runs the test.
In commit 6cbdba7 we switched all TEE configurations to use the same kernel-confidential,
so rootfs measured is disabled for qemu-tdx too now.
The VM still fails to boot (because of a different reason...) but the bug
in the assert_logs_contain, fixed in this PR was masking the checks on the logs.
We still have a few open issues related to measured rootfs and generating
the root hash, so let's skip this test that doesn't work until they are looked at

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-15 12:00:50 +01:00
stevenhorsman
2cf94ae717 tests: Add guest-pull encrypted image tests
Add three new tests cases for guest-pull of an encrypted image
for the following scenarios:

_**Scenario: Pull encrypted image on guest with correct key works**_
**Given** I have a version of kata deployed with a guest image that has
an agent with `guest_pull` feature enabled and nydus-snapshotter installed
and configured for guest-pulling
  **And** A public encrypted container image *i* with a decryption key *k*
that is configured as a resource the KBS, so that image-rs on the guest can
connect to it
**When** I try and create a pod from *i*
**Then** The pod is successfully created and runs

_**Scenario: Cannot pull encrypted image with no decryption key**_
**Given** I have a version of kata deployed with a guest image that has
an agent with `guest_pull` feature enabled and nydus-snapshotter installed
and configured for guest-pulling
  **And** A public encrypted container image *i* with a decryption key *k*,
that is **not** configured in a KBS that image-rs on the guest can connect to
**When** I try and create a pod from *i*
**Then** The pod is not created with an error message that reflects why

_**Scenario: Cannot pull encrypted image with wrong decryption key**_
**Given** I have a version of kata deployed with a guest image that has
an agent with `guest_pull` feature enabled and nydus-snapshotter installed
and configured for guest-pulling
  **And** A public encrypted container image *i* with a decryption key *k*
and a different key *k'* that is set as a resource in a KBS, that image-rs
on the guest can connect to
**When** I try and create a pod from *i*
**Then** The pod is not created with an error message that reflects why

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-15 12:00:50 +01:00
Xynnn007
a56b15112a agent: add ocicrypt config
ocicrypt config is for kata-agent to connect to CDH to request for image
decryption key. This value is specified by an env. We use this
workaround the same as CCv0 branch.

In future, we will consider better ways instead of writting files and
setting envs inside inner logic of kata-agent.

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
2024-07-15 12:00:50 +01:00
Xynnn007
1072658219 agent: Enable kata-cc-rustls-tls in image-rs
- Enable the kata-cc-rustls-tls feature in image-rs, so that it
can get resources from the KBS in order to retrieve the registry
credentials.
- Also bump to the latest image-rs to pick up protobuf fixes
- Add libprotobuf-dev dependency to the agent packaging
as it is needed by the new image-rs feature
- Add extra env in the agent make test as the
new version of the anyhow crate has changed the backtrace capture thus unit
tests of kata-agent that compares a raised error with an expected one
would fail. To fix this, we need only panics to have backtraces, thus
set RUST_BACKTRACE=0 for tests due to document
https://docs.rs/anyhow/latest/anyhow/

Fixes #9538

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-15 12:00:50 +01:00
stevenhorsman
3b72e9ffab tests/k8s: Fix assert_logs_contain
The pipe needs adding to the grep, otherwise the grep
gets consumed as an argument to `print_node_journal` and
run in the debug pod.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-15 12:00:50 +01:00
Hyounggyu Choi
83b3a681f4 Merge pull request #10010 from BbolroC/osbuilder-bump-fedora-to-40
osbuilder: Bump Fedora to 40
2024-07-15 13:00:28 +02:00
Greg Kurz
203d9e7803 Merge pull request #10000 from littlejawa/kata_deploy_add_storage_config_for_crio
kata-deploy: add storage configuration for cri-o
2024-07-15 12:29:21 +02:00
Hyounggyu Choi
08d2f6bfe4 osbuilder: Bump Fedora to 40
As Fedora 38 has reached EOL, we are encountering 404 errors for s390x, such as:

```
Status code: 404 for https://dl.fedoraproject.org/pub/fedora-secondary/updates/38/Everything/s390x/repodata/repomd.xml
```

Let's bump the OS to the latest version.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-15 09:58:54 +02:00
Fupan Li
a7179be31d Merge pull request #9534 from Tim-Zhang/fix-stdin-stuck
Fix ctr exec stuck problem
2024-07-15 13:19:19 +08:00
Dan Mihai
dded329d26 tests: k8s: SecurityContext.runAsUser policy test
Add test for auto-generating policy for a pod spec that includes the
SecurityContext.runAsUser field.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-13 01:37:58 +00:00
Dan Mihai
7040fb8c50 tests: k8s-security-context auto-generated policy
Auto-generate the policy in k8s-security-context.bats - previously
blocked by lacking support for PodSecurityContext.runAsUser.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-13 01:23:54 +00:00
Dan Mihai
f087044ecb genpolicy: add support for runAsUser
Add ability to auto-generate policy for SecurityContext.runAsUser and
PodSecurityContext.runAsUser.

Fixes: #8879

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-13 01:10:43 +00:00
Dan Mihai
5282701b5b genpolicy: add link to allow_user() active issue
Improve comment to workaround in rules.rego, to explain better the
reason for that workaround.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-13 01:05:58 +00:00
GabyCT
3c0171df3d Merge pull request #10005 from GabyCT/topic/katadragonball
common: Add share fs information for dragonball
2024-07-12 16:10:29 -06:00
Wainer Moschetta
646d7ea4fb Merge pull request #9951 from BbolroC/enable-attestation-for-ibm-se
tests: Enable attestation e2e tests for IBM SE
2024-07-11 16:02:59 -03:00
Hyounggyu Choi
ca80301b4b Merge pull request #10003 from BbolroC/skip-pod-shared-volume-for-ibm-se
k8s: Skip shared-volume relevant tests for IBM SE
2024-07-11 19:29:13 +02:00
Gabriela Cervantes
4477b4c9dc common: Add share fs information for dragonball
This PR adds the share fs information for dragonball using kata-ctl
to avoid the failures in runk tests saying that shared_fs is an
unbound variable.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-11 17:09:35 +00:00
Dan Mihai
09c5ca8032 tests: k8s: clarify the need to use containerd.sock
Modify the permissions of containerd.sock just when genpolicy needs
access to this socket, when testing GENPOLICY_PULL_METHOD=containerd.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-11 16:49:58 +00:00
Dan Mihai
c1247cc254 tests: k8s: explain the default containerd settings
Explain why the containerd settings on the local machine get set to
containerd's defaults when testing GENPOLICY_PULL_METHOD=containerd.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-11 16:49:39 +00:00
Dan Mihai
3b62eb4695 tests: k8s: add comment for GENPOLICY_PULL_METHOD
Explain why there are two different methods for pulling container
images in genpolicy.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-11 16:40:01 +00:00
Dan Mihai
eaedd21277 tests: k8s: use oci-distribution as default value
oci-distribution is the value used by run-k8s-tests-on-aks.yaml, so
use the same value as default for GENPOLICY_PULL_METHOD in gha-run.sh.

The value of GENPOLICY_PULL_METHOD is currently compared just with
"containerd", but avoid possible future problems due to using a
different default value in gha-run.sh.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-11 16:40:01 +00:00
GabyCT
2056eda5f0 Merge pull request #9922 from GabyCT/topic/updateblogname
metrics: Update container name in blogbench test
2024-07-11 10:05:35 -06:00
Hyounggyu Choi
32c3e55cde k8s: Skip shared-volume relevant tests for IBM SE
Currently, it is not viable to share a writable volume (e.g., emptyDir)
between containers in a single pod for IBM SE.
The following tests are relevant:
  - pod-shared-volume.bats
  - k8s-empty-dirs.bats
(See: https://github.com/kata-containers/kata-containers/issues/10002)

This commit skips the tests until the issue is resolved.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-11 14:09:19 +02:00
Julien Ropé
b83d4e1528 kata-deploy: add storage configuration for cri-o
Make sure that the "skip_mount_home" flag is set in cri-o config.

Fixes: #9878

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-07-11 10:11:30 +02:00
GabyCT
dac07239f5 Merge pull request #9974 from squarti/sharedfs
runtime: Initialize SharedFS for remote hypervisor
2024-07-10 17:03:00 -06:00
GabyCT
3827b5f9f2 Merge pull request #9982 from ChengyuZhu6/fix-ci
tests: Delete test scripts forcely
2024-07-10 17:00:41 -06:00
Wainer Moschetta
deb4627558 Merge pull request #9975 from niteeshkd/nd_snp_attestation
gha: enable SNP attestation
2024-07-10 18:59:05 -03:00
GabyCT
c40b3b4ce7 Merge pull request #9992 from sprt/fix-nydus
ci: fix run-nydus tests
2024-07-10 13:56:16 -06:00
David Esparza
be9385342e Merge pull request #9990 from GabyCT/topic/tdxtimeout
gha: Increase timeout to run CoCo TDX tests
2024-07-10 13:21:23 -06:00
Silenio Quarti
8260ce8d15 runtime: Initialize SharedFS for remote hypervisor
Sets SharedFS config to NoSharedFS for remote hypervisor in order to start the file watcher which syncs files from the host to the guest VMs. 

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-07-10 14:31:25 -03:00
Aurélien Bombo
25e0e2fb35 ci: fix run-nydus tests
GH-9973 introduced:

 * New function get_kata_memory_and_vcpus() in
   tests/metrics/lib/common.bash.
 * A call to get_kata_memory_and_vcpus() from extract_kata_env(), which
   is defined in tests/common.bash.

Because the nydus test only sources tests/common.bash, it can't find
get_kata_memory_and_vcpus() and errors out.

We fix this by moving the get_kata_memory_and_vcpus() call from
tests/common.bash to tests/metrics/lib/json.bash so that it doesn't
impact the nydus test.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-10 17:19:08 +00:00
Gabriela Cervantes
b6b8524ab7 gha: Increase timeout to run CoCo TDX tests
This PR increases the timeout to run the CoCo TDX tests in order
to avoid the random failures on TDX saying that
The action 'Run tests' has timed out after 30 minutes and making
the GHA job fail.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-10 16:06:07 +00:00
Niteesh Dubey
e8a3f8571e docs: update for SNP attestation
This updates how-to document for SNP attestation.

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-07-10 15:06:55 +00:00
Niteesh Dubey
ff04154fdb gha: enable SNP attestation
This removes the code to skip the SNP attestation.

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-07-10 15:06:55 +00:00
Hyounggyu Choi
d94b285189 tests: Enable k8s-confidential-attestation.bats for s390x
For running a KBS with `se-verifier` in service,
specific credentials need to be configured.
(See https://github.com/confidential-containers/trustee/tree/main/attestation-service/verifier/src/se for details.)

This commit introduces two procedures to support IBM SE attestation:

- Prepare required files and directory structure
- Set necessary environment variables for KBS deployment
- Repackage a secure image once the KBS service address is determined

These changes enable `k8s-confidential-attestation.bats` for s390x.

Fixes: #9933

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-10 16:18:37 +02:00
Hyounggyu Choi
5d0f74cd70 local-build: Extract build_secure_image() as a separate library
Currently, all functions in `build_se_image.sh` are dedicated to
publishing a payload image. However, `build_secure_image()` is now
also used for repackaging a secure image when a kernel parameter
is reconfigured. This reconfiguration is necessary because the KBS
service address is determined after the initial secure image build.

This commit extracts `build_secure_image()` from `build_se_image.sh`
and creates a separate library, which can be loaded by bats-core.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-10 16:18:37 +02:00
Hyounggyu Choi
bf2f0ea2ca tests: Change a location for creating key.bin
The current KBS deployment creates a file `key.bin` assuming that
`kustomization.yaml` is located in `overlays/`.

However, this does not hold true when the kustomize config is enabled
for multiple architectures. In such cases, the configuration file
should be located in `overlays/$(uname -m)`.
This commit changes the location for file creation.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-10 16:18:37 +02:00
Hyounggyu Choi
4025ef7193 versions: Bump trustee to multi-arch deployment for KBS
As part of the enablement for s390x, KBS should support multi-arch deployment.
This commit updates the version of coco-trustee to a commit where the support
is implemented.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-10 16:18:37 +02:00
Hyounggyu Choi
856a1f72c6 packaging: Set ATTESTER to se-attester for guest components on s390x
This commit allows the guest-components builder to only build se-attester on s390x.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-10 16:18:37 +02:00
Xuewei Niu
7f71eac6de Merge pull request #9868 from l8huang/dan
runtime: implement DAN in Go kata-runtime
2024-07-10 19:09:46 +08:00
Alex Lyn
dafff26f01 Merge pull request #9814 from Apokleos/bugfix-pcipath
runtime-rs: bugfix for root bus slot allocation
2024-07-10 16:19:06 +08:00
Steve Horsman
aa487307e8 Merge pull request #9962 from GabyCT/topic/removecif
scripts: Eliminate CI variable as it is not longer used
2024-07-10 09:02:33 +01:00
Steve Horsman
78bbc51ff0 Merge pull request #9806 from niteeshkd/nd_snp_certs
runtime: pass certificates to get extended attestation report for SNP coco
2024-07-10 08:57:45 +01:00
Steve Horsman
29413021e5 Merge pull request #9981 from stevenhorsman/run-k8s-tests-on-zvsi-inherit-secrets
gha: make run-k8s-tests-on-zvsi inherit secrets
2024-07-10 08:49:11 +01:00
Lei Huang
171d298dea runtime: implement DAN in Go kata-runtime
The DAN feature has already been implemented in kata-runtime-rs, and
this commit brings the same capability to the Go kata-runtime.

Fixes: #9758

Signed-off-by: Lei Huang <leih@nvidia.com>
2024-07-10 00:22:30 -07:00
ChengyuZhu6
489afffd8c tests:gha: delete namespace before resetting namespace
Delete the kata-containers-k8s-tests namespace before resetting the namespace
to ensure that no deployments or services are restarting and creating pods in the default namespace.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Wang, Arron <arron.wang@intel.com>
2024-07-10 12:08:28 +08:00
ChengyuZhu6
e874c8fa2e tests: Delete test scripts forcely
Delete test scripts forcely in `Delete kata-deploy` step before
deleting all kata pods.

Fixes: #9980

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-10 12:08:28 +08:00
Alex Lyn
806e959b01 runtime-rs: bugfix for device slot allocation failed in dragonball
In dragonball Vfio device passthrough scenarois, the first passthrough
device will be allocated slot 0 which is occupied by root device.
It will cause error, looks like as below:
```
...
6: failed to add VFIO passthrough device: NoResource\n
7: no resource available for VFIO device"): unknown
...
```
To address such problem, we adopt another method with no pre-allocated
guest device id and just let dragonball auto allocate guest device id
and return it to runtime. With this idea, add_device will return value
Result<DeviceType> and apply the change to related code.

Fixes #9813

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-10 10:59:57 +08:00
Alex Lyn
27947cbb0b dragonball: make add vfio device return guest device id
Fixes #9813

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-10 10:59:51 +08:00
Alex Lyn
fa4af09658 Merge pull request #9985 from GabyCT/topic/fixcrites
cri-containerd: Remove use_devmapper variable for cri-containerd tests
2024-07-10 10:13:27 +08:00
Alex Lyn
e4997760f1 Merge pull request #9987 from kata-containers/remove_double_process_check_from_memory_usage_test
metrics: Remove duplicate check of processes from memory test.
2024-07-10 10:12:18 +08:00
David Esparza
09f523c815 Merge pull request #9973 from kata-containers/add_memory_and_vcpus_info_to_results
Add memory and vcpus info to metrics results
2024-07-09 18:05:07 -06:00
David Esparza
e77d44614b metrics: Remove duplicate check of processes from memory test.
This PR removes the common_init function call from the memory
usage script to eliminate duplicate checking that is also done
from the init_env function.

It also eliminates duplicaction of nested conditionals.

Fixes: #9984

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-07-09 12:34:51 -06:00
Gabriela Cervantes
7061272b4e kernel: bump kata config version
This PR bumps the kata config version as the kernel scripts were
modified.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-09 20:04:24 +02:00
Gabriela Cervantes
de848c1458 packaging: Remove CI variable from build kernel script
This PR removes the CI variable from build kernel script which
is not longer supported it as this was part of the jenkins
environment.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-09 20:04:24 +02:00
Gabriela Cervantes
28601b51d2 tools: Remove CI variable in kata deploy in docker script
This PR removes the CI variable in kata deploy in docker script
which was supported it in jenkins environment which is not
longer being supported it.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-09 20:04:24 +02:00
Gabriela Cervantes
f2b8c6619d makefile: Remove CI variable from local build makefile
This PR removes the CI variable from the local build makefile as
this was part of the jenkins environment which is not longer supported
it.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-09 20:04:24 +02:00
Gabriela Cervantes
4161fa3792 tools: Remove CI variable in test images script for osbuilder
This PR removes the CI variable in test images script for osbuilder
as this was part of the jenkins environment which is not longer supported
it.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-09 20:04:24 +02:00
Greg Kurz
7506d1ec29 tools: Remove CI variable in test config osbuilder script
This PR removes the CI variable in test config osbuilder script
which was supported on the jenkins environment which is not
longer supported it.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
[greg: squash all fixes into a single patch]
Signed-off-by: Greg Kurz <groug@kaod.org>
2024-07-09 20:03:08 +02:00
Niteesh Dubey
647dad2a00 gha: skip SNP attestation test
Skip the SNP attestation test for now.

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-07-09 17:16:07 +00:00
Niteesh Dubey
e7b4e5e386 gha: add SNP attestation test
This tests the attestation of SNP guest.

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-07-09 17:14:26 +00:00
Gabriela Cervantes
1a1e62b968 cri-containerd: Remove use_devmapper variable for cri-containerd tests
This PR removes the use_devmapper variable which was part of the jenkins
environment flags which is not longer support it or available for the
cri-containerd tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-09 17:09:55 +00:00
GabyCT
eb0bc5007c Merge pull request #9976 from sprt/fix-cri-containerd
tests: cri-containerd: Ensure Docker isn't present
2024-07-09 11:02:20 -06:00
David Esparza
04df85a44f metrics: Add num_vcpus and free_mem to metrics results template.
This PR retrieves the free memory and the vcpus count from
a kata container and includes them to the json results file of
any metric.

Additionally this PR parses the requested vcpus quantity and the
requested amount memory from kata configuration file and includes
this pair of values into the json results file of any metric.

Finally, the file system defined in the kata configuration file
is included in the results template.

Fixes: #9972

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-07-09 10:29:29 -06:00
David Esparza
a554541495 metrics: Improvement to the description of certain functions.
This PR rephrased the description and usage of certain functions
as such as:
- set_kata_configuration_performance
- set_kata_config_file
- get_current_kata_config_file
- check_if_root
- check_ctr_images

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-07-09 10:29:29 -06:00
stevenhorsman
c7cf26fa32 gha: make run-k8s-tests-on-zvsi inherit secrets
run-k8s-tests-on-zvsi runs the coco tests and we've added new
secrets to provide credentials for the authenticated image testing,
so we need to let the zvsi job inherit these from the caller workflow
like the rest of the coco tests

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-09 15:29:48 +01:00
Hyounggyu Choi
37b907dfbc Merge pull request #9859 from BbolroC/set-ocispec-for-vfio-ap
tests: Extend vfio-ap hotplug test to use a zcrypttest tool
2024-07-09 14:03:45 +02:00
Steve Horsman
ff498c55d1 Merge pull request #9719 from fitzthum/sealed-secret
Support Confidential Sealed Secrets (as env vars)
2024-07-09 09:43:51 +01:00
Niteesh Dubey
529660fafb runtime: pass certificates for SNP coco
This will be used to get extended attestation report.

Fixes: #9805

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-07-09 03:46:00 +00:00
Tim Zhang
704da86e9b CI: Add tests for stdio
Add tests for stdio

Signed-off-by: Tim Zhang <tim@hyper.sh>
2024-07-09 11:44:40 +08:00
Tim Zhang
8801554889 runtime-rs: Fix ctr exec stuck problem
Fixes: #9532

Instead of call agent.close_stdin in close_io, we call agent.write_stdin
with 0 len data when the stdin pipe ends.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2024-07-09 11:44:36 +08:00
Tobin Feldman-Fitzthum
1c2d69ded7 tests: add test for sealed env secrets
The sealed secret test depends on the KBS to provide
the unsealed value of a vault secret.

This secret is provisioned to an environment variable.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2024-07-08 17:41:20 -05:00
Linda Yu
b4d61f887b agent: unittest for sealed secret as env in kata
To test unsealing secrets stored in environment variables,
we create a simple test server that takes the place of
the CDH. We start this server and then use it to
unseal a test secret.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-07-08 17:32:45 -05:00
Linda Yu
6003608fe6 agent: support sealed secret as env in kata
When sealed-secret is enabled, the Kata Agent
intercepts environment variables containing
sealed secrets and uses the CDH to unseal
the value.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-07-08 17:31:33 -05:00
Gabriela Cervantes
cf2d5ff4c1 scrips: Fix indentation in QAT run script
This PR fixes the indentation of the QAT run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-08 20:23:50 +00:00
Gabriela Cervantes
d53eb61856 QAT: Remove CI variable from QAT run script
This PR removes the CI variable from QAT run script which was used
in the jenkins environment and not longer used.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-08 20:16:00 +00:00
Gabriela Cervantes
8a79b1449e tests: Remove CI variable in tracing test
This PR removes the CI variable as well as the instructions related
to this as this was part of the jenkins environment which is not
longer supported it.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-08 20:12:41 +00:00
Gabriela Cervantes
9d44abb406 tests: Remove CI variable in test agent shutdown
This PR removes the CI variable as well as the instructions related
to this variable which was used on the jenkins environment and not
longer supported.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-08 20:10:24 +00:00
Gabriela Cervantes
f2ed8dc568 docs: Remove CI variable from Intel QAT documentation
This PR updates the Intel QAT documentation by removing the CI variable
which is not longer being supported as this was part of the jenkins
CI environment.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-08 20:05:47 +00:00
Gabriela Cervantes
ff06ef0bbc scripts: Eliminate CI variable as it is not longer used
This PR removes the CI variable which is not longer being used or valid
in the kata containers repository. The CI variable was used when we
were using jenkins and scripts setups which are not longer supported.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-08 20:00:30 +00:00
GabyCT
cb0fb91bdd Merge pull request #9966 from GabyCT/topic/fixstability
tests: Use variable already defined in metrics common script for stability tests
2024-07-08 13:55:55 -06:00
Aurélien Bombo
e9d6179b28 tests: cri-containerd: Ensure Docker isn't present
Following #9960 that transitioned this test to a free runner, we need to
ensure Docker isn't installed on the system as that will conflict with
the installation of Podman.

Example error:
https://github.com/kata-containers/kata-containers/actions/runs/9818218975/job/27177785716

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-08 18:50:57 +00:00
Steve Horsman
e8836fafaa Merge pull request #9828 from stevenhorsman/image-rs-bump-bad84c7
Image rs bump to latest main
2024-07-08 17:07:59 +01:00
Fabiano Fidêncio
67ba0ad0ad Merge pull request #9971 from GabyCT/topic/fixnerdctldep
gha: Fix pip installation for nerdctl GHA
2024-07-06 21:37:55 +02:00
Gabriela Cervantes
724b2c612c gha: Fix pip installation for nerdctl GHA
This PR fixes the pip installation for nerdctl by removing a flag
which is not longer supported and avoid the failure of
no such option: --break-system-packages.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-05 17:31:52 +00:00
stevenhorsman
1d6c1d1621 test: Add journal logging for debug
- Due to the error we hit with pulling the agnhost
image used in the liveness-probe tests, we want to leave
the console printing to help with debug when we next try
to bump the image-rs version

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-05 10:25:28 +01:00
stevenhorsman
d511820974 agent: Bump image-rs
- Bump the commit of image-rs we are pulling in to 413295415
Note: This is the last commmit before a change to whiteout handling
was introduced that lead to the error `'failed to unpack: convert whiteout"`
when pulling the agnhost:2.21 image

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-05 10:25:28 +01:00
Fabiano Fidêncio
543c90f145 Merge pull request #9695 from ChengyuZhu6/fix-init
Fix issues on CI about guest-pull
2024-07-05 11:21:08 +02:00
ChengyuZhu6
65dc12d791 tests: Re-enable k8s-kill-all-process-in-container.bats
This test was fixed by previous patches in this PR: kata-containers#9695

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
ChengyuZhu6
2ea521db5e tests:tdx: Re-enable k8s-liveness-probes.bats
This test was fixed by previous patches in this PR: kata-containers#9695

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
ChengyuZhu6
93453c37d6 tests: Re-enable k8s-sysctls.bats
This test was fixed by previous patches in this PR: kata-containers#9695

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
ChengyuZhu6
6c5e053dd5 tests: Re-enable k8s-shared-volume.bats
This test was fixed by previous patches in this PR: kata-containers#9695

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
ChengyuZhu6
85979021b3 tests: Re-enable k8s-file-volume.bats
This test was fixed by previous patches in this PR: kata-containers#9695

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
ChengyuZhu6
e71c7ab932 agent/image: Remove functions about merging container spec for guest pull
Let me explain why:

In our previous approach, we implemented guest pull by passing PullImageRequest to the guest.
However, this method  resulted in the loss of specifications essential for running the container,
such as commands specified in YAML, during the CreateContainer stage. To address this,
it is necessary to integrate the OCI specifications and process information
from the image’s configuration with the container in guest pull.

The snapshotter method does not care this issue. Nevertheless, a problem arises
when two containers in the same pod attempt to pull the same image, like InitContainer.
This is because the image service searches for the existing configuration,
which resides in the guest. The configuration, associated with <image name, cid>,
is stored in the directory /run/kata-containers/<cid>. Consequently, when the InitContainer finishes
its task and terminates, the directory ceases to exist. As a result, during the creation
of the application container, the OCI spec and process information cannot
be merged due to the absence of the expected configuration file.

Fixes: kata-containers#9665
Fixes: kata-containers#9666
Fixes: kata-containers#9667
Fixes: kata-containers#9668

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
ChengyuZhu6
c9d1a758cd agent/image: Reuse the mountpoint in image-rs
Currently, the image is pulled by image-rs in the guest and mounted at
`/run/kata-containers/image/cid/rootfs`. Finally, the agent rebinds
`/run/kata-containers/image/cid/rootfs` to `/run/kata-containers/cid/rootfs` in CreateContainer.
However, this process requires specific cleanup steps for these mount points.

To simplify, we reuse the mount point `/run/kata-containers/cid/rootfs`
and allow image-rs to directly mount the image there, eliminating the need for rebinding.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
stevenhorsman
05cd1cc7a0 agent: Add CreateContainer support for pre-pulled bundle
- Add a check in setup_bundle to see if the bundle already exists
and if it does then skip the setup.

This commit is cherry-picked from 44ed3ab80e.

The reason that k8s-kill-all-process-in-container.bats failed is that
deletion of the directory `/root/kata-containers/cid/rootfs` failed during removing container
because it was mounted twice (one in image-rs and one in set_bundle ) and only unmounted once in removing container.

Fixes: #9664

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Dave Hay <david_hay@uk.ibm.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-05 08:10:00 +08:00
Zvonko Kaiser
7990d3a154 dragonball: Update kata config version
Mandatory update

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-07-04 17:24:16 +00:00
Zvonko Kaiser
cfbca4fe0d dragonball: Update versions
Use the latest guest kernel that we use for all other VMMs

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-07-04 17:24:16 +00:00
Zvonko Kaiser
26446d1edb dragonball: Update patches
After v5.14 there is no cpu_hotplug_begin function
now cpus_write_lock same for cpu_hotplug_done = cpus_write_unlock

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-07-04 17:23:24 +00:00
Zvonko Kaiser
ad574b7e10 dragonball: Add patches for 6.1.x
Ported the 5.10 patchs to 6.1.x

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-07-04 17:06:39 +00:00
Gabriela Cervantes
757f37d956 stability: General improvements for soak parallel test
This PR has better variable definitons as well the use of a variable
which is already defined in the metrics common script for soak parallel
test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-04 16:32:46 +00:00
Gabriela Cervantes
6d56abbdad stability: General improvements to agent stability test
This PR is for better variable definitions as well as the use of the
CTR_EXE variable which is already defined in the metrics common script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-04 16:24:27 +00:00
Gabriela Cervantes
3e6c32c3c8 tests: Use variable already defined in stability tests
This PR uses the CTR_EXE which is already defined in the metrics common
script to have uniformity across the multiple stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-04 16:21:24 +00:00
Steve Horsman
ddb8a94677 Merge pull request #9960 from sprt/fix-garm
ci: Transition GARM tests to free runners, pt. I
2024-07-04 09:04:58 +01:00
Biao Lu
6c1a2f01f8 protocols: add support for sealed_secret service
To unseal a secret, the Kata agent will contact the CDH
using ttRPC. Add the proto that describes the sealed
secret service and messages that will be used.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Biao Lu <biao.lu@intel.com>
2024-07-04 01:03:41 -05:00
Fabiano Fidêncio
49696bbdf2 Merge pull request #9943 from AdithyaKrishnan/nydus-cleanup-timeout
tests: Fixes TEE timeout issue
2024-07-03 22:57:17 +02:00
Anastassios Nanos
db75b5f3c4 Merge pull request #8070 from nubificus/feat_add-fc-runtime-rs
runtime-rs: firecracker hypervisor backend
2024-07-03 22:29:30 +03:00
Adithya Krishnan Kannan
9250858c3e tests: Stop trying to patch finalize
We have not seen instances of the nydus snapshotter hanging on its
deletion that we must patch its finalize.

Let's just drop this line for now.

Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-07-03 12:19:26 -05:00
Dan Mihai
ada53744ea Merge pull request #9907 from microsoft/saulparedes/allow_empty_env_vars
genpolicy: allow some empty env vars
2024-07-03 08:07:23 -07:00
Aurélien Bombo
f18e35014f ci: Move run-nerdctl-tests to free runner
See #9940.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-03 14:58:11 +00:00
Aurélien Bombo
c0919d6f45 ci: Move run-docker-tests to free runner
Removed the Docker installation step as that's preinstalled in free
runners.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-03 14:57:59 +00:00
Aurélien Bombo
743a765525 ci: Move run-runk to free runner
See #9940.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-03 14:57:48 +00:00
Aurélien Bombo
09cce86cc7 ci: Move run-nydus to free runner
See #9940.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-03 14:57:42 +00:00
Aurélien Bombo
9e1b6064dc ci: Move run-containerd-stability to free runner
Removes the Docker installation step as that's preinstalled on the free
runner:

https://github.com/actions/runner-images/blob/main/images/ubuntu/Ubuntu2204-Readme.md#tools

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-03 14:57:37 +00:00
Aurélien Bombo
6a0e403acf ci: Move run-cri-containerd to free runner
See #9940.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-03 14:57:29 +00:00
George Pyrros
2d19f3fbd7 runtime-rs: firecracker hypervisor backend
Add a basic runtime-rs `Hypervisor` trait implementation for
AWS Firecracker

- Add basic hypervisor operations (setup / start / stop / add_device)
- Implement AWS Firecracker API on a separate file `fc_api.rs`
- Add support for running jailed (include all sandbox-related content)
- Add initial device support (limited as hotplug is not supported)
- Add separate config for runtime-rs (FC)

Notes:
- devmapper is the only snapshotter supported
- to account for no sharefs support, we copy files in the sandbox (as
  in the GO runtime)
- nerdctl spawn is broken (TODO: #7703)

Fixes: #5268

Signed-off-by: George Pyrros <gpyrros@nubificus.co.uk>
Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
Signed-off-by: Charalampos Mainas <cmainas@nubificus.co.uk>
Signed-off-by: George Ntoutsos <gntouts@nubificus.co.uk>
2024-07-03 08:30:30 +00:00
GabyCT
e3e3873857 Merge pull request #9954 from GabyCT/topic/sysbenchci
metrics: Remove variable in sysbench that is not being used
2024-07-02 16:58:46 -06:00
GabyCT
0590aab3e6 Merge pull request #9952 from GabyCT/topic/unitjenkins
docs: Remove jenkins reference from unit testing presentation
2024-07-02 15:34:25 -06:00
Aurélien Bombo
33d08a8417 Merge pull request #9825 from microsoft/mahuber/main
osbuilder: allow rootfs builds w/o git or version file deps
2024-07-02 09:38:13 -07:00
Steve Horsman
078a1147a6 Merge pull request #9909 from kata-containers/sprt/gha-cleanup-pt2
ci: Add scheduled job to cleanup resources, pt. II
2024-07-02 17:12:03 +01:00
Gabriela Cervantes
b7da1291ea metrics: Remove variable in sysbench that is not being used
This PR removes the CI_JOB variable which previously was used but
not longer being supported of the metrics sysbench test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-02 15:29:50 +00:00
Wainer Moschetta
ec695f67e1 Merge pull request #9577 from microsoft/saulparedes/topology
genpolicy: add topologySpreadConstraints support
2024-07-02 11:24:26 -03:00
Fabiano Fidêncio
ef3f6515cf Merge pull request #9941 from sprt/temp-disable-test
ci: Temporarily disable kata-deploy and GARM tests
2024-07-02 14:13:46 +02:00
Amulya Meka
dd12089e0d Merge pull request #9914 from Amulyam24/qemu-fix
kata-deploy: fix qemu static build on ppc64le
2024-07-02 10:45:03 +05:30
Saul Paredes
f3f3caa80a genpolicy: update sample
Update pod-one-container.yaml sample

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-01 13:49:08 -07:00
Dan Mihai
75aee526a9 genpolicy: add topologySpreadConstraints support
Allow genpolicy to process Pod YAML files including
topologySpreadConstraints.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-01 13:32:49 -07:00
Gabriela Cervantes
c270df7a9c docs: Remove jenkins reference from unit testing presentation
This PR removes the jenkins reference from unit testing presentation
as this is not longer supported on the kata containers project.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-01 20:26:35 +00:00
GabyCT
e94490232e Merge pull request #9949 from cmaf/tests-fix-openvino-help
tests: Update help section in openvino test
2024-07-01 13:31:51 -06:00
Gabriela Cervantes
e3318a04f7 metrics: Update container name in blogbench test
This PR updates the container name to put a random name instead
of using a hard coded name. This PR is a general improvement
to avoid random bug failures specially when we are running on
baremetal environments.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-01 19:28:16 +00:00
Fabiano Fidêncio
05848d0c34 Merge pull request #9930 from likebreath/0627/clh_v40.0
Upgrade to Cloud Hypervisor v40.0
2024-07-01 20:04:47 +02:00
Steve Horsman
4fd820abd2 Merge pull request #9947 from stevenhorsman/fix-cleanups-workflow-secret
gha: ci: Remove incorrect secrets line
2024-07-01 16:30:37 +01:00
Chelsea Mafrica
0b83c8549a tests: Update help section in openvino test
Test reports that it is a onednn test when it is openvino; update
description.

Fixes: #9948

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2024-07-01 14:24:50 +00:00
Hyounggyu Choi
795c5dc0ff tests: Extend vfio-ap hotplug test to use zcrypttest
This commit extends the vfio-ap hotplug test to include the use of `zcrypttest`.
A newly introduced test by the tool consists of several test rounds as follows:

- ioctl_test
- simple_test
- simple_one_thread_test
- simple_multi_threads_test
- multi_thread_stress_test
- hang_after_offline_online_test

A writable root filesystem is required for testing because the reference count
needs to be reset after each test round. The current containerd kata containers
support does not include `--privileged_without_host_devices`, which is necessary
to configure a writable filesystem along with `--privileged`. (Please check out
https://github.com/kata-containers/kata-containers/issues/9791 for details)

So `crictl` is chosen to extend the test.

The commit also includes the removal of old commands previously used for the
tests repository but no longer in use.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-01 11:41:59 +02:00
Hyounggyu Choi
5bda197e9d tests: Add zcrypttest tool to test image Dockerfile
This commit copies an internal testing tool `zcrypttest` to the
test image. A base image is changed to `ubuntu:22.04` due to a
library dependency issue.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-01 11:40:49 +02:00
Hyounggyu Choi
99690ab202 runtime: Instantiate/pass vfio-ap device to ociSpec
This commit adds the missing step of passing an attached vfio-ap device
to a container via ociSpec. It instantiates and passes a vfio-ap device
(e.g. a Z crypto device).
A device at `/dev/z90crypt` covers all use cases at the time of writing.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-01 11:40:49 +02:00
Amulyam24
259ec408b5 kata-deploy: fix qemu static build for v8.2.1 on ppc64le
Do not install the packages librados-dev and librbd-dev as they are not needed for building static qemu.

Add machine option cap-ail-mode-3=off while creating the VM to qemu cmdline.
Fixes: #9893

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-07-01 14:56:43 +05:30
stevenhorsman
16130e473c gha: ci: Remove incorrect secrets line
The CI is failing with:
```
Invalid workflow file: .github/workflows/cleanup-resources.yaml#L10
The workflow is not valid. .github/workflows/cleanup-resources.yaml (Line: 10, Col: 5): Unexpected value 'secrets'
```
I think this is because `secrets: inherit` is only applicable
when re-using a workflow, not for a standalone job like
we have here.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-01 09:32:58 +01:00
Hyounggyu Choi
f0187ff969 Merge pull request #9932 from BbolroC/drop-ci-install-go
CI: Eliminate dependency on tests repo
2024-07-01 08:24:28 +02:00
Hyounggyu Choi
f2bfc306a2 Merge pull request #9936 from BbolroC/use-quay-lpine-bash-curl
CI: Use multi-arch image for alpine-bash-curl
2024-07-01 08:02:01 +02:00
Manuel Huber
4b2e725d03 rootfs: Install Rust only when necessary
For docker-based builds only install Rust when necessary.
Further, execute the detect Rust version check only when
intending to install Rust.
As of today, this is the case when we intend to build the
agent during rootfs build.

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
2024-06-28 22:19:46 +00:00
Aurélien Bombo
c605fff4c1 ci: Temporarily disable kata-deploy and GARM tests
Per the decision taken in the 6/27 AC meeting, this PR temporarily
disables kata-deploy and GARM tests until we secure further Azure CI
funding.

In the meantime, I'll transition the GARM tests to free runners and
reenable them to regain that coverage without affecting spending (see
#9940). If it turns out the free runners are too slow, we'll switch back
to GARM.

After funding is secured, we'll reenable the kata-deploy tests (see
#9939).

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-06-28 20:23:07 +00:00
Hyounggyu Choi
dd23beeb05 CI: Eliminating dependency on clone_tests_repo()
As part of archiving the tests repo, we are eliminating the dependency on
`clone_tests_repo()`. The scripts using the function is as follows:

- `ci/install_rust.sh`.
- `ci/setup.sh`
- `ci/lib.sh`

This commit removes or replaces the files, and makes an adjustment accordingly.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-28 14:52:02 +02:00
Hyounggyu Choi
f2c5f18952 CI: Use multi-arch image for alpine-bash-curl
A multi-arch image for `alpine-bash-curl` has been pushed to and available
at `quay.io/kata-containers`.

This commit switches the test image to `quay.io/kata-containers/alpine-bash-curl`.

Fixes: #9935

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-28 12:01:53 +02:00
Hyounggyu Choi
0e20f60534 CI: Drop unused scripts
The following scripts are not used by the repository any more:

- ci/install_go.sh
- ci/run.sh
- ci/install_vc.sh

Additionally, they rely on the tests repo, which is soon to be archived.

This commit drops the unused scripts.

Fixes: #8507

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-28 07:55:21 +02:00
Zvonko Kaiser
a32b21bd32 Merge pull request #9918 from zvonkok/build-error
rootfs: Fix spurious error
2024-06-27 19:46:51 +02:00
Bo Chen
25e3cab028 runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v40.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #9929

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-06-27 09:59:00 -07:00
Bo Chen
ad92d73e43 versions: Upgrade to Cloud Hypervisor v40.0
Details of this release can be found in our roadmap project as iteration
v40.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Fixes: #9929

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-06-27 09:40:13 -07:00
Alex Lyn
d66c214ae7 Merge pull request #9849 from markyangcc/main
runtime: fix missing of VhostUserDeviceReconnect parameter assignment
2024-06-27 21:48:37 +08:00
Wainer Moschetta
afc1c1a782 Merge pull request #9896 from fitzthum/bump-gc-090
versions: bump coco guest components and trustee
2024-06-27 09:46:06 -03:00
Zvonko Kaiser
29bb9de864 Merge pull request #9923 from BbolroC/increase-interval-max-tries-kubectl
tests: Increase interval and max_tries for kubectl_retry
2024-06-27 09:49:24 +02:00
Hyounggyu Choi
4ec355fb78 tests: Increase interval and max_tries for kubectl_retry
Observed instability in the API server after deploying kata-deploy caused test failures.
(see: https://github.com/kata-containers/kata-containers/actions/runs/9681494440/job/26743286861)
Specifically, `kubectl_retry logs` failed before the API server could respond properly.

This commit increases the interval and max_tries for kubectl_retry(), allowing sufficient
time to handle this situation.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-27 08:39:22 +02:00
Aurélien Bombo
2c89828749 ci: Add scheduled job to cleanup resources, pt. II
Follow-up to #9898 and final PR of this set. This implements the actual
deletion logic.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-06-26 17:36:47 +00:00
Zvonko Kaiser
893fd2b59c Merge pull request #9916 from zvonkok/config-fix
gpu: Missing separator
2024-06-26 14:46:47 +02:00
Greg Kurz
fe7ef878d2 Merge pull request #9913 from gkurz/update-kata-ctl-deps
kata-ctl: Update Cargo.lock
2024-06-26 14:31:03 +02:00
Zvonko Kaiser
30ec78b19a rootfs: Fix spurious error
In some DMZ'ed or CI systems the repos are not up to date
and multistrap fails to find the ubuntu-keyring package.
Update the repos to fix this;

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-26 11:10:58 +00:00
Zvonko Kaiser
e0aa54301f gpu: Missing separator
Add the correct separator for replacement

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-26 10:40:35 +00:00
Greg Kurz
ac33a389c0 Merge pull request #9879 from pmores/remove-dependency-on-containerd-bundle-dir-tree
runtime-rs: remove attempt to access sandbox bundle from container bu…
2024-06-26 10:57:50 +02:00
Greg Kurz
db7b2f7aaa kata-ctl: Update Cargo.lock
A previous change missed to refresh Cargo.lock.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-06-26 08:27:52 +02:00
Tobin Feldman-Fitzthum
dd8605917b versions: bump coco guest components and trustee
Pick up the changes from the newest version of guest-components
and trustee.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2024-06-25 23:56:18 +00:00
GabyCT
81d23a1865 Merge pull request #9897 from GabyCT/topic/montime
tests: Increase timeout to crictl calls on kata monitor tests
2024-06-25 17:27:15 -06:00
Gabriela Cervantes
a8432880f8 tests: Increase timeout to crictl calls on kata monitor tests
This PR increases the timeout to crictl calls on kata monitor
tests to avoid to hit issues every now and avoid random failures.
This PR is very similar to PR #7640.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-25 22:32:47 +00:00
Wainer Moschetta
c4fb6fbda2 Merge pull request #9887 from ldoktor/ci-kata-runtime
ci.ocp: Ensure we smoke-test with the right runtime class
2024-06-25 15:27:27 -03:00
Fabiano Fidêncio
fb44edc22f Merge pull request #9906 from stevenhorsman/TEE-sample-kbs-policy-guards
tests: attestation: Restrict sample policy use
2024-06-25 20:27:13 +02:00
Steve Horsman
c9df743dab Merge pull request #9898 from sprt/gha-cleanup-job
ci: Add scheduled job to cleanup resources, pt. I
2024-06-25 19:11:30 +01:00
Saul Paredes
ce19419d72 genpolicy: allow some empty env vars
Updated genpolicy settings to allow 2 empty environment variables that
may be forgotten to specify (AZURE_CLIENT_ID and AZURE_TENANT_ID)

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-06-25 10:53:05 -07:00
Aurélien Bombo
0582a9c75b Merge pull request #9864 from 3u13r/feat/genpolicy/layers-cache-file-path
genpolicy: allow specifying layer cache file
2024-06-25 10:42:22 -07:00
Aurélien Bombo
d60b548d61 ci: Add scheduled job to cleanup resources
This is the first part of adding a job to clean up potentially dangling
Azure resources. This will be based on Jeremi's tool from
https://github.com/jepio/kata-azure-automation.

At first, we'll only clean up AKS clusters, as this is what has been
causing us problems lately, but this could very well be extended to
cleaning up entire resource groups, which is why I left the different
names pretty generic (i.e. "resources" instead of "clusters").

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-06-25 16:33:03 +00:00
stevenhorsman
7610b34426 tests: attestation: Restrict sample policy use
- We only want to enable the sample verifier in the KBS for non-TEE
tests, so prevent an edge case where the TEE platform isn't set up
correctly and we might fall back to the sample and get false positives.
To prevent this we add guards around the sample policy enablement and
only run it for non confidential hardware

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-25 16:59:40 +01:00
Steve Horsman
d574d37c4b Merge pull request #9903 from stevenhorsman/authenticated-regsitry-workflow-secrets
workflow: coco: Add auth registry secret
2024-06-25 16:40:46 +01:00
stevenhorsman
d8961cbd4a workflow: coco: Add auth registry secret
- Add the `AUTHENTICATED_IMAGE_USER` and
`AUTHENTICATED_IMAGE_PASSWORD` repository secrets as env vars
to the coco tests, so we can use them to pull an images from
and authenticated registry for testing

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-25 11:11:02 +01:00
Alex Lyn
2c5b3a5c20 Merge pull request #9830 from gaohuatao-1/ght/count-rs
runtime-rs: fix the bug of func count_files
2024-06-25 15:00:46 +08:00
GabyCT
27d75f93e2 Merge pull request #9872 from GabyCT/topic/varmemin
metrics: Improve variable definition in memory inside containers script
2024-06-24 15:30:05 -06:00
Aurélien Bombo
b0cdf4eb0d Merge pull request #9579 from microsoft/saulparedes/add_seccomp_support
genpolicy: ignore SeccompProfile in PodSpec
2024-06-24 08:58:01 -07:00
Wainer Moschetta
bcdc4fde10 Merge pull request #9857 from wainersm/disable_failing_jobs-part2
CI: disable jobs that failed >= 50% on nightly CI recently - part 2
2024-06-24 10:11:05 -03:00
Leonard Cohnen
6a3ed38140 genpolicy: allow specifying layer cache file
Add --layers-cache-file-path flag to allow the user to
specify where the cache file for the container layers
is saved. This allows e.g. to have one cache file
independent of the user's working directory.

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2024-06-24 14:53:27 +02:00
Fabiano Fidêncio
3adf9e250f Merge pull request #9875 from zvonkok/gha-no-sudo-arm64
ci: gha no sudo arm64
2024-06-21 15:28:54 +02:00
Wainer Moschetta
f7e0d6313b Merge pull request #9865 from wainersm/qemu-coco-dev_updates
runtime: updates to qemu-coco-dev configuration
2024-06-21 10:14:30 -03:00
Fabiano Fidêncio
2d552800f2 Merge pull request #9876 from zvonkok/gha-no-sudo-s390x
ci: remove sudo from s390x build
2024-06-21 15:00:31 +02:00
Saul Paredes
44afb4aa5f genpolicy: ignore SeccompProfile in PodSpec
Ignore SeccompProfile in PodSpec

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-06-20 09:42:17 -07:00
Dan Mihai
7aeaf2502a Merge pull request #9856 from microsoft/danmihai1/new-policy-rules
genpolicy: reject untested CreateContainer field values
2024-06-20 09:34:53 -07:00
GabyCT
9320c2e484 Merge pull request #9845 from GabyCT/topic/fixartifacts
gha: Do not fail when collecting artifacts
2024-06-20 10:15:53 -06:00
Hyounggyu Choi
959a277dc5 Merge pull request #9886 from BbolroC/kernel-config-uv-uapi-s390x
kernel: Add CONFIG_S390_UV_UAPI for s390x
2024-06-20 16:05:15 +02:00
Steve Horsman
d5b4da7331 Merge pull request #9881 from stevenhorsman/remote-hypervisor-policy
runtime: Support policy in remote hypervisor
2024-06-20 14:01:29 +01:00
Hyounggyu Choi
9cb12dfa88 kernel: Add CONFIG_S390_UV_UAPI for s390x
While enabling the attestation for IBM SE, it was observed that
a kernel config `CONFIG_S390_UV_UAPI` is missing.
This config is required to present an ultravisor in the guest VM.
Ths commit adds the missing config.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-20 13:15:33 +02:00
Lukáš Doktor
b08c019003 ci.ocp: Ensure we smoke-test with the right runtime class
we do encourage people to set the KATA_RUNTIME, but it is only used by
the webhook. Let's define it in the main `test.sh` and use it in the
smoke test to ensure the user-defined runtime is smoke-tested rather
than hard-coded kata-qemu one.

Related to: #9804

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-06-20 11:15:02 +02:00
Fabiano Fidêncio
0f2a4d202e Merge pull request #9884 from fidencio/topic/re-enable-tdx-ci
ci: tdx: Re-enable TDX CI
2024-06-20 06:39:06 +02:00
GabyCT
02075f73e9 Merge pull request #9874 from GabyCT/topic/fixvarnerdctl
tests: nerdctl: Fix variables names and remove network
2024-06-19 13:43:25 -06:00
Fabiano Fidêncio
2bab0f31d7 ci: tdx: Re-enable TDX CI
Now, using vanilla kubernetes, let's re-enable the TDX CI and hope it
becomes more stable than it used to be.

The cleanup-snapshotter is now taking ~4 minutes, and that matches with
the other platforms, mainly considering there's a sum of 210 seconds
sleep in the process.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-19 20:08:28 +02:00
Greg Kurz
81972f6ffc Merge pull request #9149 from ryansavino/upgrade-to-qemu-8.2.1
qemu: upgrade to 8.2.4
2024-06-19 19:10:02 +02:00
stevenhorsman
779754dcf6 runtime: Support policy in remote hypervisor
Move the `sandbox.agent.setPolicy` call out of the remoteHypervisor
if, block, so we can use the policy implementation on peer pods

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-19 16:43:53 +01:00
Fabiano Fidêncio
f9862e054c Merge pull request #9882 from fidencio/topic/ci-tdx-use-vanilla-k8s
ci: tdx: Use vanilla k8s instead of k3s
2024-06-19 17:33:00 +02:00
Pavel Mores
6a4919eeb9 runtime-rs: fix misleading log message
get_vmm_master_tid() currently returns an error with the message "cannot
get qemu pid (though it seems running)" when it finds a valid
QemuInner::qemu_process instance but fails to extract the PID out of it.

This condition however in fact means that a qemu child process was running
(otherwise QemuInner::qemu_process would be None) but isn't anymore (id()
returns None).

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-19 17:15:24 +02:00
Pavel Mores
af5492e773 runtime-rs: made Qemu::stop_vm() idempotent
Since Hypervisor::stop_vm() is called from the WaitProcess request handling
which appears to be per-container, it can be called multiple times during
kata pod shutdown.  Currently the function errors out on any subsequent
call after the initial one since there's no VM to stop anymore.  This
commit makes the function tolerate that condition.

While it seems conceivable that sandbox shouldn't be stopped by WaitProcess
handling, and the right fix would then have to happen elsewhere, this
commit at least makes qemu driver's behaviour consistent with other
hypervisor drivers in runtime-rs.

We also slightly improve the error message in case there's no
QemuInner::qemu_process instance.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-19 17:15:24 +02:00
Pavel Mores
5fbbff9e5e runtime-rs: remove attempt to access sandbox bundle from container bundle
Since no objections were raised in the linked issue (#9847) this commit
removes the attempt to derive sandbox bundle path from container bundle
path.  As described in more detail in the linked issue, this is container
runtime specific and doesn't seem to serve any purpose.

As for implementation, we hoist the only part of
get_shim_info_from_sandbox() that's still useful (getting the socket
address) directly into the caller and remove the function altogether.

Fixes #9847

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-19 17:09:15 +02:00
Fabiano Fidêncio
7127178acc ci: tdx: Use vanilla k8s instead of k3s
We've noticed a bunch of issues related to deploying and deleting the
nydus-snapshotter.  As we don't see the same issues on other machines
using vanilla kubernetes, let's avoid using k3s for now follow the flow.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-19 16:56:15 +02:00
Zvonko Kaiser
beab17f765 Merge pull request #9877 from zvonkok/gha-no-sudo-ppc64
ci: gha no sudo ppc64
2024-06-19 14:02:05 +02:00
Zvonko Kaiser
d783ddaf03 ci: Remove not needed chown for ppc64
Now that all artifacts are owned by $USER no extra step needed
to adjust ownership

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:56:45 +00:00
Zvonko Kaiser
5bc37e39d5 ci: remove sudo from ppc64 build
We can now do the same for ppc64 that we did for amd64 and remove
the sudo cp.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:55:45 +00:00
Zvonko Kaiser
c341234c0b ci: remove sudo from s390x build
We can now do the same for s390x that we did for amd64 and remove
the sudo cp.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:53:33 +00:00
Zvonko Kaiser
3beb460a97 ci: Remove not needed chown for arm64
Now that all artifacts are owned by $USER no extra step needed
to adjust ownership

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:48:00 +00:00
Zvonko Kaiser
445b389b16 ci: remove sudo from arm64 build
We can now do the same for arm64 that we did for amd64 and remove
the sudo cp.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:46:51 +00:00
Gabriela Cervantes
6ec7971f7a tests: nerdctl: Fix variables names and remove network
This PR fixes the variables names for the network that was created as well
removes the network that were created for the tests to ensure a clean environment
when running all the tests and avoid failures specially on baremental environments
that network already exists.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-18 23:00:49 +00:00
Dan Mihai
4df66568cf genpolicy: reject untested CreateContainer field values
Reject CreateContainerRequest field values that are not tested by
Kata CI and that might impact the confidentiality of CoCo Guests.

This change uses a "better safe than sorry" approach to untested
fields. It is very possible that in the future we'll encounter
reasonable use cases that will either:

- Show that some of these fields are benign and don't have to be
  verified by Policy, or
- Show that Policy should verify legitimate values of these fields

These are the new CreateContainerRequest Policy rules:

    count(input.shared_mounts) == 0
    is_null(input.string_user)

    i_oci := input.OCI
    is_null(i_oci.Hooks)
    is_null(i_oci.Linux.Seccomp)
    is_null(i_oci.Solaris)
    is_null(i_oci.Windows)

    i_linux := i_oci.Linux
    count(i_linux.GIDMappings) == 0
    count(i_linux.MountLabel) == 0
    count(i_linux.Resources.Devices) == 0
    count(i_linux.RootfsPropagation) == 0
    count(i_linux.UIDMappings) == 0
    is_null(i_linux.IntelRdt)
    is_null(i_linux.Resources.BlockIO)
    is_null(i_linux.Resources.Network)
    is_null(i_linux.Resources.Pids)
    is_null(i_linux.Seccomp)
    i_linux.Sysctl == {}

    i_process := i_oci.Process
    count(i_process.SelinuxLabel) == 0
    count(i_process.User.Username) == 0

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-06-18 18:09:31 +00:00
Wainer Moschetta
cf372f41bf Merge pull request #9869 from fidencio/topic/disable-tdx-ci
ci: tdx: Disable TDX CI
2024-06-18 14:47:38 -03:00
Gabriela Cervantes
671d9af456 metrics: Improve variable definition in memory inside containers script
This PR improves the variable definition in memory inside
the container script for metrics. This change declares and assigns
the variables separately to avoid masking return values.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-18 16:56:12 +00:00
Gabriela Cervantes
eeb467bdc2 gha: Do not fail when collecting artifacts
This PR will avoid the failures when collecting artifacts for the gha.
This will ensure that we collect and archive system's data for the
purpose of debugging.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-18 16:05:23 +00:00
Zvonko Kaiser
b1909e940e deploy: Add busybox target
For a minimal initrd/image build we may want to leverage busybox.
This is part number two of the NVIDIA initrd/image build

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-18 15:31:00 +00:00
Wainer Moschetta
36093e86e0 Merge pull request #9863 from wainersm/kata-deploy_yq
kata-deploy: always copy ci/install_yq.sh
2024-06-18 10:05:41 -03:00
Fabiano Fidêncio
587f4d45de ci: tdx: Disable TDX CI
TDX CI has been having some issues with the Nydus snapshotter cleanup,
which has been stuck for hours depending every now and then.

With this in mind, let's disable the TDX CI, so we avoid it blocking the
progress of Kata Containers project, and we re-enable it as soon as we
have it solved on Intel's side.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-18 10:30:40 +02:00
markyangcc
a28bf266f9 runtime: fix missing of VhostUserDeviceReconnect parameter assignment
Commit 'ca02c9f5124e' implements the vhost-user-blk reconnection functionality,
However, it has missed assigning VhostUserDeviceReconnect when new the QEMU
HypervisorConfig, resulting in VhostUserDeviceReconnect always set to default value 0.

Real change is this line, most of changes caused by go format,

return vc.HypervisorConfig{
	// ...
	VhostUserDeviceReconnect: h.VhostUserDeviceReconnect,
}, nil

Fixes: #9848
Signed-off-by: markyangcc <mmdou3@163.com>
2024-06-18 12:15:10 +08:00
Alex Lyn
388cd7dde4 Merge pull request #9772 from pmores/add-base-qmp-framework
runtime-rs: add base qmp framework
2024-06-18 09:53:28 +08:00
Alex Lyn
275c498dc9 Merge pull request #9834 from lifupan/main
sandbox: fix the issue of failed to get the vmm master tid
2024-06-18 08:57:21 +08:00
Alex Lyn
d3fb6bfd35 Merge pull request #9860 from stevenhorsman/tokio-vulnerability-bump
Tokio vulnerability bump
2024-06-18 08:35:34 +08:00
Wainer dos Santos Moschetta
bdbee78517 runtime: allow default_{vcpus,memory} annotations to qemu-coco-dev
This is a counterpart of commit abf52420a4 for the qemu-coco-dev
configuration. By allowing default_vcpu and default_memory annotations
users can fine-tune the VM based on the size of the container
image to avoid issues related with pulling large images in the guest.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-17 18:59:52 -03:00
Wainer dos Santos Moschetta
baa8d9d99c runtime: set shared_fs=none to qemu-coco-dev configuration
Just like the TEE configurations (sev, snp, tdx) we want to have the
qemu-coco-dev using shared_fs=none.

Fixes: #9676
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-17 18:42:46 -03:00
Wainer Moschetta
b8d7a8c546 Merge pull request #9862 from BbolroC/improve-kubectl-retry
tests: Use selector rather than pod name for kubectl logs/describe
2024-06-17 18:33:24 -03:00
Hyounggyu Choi
6b065f5609 tests: Use selector rather than pod name for kubectl logs/describe
The following error was observed during the deployment of nydus snapshotter:

```
Error from server (NotFound):
the server could not find the requested resource ( pods/log nydus-snapshotter-5v82v)
  'kubectl logs nydus-snapshotter-5v82v -n nydus-system' failed after 3 tries
    Error: Process completed with exit code 1.
```

This error can occur when a pod is re-created by a daemonset during the retry interval.
This commit addresses the issue by using `--selector` rather than the pod name
for `kubectl logs/describe`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-17 22:27:50 +02:00
Wainer Moschetta
7df221a8f9 Merge pull request #9833 from wainersm/qemu-rs_tests
tests/k8s: run for qemu-runtime-rs on AKS
2024-06-17 16:59:46 -03:00
Zvonko Kaiser
5f11c0f144 Merge pull request #9861 from zvonkok/release-3.6.0
release: Bump VERSIONS file to 3.6.0
2024-06-17 20:35:29 +02:00
Wainer Moschetta
b6a28bd932 Merge pull request #9786 from microsoft/saulparedes/add_back_insecure_registry_pull
genpolicy: add back support for insecure
2024-06-17 15:21:25 -03:00
Wainer Moschetta
68415dabcd Merge pull request #9815 from msanft/fix/genpolicy/flag-name
genpolicy: fix settings path flag name
2024-06-17 15:13:25 -03:00
Wainer dos Santos Moschetta
08eaa60b59 CI: disable all run-kata-deploy-tests-on-garm jobs
The following jobs have failed more than 50% on nightly CI.

run-kata-deploy-tests-on-garm / run-kata-deploy-tests (clh, k0s)
run-kata-deploy-tests-on-garm / run-kata-deploy-tests (clh, rke2)
run-kata-deploy-tests-on-garm / run-kata-deploy-tests (qemu, k0s)

Instead of removing only those jobs, let's skip the kata-deploy-tests
on GARM completely so we can try to fix all the issues (or maybe
drop the jobs altogether).

Issue: #9854
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-17 14:39:38 -03:00
Steve Horsman
4a41cee534 Merge pull request #9838 from zvonkok/gha-no-sudo
CI: remove sudo from GHA
2024-06-17 16:23:39 +01:00
Wainer dos Santos Moschetta
e517167825 kata-deploy: always copy ci/install_yq.sh
To build the build-kata-deploy image, it should be copied ci/install_yq.sh to
tools/packaging/kata-deploy/local-build/dockerbuild as this script will install
yq within the image. Currently, if
tools/packaging/kata-deploy/local-build/dockerbuild/install_yq.sh exists then
make won't copy it again. This can raise problems as, for example, the current
update of yq version (commit c99ba42d) in ci/install_yq.sh won't force the
rebuild of the build-kata-deploy image.

Note: this isn't a problem on a fresh dev or CI environment.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-17 12:18:22 -03:00
Zvonko Kaiser
618121a654 release: Bump VERSIONS file to 3.6.0
Let's bump the VERSIONS file and start preparing for a new release of
the project.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-17 12:06:46 +00:00
stevenhorsman
53659f1ede libs: Update tokio dependencies
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
35f6be97df runtime-rs: Update tokio dependency
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

If possible it would be good to add the many runtime-rs creates into the
runtime-rs workspace and provide a centralised version to avoid the updates
in many places.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
3bb1a67d80 agent-ctl: Update rustjail dependencies
- Run `cargo update -p rustjail` to pick up rustjail's bump of
tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
d2d35d2dcc runk: Update tokio dependencies
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
adda401a8c genpolicy: Update tokio dependencies
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
b7928f465e agent: Update tokio dependencies
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:02:47 +01:00
Zvonko Kaiser
5c2f3f34a8 CI: remove sudo from GHA
Now that all artifacts are owned by $USER we can start
to remove sudo from our GHA

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-17 11:06:56 +00:00
Steve Horsman
cce735a09e Merge pull request #9840 from stevenhorsman/bump-agent-rust-1.75.0
versions: Bump rust toolchain
2024-06-17 11:28:07 +01:00
Fupan Li
b218c4bc10 Merge pull request #9836 from lifupan/main_fix
sandbox: fix the issue of double initial_size_manager config
2024-06-17 09:15:51 +08:00
Fabiano Fidêncio
9b5dd854db Merge pull request #9726 from GabyCT/topic/unodeport
tests: kbs: Use nodeport deployment from upstream trustee
2024-06-16 22:31:27 +02:00
Wainer dos Santos Moschetta
d4f664b73b CI: disable run-kata-monitor-tests / run-monitor (containerd, lts) job
The job has failed more than 50% on nightly CI. Remove it from the list of
execution until we don't have a fix.

Issue: #9853
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-14 16:27:04 -03:00
Wainer dos Santos Moschetta
cbf0b7ca7b CI: disable run-basic-amd64-tests / run-nerdctl-tests (clh) job
The job has failed more than 50% on nightly CI. Remove it from the list of
execution until we don't have a fix.

Issue: #9852
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-14 16:17:26 -03:00
Wainer dos Santos Moschetta
562820449e CI: disable run-basic-amd64-tests / run-vfio (qemu) job
The job has failed more than 50% on nightly CI. Remove it from the list of
execution until we don't have a fix.

The clh variation was disabled on commit 5f5274e699 so this change will
actually result on all the VFIO jobs disabled. Instead of delete the entire
entry from this workflow yaml (or comment the entry), I preferred to use
`if: false` which will make the jobs appear on the UI as skipped.

Issue: 9851
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-14 16:09:59 -03:00
GabyCT
4800e242a4 Merge pull request #9832 from GabyCT/topic/fixsets
tests: setup: Improve setup script for kubernetes tests
2024-06-14 11:14:05 -06:00
Bo Chen
a68aeca356 Merge pull request #9575 from likebreath/0430/clh_v39.0
versions: Upgrade to Cloud Hypervisor v39.0
2024-06-14 09:10:19 -07:00
stevenhorsman
e23b929ba0 versions: Bump rust toolchain
- Bump the rust version used to build the agent to 1.75.0 as
agreed on in the AC meeting

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-14 16:45:16 +01:00
stevenhorsman
3fb176970f dragonball: Fix device manager warning
- Fix the lint error:
```
error: you seem to use `.enumerate()` and immediately discard the index
   --> src/device_manager/mod.rs:427:33
    |
427 |         for (_index, device) in self.virtio_devices.iter().enumerate() {
    |                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
 by removing the unnecessary enumerate

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-14 16:45:16 +01:00
stevenhorsman
1ea2671f2f dragonball: Fix lint with rust 1.75.0
The ci failed with:
```
error: use of `or_insert_with` to construct default value
   --> src/address_space_manager.rs:650:14
    |
650 |             .or_insert_with(NumaNode::new);
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `or_default()`
    |
```

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-14 16:45:16 +01:00
Steve Horsman
ab8a9882c1 Merge pull request #9818 from EmmEff/fix-spelling
runtime: fix minor spelling issues
2024-06-14 13:12:56 +01:00
Steve Horsman
99bf95f773 Merge pull request #9827 from littlejawa/fix_panic_on_metrics_gathering
runtime: avoid panic on metrics gathering
2024-06-14 11:12:43 +01:00
Steve Horsman
3eba4211f3 Merge pull request #9843 from microsoft/danmihai1/install_yq
ci: fix the expected yq version string
2024-06-14 10:26:21 +01:00
Pavel Mores
380f8ad03f runtime-rs: add base vCPU hotplugging support
We take advantage of the Inner pattern to enable QemuInner::resize_vcpu()
take `&mut self` which we need to call non-const functions on Qmp.

This runs on Intel architecture but will need to be verified and ported
(if necessary) to other architectures in the future.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-14 10:13:32 +02:00
Pavel Mores
8231c6c4a3 runtime-rs: instantiate Qmp as (optional) member of QemuInner
The QMP_SOCKET_FILE constant in cmdline_generator.rs is made public to make
it accessible from QemuInner.  This is fine for now however if the constant
needs to be accessed from additional places in the future we could consider
moving it to somewhere more visible.

The Debug impl for Qmp is empty since first, we don't actually want it,
it's only forced by Hypervisor trait bounds, and second, it doesn't have
anything to display anyway.  If Qmp gets any members in the future that
can be meaningfully displayed they should be handled by Qmp's Debug::fmt().

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-14 10:13:32 +02:00
Pavel Mores
6fdb262dca runtime-rs: add Qmp object to encapsulate QMP functionality
The constructor handles QMP connection initialisation, too, so there can
be non-functional Qmp instance.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-14 10:13:32 +02:00
Manuel Huber
62fd84dfd8 build: allow rootfs builds w/o git or VERSION file deps
We set the VERSION variable consistently across Makefiles to
'unknown'  if the file is empty or not present.
We also use git commands consistently for calculating the COMMIT,
COMMIT_NO variables, not erroring out when building outside of
a git repository.
In create_summary_file we also account for a missing/empty VERSION
file.
This makes e.g. the UVM build process in an environment where we
build outside of git with a minimal/reduced set of files smoother.

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
2024-06-13 22:46:52 +00:00
Dan Mihai
824287d64a Merge pull request #9844 from microsoft/danmihai1/k8s-policy-pvc
tests: fix yq command line in k8s-policy-pvc
2024-06-13 15:07:15 -07:00
Wainer dos Santos Moschetta
73ab5942fb tests/k8s: run for qemu-runtime-rs on AKS
The following tests are disabled because they fail (alike with dragonball):

- k8s-cpu-ns.bats
- k8s-number-cpus.bats
- k8s-sandbox-vcpus-allocation.bats

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-13 16:20:59 -03:00
Mike Frisch
c2f61b0fe3 runtime: spelling fixes
Minor spelling fixes in runtime log messages.

Signed-off-by: Mike Frisch <mikef17@gmail.com>
2024-06-13 12:11:34 -04:00
Dan Mihai
56f9e23710 tests: fix yq command line in k8s-policy-pvc
Fix the collision between:
- https://github.com/kata-containers/kata-containers/pull/9377
- https://github.com/kata-containers/kata-containers/pull/9706

One enabled a newer yq command line format and the other used the
older format. Both passed CI because they were not tested together.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-06-13 16:06:15 +00:00
Dan Mihai
23e99e264c ci: fix the expected yq version string
I get:

~/gopath/bin/yq --version
yq (https://github.com/mikefarah/yq/) version v4.40.7

Also add support for set -o xtrace to install_yq.sh.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-06-13 15:52:26 +00:00
Ryan Savino
0430794952 qemu: upgrade to 8.2.4
There is a known issue in qemu 7.2.0 that causes kernel-hashes to fail the verification of the launch binaries for the SEV legacy use case.

Upgraded to qemu 8.2.4.
new available features disabled.

Fixes: #9148

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-13 10:19:42 -05:00
Greg Kurz
b85b1c1058 Merge pull request #9790 from gkurz/kill-some-dead-runtime-code
Kill some dead runtime code
2024-06-13 15:45:51 +02:00
gaohuatao
4cb4e44234 runtime-rs: fix the bug of func count_files
When the total number of files observed is greater than limit, return -1 directly.
runtime has fixed this bug, it should b ported to runtime-rs.

Fixes:#9829

Signed-off-by: gaohuatao <gaohuatao@bytedance.com>
2024-06-13 16:02:33 +08:00
Fupan Li
cd68ef372f sandbox: fix the issue of double initial_size_manager config
It shouldn't call the initial_size_manager's setup_config
in the load_config since it had been called in the sandbox's
try_init function.

Fixes: #9778

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-06-13 15:44:51 +08:00
Fupan Li
61687992f4 sandbox: fix the issue of failed to get the vmm master tid
For kata container, the container's pid is meaning less to
containerd/crio since the container's pid is belonged to VM,
and containerd/crio couldn't use it. Thus we just return any
tid of kata shim or hypervisor. But since the hypervisor had
been stopped before deleting the container, and it wouldn't
get the hypervisor's tid for some supported hypervisor, thus
we'd better to return the kata shim's pid instead of hypervisor's
tid.

Fixes: #9777

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-06-13 10:27:04 +08:00
Fabiano Fidêncio
56423cbbfe Merge pull request #9706 from burgerdev/burgerdev/genpolicy-devices
genpolicy: add support for devices
2024-06-12 23:03:41 +02:00
Wainer Moschetta
d971e5ae68 Merge pull request #9537 from wainersm/kata-deploy-crio
kata-deploy: configuring CRI-O for guest-pull image pulling
2024-06-12 17:27:00 -03:00
Gabriela Cervantes
c36c300fd6 tests: kbs: Use nodeport deployment from upstream trustee
This PR uses the nodeport deployment from upstream trustee.
To ensure our deployment is as close to upstream trustee replace
the custom nodeport handling and replace it with nodeport
kustomized flavour from the trustee project.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-12 20:01:59 +00:00
Gabriela Cervantes
0066aebd84 tests: setup: Improve setup script for kubernetes tests
This PR makes general improvements like definition of variables and
the use of them to improve the general setup script for kubernetes
tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-12 19:39:54 +00:00
GabyCT
461b6e7c93 Merge pull request #9821 from GabyCT/topic/fixts
metrics: Use function definition to have uniformity
2024-06-12 10:04:28 -06:00
Fabiano Fidêncio
3a0247ed43 Merge pull request #9819 from stevenhorsman/config-envvar-precedence
agent: config: Ensure envs take precedence
2024-06-12 11:26:02 +02:00
Julien Ropé
9c86eb1d35 runtime: avoid panic on metrics gathering
While running with a remote hypervisor, whenever kata-monitor tries to access
metrics from the shim, the shim does a "panic" and no metric can be gathered.

The function GetVirtioFsPid() is called on metrics gathering, and had a call
to "panic()". Since there is no virtiofs process for remote hypervisor, the
right implementation is to return nil. The caller expects that, and will skip
metrics gathering for virtiofs.

Fixes: #9826

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-06-12 10:02:44 +02:00
Xuewei Niu
92cc5e0adb Merge pull request #9781 from gaohuatao-1/ght/shm 2024-06-12 12:39:28 +08:00
Moritz Sanft
84903c898c genpolicy: fix settings path flag name
This corrects the warning to point to the \`-j\` flag,
which is the correct flag for the JSON settings file.
Previously, the warning was confusing, as it pointed to
the \`-p\` flag, which specifies to the path for the Rego ruleset.

Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
2024-06-11 21:17:18 +02:00
Greg Kurz
1acf8d0c35 govmm: Drop QEMU's NoShutdown knob
Code is not used.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-06-11 19:55:54 +02:00
Greg Kurz
cb5b548ad7 govmm: Drop QEMU's Daemonize knob
Code isn't used anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-06-11 19:55:54 +02:00
Greg Kurz
33eaf69d5f virtcontainers: Drop QEMU's Daemonize knob
QEMU isn't started as daemon anymore and this won't change (see #5736
for details). Drop the related code.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-06-11 19:55:54 +02:00
Wainer Moschetta
f66a5b6287 Merge pull request #9807 from wainersm/qemu-rs_kata-deploy
kata-deploy: add qemu-runtime-rs runtimeClass
2024-06-11 14:50:01 -03:00
Dan Mihai
d47f40210a Merge pull request #9808 from microsoft/saulparedes/oci_from_settings
genpolicy: load OCI version from settings
2024-06-11 10:42:04 -07:00
Gabriela Cervantes
a96ff49060 metrics: Use function definition to have uniformity
This PR uses the function definition to have uniformity across
all the launch times script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-11 17:36:08 +00:00
Saul Paredes
3e9d6c11a1 genpolicy: add back support for insecure
registries

Adding back changes from
77540503f9.

Fixes: #9008

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-06-11 09:42:23 -07:00
Bo Chen
2398442c58 runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v39.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #8694, #9574

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-06-11 09:42:17 -07:00
Bo Chen
7a82894502 versions: Upgrade to Cloud Hypervisor v39.0
This patch upgrades Cloud Hypervisor to v39.0 from v36.0, which contains
fixes of several security advisories from dependencies. Details can be
found from #9574.

Fixes: #8694, #9574

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-06-11 09:42:16 -07:00
Wainer dos Santos Moschetta
be9990144a workflow: run kata-deploy tests to qemu-runtime-rs on AKS
Start testing the ability of kata-deploy to install and configure
the qemu-runtime-rs runtimeClass.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-11 12:58:47 -03:00
Wainer dos Santos Moschetta
4f398cc969 kata-deploy: add qemu-runtime-rs runtimeClass
Allow kata-deploy to install and configure the qemu-runtime-rs runtimeClass
which ties to qemu hypervisor implementation in rust for the runtime-rs.

Fixes: #9804
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-11 12:58:47 -03:00
stevenhorsman
40e02b34cb agent: config: Ensure envs take precedence
- Update the config parsing logic so that when reading from the
agent-config.toml file any envs are still processed
- Add units tests to formalise that the envs take precedence over values
from the command line and the config file

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-11 16:31:10 +01:00
Steve Horsman
59ff40f054 Merge pull request #9811 from mkulke/mkulke/use-kebabcase-for-enum-values-in-config-file-parsing
agent: convert enum vals to kebab-case in cfg file
2024-06-11 14:49:30 +01:00
gaohuatao
638e9acf89 runtime: fix the bug of func countFiles
When the total number of files observed is greater than limit, return (-1, err).
When the returned err is not nil, the func countFiles should return -1.

Fixes:#9780

Signed-off-by: gaohuatao <gaohuatao@bytedance.com>
2024-06-11 18:17:18 +08:00
Alex Lyn
1c8db85d54 Merge pull request #9784 from Apokleos/bufix-testcases
kata-types: fix bug in kata-types several test cases
2024-06-11 10:01:45 +08:00
Saul Paredes
6a84562c16 genpolicy: load OCI version from settings
Load OCI version from genpolicy-settings.json and validate it in
rules.rego

Fixes: #9593

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-06-10 15:30:39 -07:00
GabyCT
0c5849b68b Merge pull request #9809 from microsoft/danmihai1/yq-breaking-change
tests: k8s: use newer yq command line format
2024-06-10 16:29:59 -06:00
Wainer Moschetta
ade69e44f9 Merge pull request #9785 from BbolroC/kubectl-retry
CI: Introduce retry mechanism for kubectl in gha-run.sh
2024-06-10 18:33:34 -03:00
Magnus Kulke
abc704a720 agent: convert enum vals to kebab-case in cfg file
fixes #9810

Add an annotation to the enum values in the agent config that will
deserialize them using a kebab-case conversion, aligning the behaviour
to parsing of params specified via kernel cmdline.

drive-by fix: add config override for guest_component_procs variable

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
2024-06-10 21:55:05 +02:00
Dan Mihai
32198620a9 tests: k8s: use newer yq command line format
Fix the recent collision between:
- https://github.com/kata-containers/kata-containers/pull/9377
- https://github.com/kata-containers/kata-containers/pull/9725

One enabled a newer yq command line format and the other used the older
format. Both passed CI because they were not tested together.

Fixes: #9789

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-06-10 18:48:25 +00:00
Dan Mihai
079a0a017c Merge pull request #9557 from portersrc/ci-debug-output-nydus-pod
CI: describe pod on k8s-create-pod wait failure
2024-06-10 08:17:54 -07:00
Ryan Savino
84280115f6 Merge pull request #9151 from niteeshkd/nd_snp_kernel_hashes
runtime: enable kernel-hashes for SNP confidential container
2024-06-07 18:19:51 -05:00
GabyCT
03bcc167a4 Merge pull request #9779 from GabyCT/topic/fixcoscript
tests: Fix indentation in common script
2024-06-07 15:37:10 -06:00
Wainer Moschetta
7a28535277 Merge pull request #9800 from fidencio/topic/ci-tdx-re-enable-some-of-the-tests
ci: tdx: Re-enable a bunch of volume related tests
2024-06-07 16:17:19 -03:00
Hyounggyu Choi
8ff128dda8 CI: Introduce retry mechanism for kubectl in gha-run.sh
Frequent errors have been observed during k8s e2e tests:

- The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
- Error from server (ServiceUnavailable): the server is currently unable to handle the request
- Error from server (NotFound): the server could not find the requested resource

These errors can be resolved by retrying the kubectl command.

This commit introduces a wrapper function in common.sh that runs kubectl up to 3 times
with a 5-second interval. Initially, this change only covers gha-run.sh for Kubernetes.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-07 18:24:19 +02:00
Fabiano Fidêncio
81c221c1b4 ci: k8s: tdx: Re-enable volume tests
It seems I was very lose on disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.

Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-07 18:13:36 +02:00
Fabiano Fidêncio
9db9d35198 ci: k8s: tdx: Re-enable projected-volume tests
It seems I was very lose on disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.

Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-07 18:12:36 +02:00
Fabiano Fidêncio
f6a6cba8ca ci: k8s: tdx: Re-enable nested-configmap-secret tests
It seems I was very lose on disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.

Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-07 18:12:06 +02:00
Fabiano Fidêncio
957d0cccf6 ci: k8s: tdx: Re-enable inotify tests
It seems I was very lose on disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.

Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-07 18:10:39 +02:00
Fabiano Fidêncio
fc6f662ae0 ci: k8s: tdx: Re-enable credentials-secrets tests
It seems I was very lose on disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.

Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-07 18:08:29 +02:00
Fabiano Fidêncio
5741c6d3e6 Merge pull request #9768 from fidencio/topic/ci-tdx-enable-cdh-test
ci: kbs: Enable CDH tests for TDX
2024-06-07 17:59:12 +02:00
Greg Kurz
afeb98d73f Merge pull request #9782 from ldoktor/ci-centos-9
ci.ocp: Switch base to centos-9
2024-06-07 13:15:02 +02:00
Fabiano Fidêncio
fde457589e ci: kbs: tdx: Enable basic attestation tests
Let's stop skipping the CDH tests for TDX, as know we should have an
environmemnt where it can run and should pass. :-)

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-07 12:18:50 +02:00
Fabiano Fidêncio
cac525059e ci: kbs: tdx: Use the hostname ip instead of localhost for the PCCS
We must ensure we use the host ip to connect to the PCCS running on the
host side, instead of using localhost (which has a different meaning
from inside the KBS pod).

The reason we're using `hostname -i` isntead of the helper functions, is
because the helper functions need the coco-kbs deployed for them to
work, and what we do is before the deployment.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-07 12:18:07 +02:00
Alex Lyn
27685c91e5 kata-types: fix bug in kata-types several test cases
(1) As mis-use of cap.set causing previous Caps lost which
causing assert! failed, just replacing cap.set with cap.add.

(2) It will return error if there's no such name setting when
do update_config_by_annotation {
    ...
if config.runtime.name.is_empty() {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "Runtime name is missing in the configuration",
            ));
        }
...
}

Fixes #9783

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-06-07 09:16:23 +08:00
David Esparza
822c641b58 Merge pull request #9760 from amshinde/kata-manager-link-runc
kata-manager: Add symlinks for runc and slirp4netns
2024-06-06 12:55:57 -06:00
Lukáš Doktor
699376c535 ci.ocp: Switch base to centos-9
Centos8 is EOL and repos are not available anymore. Centos9 contains the
same packages and should do well as a base for testing.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-06-06 09:03:17 +02:00
Chris Porter
4172ccb3a0 CI: describe pod on k8s-create-pod wait failure
This is generally useful debug output on test failures,
and specifically this has been useful for nydus-related
issues recently.

Signed-off-by: Chris Porter <porter@ibm.com>
2024-06-05 12:37:53 -04:00
Gabriela Cervantes
264c7e9473 tests: Fix indentation in common script
This PR fixes the indentation in common script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-05 15:52:40 +00:00
Niteesh Dubey
1dbf5208ac versions: Upgrade ovmf
This is required to support SEV-SNP confidential container with kernel-hashes.
Since this ovmf is latest stable version, it is good to upgrade for tdx
and Vanilaa builds too.

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-06-05 15:02:02 +00:00
Niteesh Dubey
62d3d7c58f runtime: enable kernel-hashes for SNP confidential container
This is required to provide the hashes of kernel, initrd and cmdline
needed during the attestation of the coco.

Fixes: #9150

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-06-05 15:02:02 +00:00
Steve Horsman
b30d085271 Merge pull request #9702 from ildikov/blog-submission-guide
docs: Adding blog submission guidelines
2024-06-05 09:03:19 +01:00
Amulya Meka
b323afeda9 Merge pull request #9214 from Amulyam24/oras
kata-deploy: install oras using release artefacts on ppc64le
2024-06-05 11:40:55 +05:30
Fabiano Fidêncio
138ef2c55f Merge pull request #9678 from AdithyaKrishnan/main
TEEs: Skip a few CI tests for SEV/SNP
2024-06-04 23:42:51 +02:00
GabyCT
ba30f0804a Merge pull request #9770 from GabyCT/topic/fixvad
tests: Use variable definition for better uniformity
2024-06-04 15:23:34 -06:00
Wainer dos Santos Moschetta
af4f9afb71 kata-deploy: add PULL_TYPE handler for CRI-O
A new PULL_TYPE environment variable is recognized by the kata-deploy's
install script to allow it to configure CRIO-O for guest-pull image pulling
type.

The tests/integration/kubernetes/gha-run.sh change allows for testing it:
```
export PULL_TYPE=guest-pull
cd tests/integration/kubernetes
./gha-run.sh deploy-k8s
```

Fixes #9474
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-04 14:02:01 -03:00
GabyCT
6c2e8bed77 Merge pull request #9725 from 3u13r/feat/genpolicy/filter-by-runtime
genpolicy: add ability to filter for runtimeClassName
2024-06-04 10:06:14 -06:00
Hyounggyu Choi
869f89c338 Merge pull request #9773 from BbolroC/use-qemu-coco-dev-s390x
GHA: Use qemu-coco-dev for k8s nydus test on s390x
2024-06-04 17:49:38 +02:00
Gabriela Cervantes
cafba23f3e tests: Use variable definition for better uniformity
This PR replaces the name to use a variable that is already defined
to have a better uniformity across the general script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-04 15:49:27 +00:00
Wainer Moschetta
2b8cdd9ff2 Merge pull request #9765 from wainersm/disable_failing_jobs
CI: disable jobs that failed > 50% on nightly CI recently - part 1
2024-06-04 12:05:36 -03:00
Hyounggyu Choi
246ee83768 GHA: Use qemu-coco-dev for k8s nydus test on s390x
In line with the changes for x86_64, the k8s nydus test for s390x should
also use `qemu-coco-dev` for `KATA_HYPERVISOR`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-04 15:49:23 +02:00
Hyounggyu Choi
3aff6c5bd8 CI: Retry fetching node_start_time when it is empty
It was observed that the `node_start_time` value is sometimes empty,
leading to a test failure.

This commit retries fetching the value up to 3 times.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-04 15:41:15 +02:00
Zvonko Kaiser
647560539f Merge pull request #9769 from zvonkok/initrd-image-no-sudo
ci: remove sudo and make sure artifacts is owned by user
2024-06-04 07:16:51 +02:00
Wainer Moschetta
b5561074c3 Merge pull request #9377 from beraldoleal/yqbump
deps: bumping yq to v4.40.7
2024-06-03 14:34:58 -03:00
Ildiko Vancsa
5e03bec26b docs: Adding blog submission guidelines
The Kata blog was recently moved to the project's website. The content
of the blog is stored together with the rest of the website source on
GitHub.

This patch adds a short guide that describes how to submit a new
blog post as a PR, to appear on the project's website.

Signed-off-by: Ildiko Vancsa <ildiko.vancsa@gmail.com>
2024-06-03 08:58:05 -07:00
GabyCT
6c7affbd85 Merge pull request #9741 from GabyCT/topic/staticcheck
tests: Fix indentation in static checks script
2024-06-03 09:43:23 -06:00
Zvonko Kaiser
a48c084e13 ci: remove sudo and make sure image is owed by user
The image build needs special handling since we're doing a lot of
privileged operations.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-03 15:29:06 +00:00
Fabiano Fidêncio
34d45f0868 Merge pull request #9749 from mkulke/mkulke/configure-guest-components-spawning
CoCo: introduce config for guest-components procs
2024-06-03 15:50:36 +02:00
Ryan Savino
72dc823059 tests: k8s: sev: snp: skip "setting sysctl" test
This test fails when using `shared_fs=none` with the nydus snapshotter.
Issue tracked here: #9666
Skipping for now.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:17 -05:00
Ryan Savino
3f3be54893 tests: k8s: sev: snp: skip initContainers shared vol test
This test is failing due to the initContainers not being properly
handled with the guest image pulling.
Issue tracked here: #9668
Skipping for now.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:17 -05:00
Ryan Savino
35dfb730ce tests: k8s: sev: snp: skip "kill all processes in container" test
This test fails when using `shared_fs=none` with the nydus napshotter,
Issue tracked here: #9664
Skipping for now.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:16 -05:00
Ryan Savino
62cc1dec4c tests: replace docker debug alpine image with ghcr
docker alpine latest image is rate limited.
Need to use ghcr.io image.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:16 -05:00
ChengyuZhu6
1820b02993 tests: replace busybox from docker with quay in guest pull
To prevent download failures caused by high traffic to the Docker image,
opt for quay.io/prometheus/busybox:latest over docker.io/library/busybox:latest .

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-06-03 01:14:16 -05:00
Ryan Savino
6c646dc96d tests: k8s: sev: snp: add runtime annotation for sev and snp
sev and snp cases added to the KATA_HYPERVISOR switch.

Signed-off-by: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:16 -05:00
Ryan Savino
6db08ed620 runtime: sev: snp: Use shared_fs=none
Disabling 9p for SEV and SNP TEEs.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:16 -05:00
Ryan Savino
668959408d tests: ensure kata_deploy cleanup even if namespace deletion fails
the test cluster namespace deletion failing causes kata_deploy to not get cleaned up.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:15 -05:00
Wainer dos Santos Moschetta
c9f93fc507 github: add actionlint configuration file
Added configuration file with rules to exclude some self-hosted
runners from the linter warnings.

Related-with: #9646
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-31 19:46:09 -03:00
Wainer dos Santos Moschetta
5f5274e699 CI: disable run-basic-amd64-tests / run-vfio (clh) job
The job has failed more than 50% on nightly CI. Remove it from the list of
execution until we don't have a fix.

Issue: 9764
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-31 19:34:45 -03:00
Wainer dos Santos Moschetta
9154ce9051 CI: disable run-basic-amd64-tests / run-tracing jobs
These jobs have failed more than 50% on nightly CI. Remove them from the list of
execution until we don't have a fix.

Issue: 9763
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-31 19:26:58 -03:00
Wainer dos Santos Moschetta
ac4d48ad17 CI: disable run-kata-monitor-tests / run-monitor (qemu, containerd) job
This job has failed more than 50% on nightly CI. Remove it from the list of
execution until we don't have a fix.

Issue: 9761
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-31 19:21:21 -03:00
Archana Shinde
7a3e13fae8 kata-manager: Add symlinks for runc and slirp4netns
For nerdctl install, add symlinks for runc and slirp4netns in the
binary install path.
runc link comes in handy for running runc containers with nerdctl fir
quick tests.
slirp4netns allows for running containers with user mode networking
useful in case of rootless containers.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-05-31 13:53:42 -07:00
Markus Rudy
13310587ed genpolicy: check requested devices
CreateContainerRequest objects can specify devices to be created inside
the guest VM. This change ensures that requested devices have a
corresponding entry in the PodSpec.

Devices that are added to the pod dynamically, for example via the
Device Plugin architecture, can be allowlisted globally by adding their
definition to the settings file.

Fixes: #9651
Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-05-31 22:05:49 +02:00
Wainer Moschetta
f093c4c190 Merge pull request #9754 from wainersm/qemu_coco_dev-enable_policy_tests
tests/k8s: enable policy tests for qemu-coco-dev
2024-05-31 15:09:25 -03:00
Markus Rudy
ea578f0a80 genpolicy: add support for VolumeDevices
This adds structs and fields required to parse PodSpecs with
VolumeDevices and PVCs with non-default VolumeModes.

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-05-31 19:34:14 +02:00
Beraldo Leal
d3a5eb299a tools: bumping kernel config version
Lets make ci happy.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
53b8158a81 tests: adding debug and skip to kata-deploy
If a test is failing during setup, makes no much sense to run the suite.
Let's skip and add some debug messages.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
9171821d57 tests: add debug message to check return code
Lets add this message to make sure sh is starting properly.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
f91fbef184 tests: increase time after sh execution
Increased sleep duration to ensure the shell process starts.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
ba5d2e54c2 tests: remove object separation mark from eof
End of file should not end with --- mark. This will confuse tools like
yq and kubectl that might think this is another object.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
3e8b4806b8 tests: increase debug messages for kata-deploy
When the timeout happens we can't tell much information about the nodes.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
c99ba42d62 deps: bumping yq to v4.40.7
Since yq frequently updates, let's upgrade to a version from February to
bypass potential issues with versions 4.41-4.43 for now. We can always
upgrade to the newest version if necessary.

Fixes #9354
Depends-on:github.com/kata-containers/tests#5818

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
4f6732595d ci: skip go version check
golang.mk is not ready to deal with non GOPATH installs. This is
breaking test on s390x.

Since previous steps here are installing go and yq our way, we could
skip this aditional check. A full refactor to golang.mk would be needed
to work with different paths.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Greg Kurz
7886ed6670 Merge pull request #9751 from wainersm/k8s_print_logs_on_fail
tests/k8s: print logs on fail only (k8s-confidential-attestation.bats)
2024-05-31 14:47:27 +02:00
Fabiano Fidêncio
44df674232 Merge pull request #9757 from fidencio/topic/ci-tdx-skip-empty-dir-tests
ci: k8s: Skip empty dir tests also for TDX
2024-05-31 13:18:35 +02:00
Magnus Kulke
9f04dc4c8b agent: introduce config for coco attestion procs
fixes #9748

A configuration option `guest_component_procs` has been introduced that
indicates which guest component processes are supposed to be spawned by
the agent. The default behaviour remains that all of those processes are
actively spawned by the agent. At the moment this is based on presence
of binaries in the rootfs and the guest_component_api_rest option.

The new option is incremental:

none -> attestation-agent -> confidential-data-hub -> api-server-rest

e.g. api-server-rest implies attestation-agent and confidential-data-hub

the `none` option has been removed from guest_component_api_rest, since
this is addresses by the introduced option.

To not change expected behaviour for  non-coco guests we still will still
only attempt to spawn the processes if the requested attestation binaries
are present on the rootfs, and issue in warning in those cases.

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
2024-05-31 12:15:41 +02:00
Amulyam24
eadcb868f4 kata-deploy: install oras using release artefacts on ppc64le
We are currently building Oras from source on ppc64le. Now that they offically release the artefacts
for power, consume them to install Oras.

Fixes: #9213

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-05-31 14:16:14 +05:30
Zvonko Kaiser
0321a3adcc Merge pull request #8944 from zvonkok/update-threat-model
threat-model: Add VFIO, ACPI and KVM/VMM threat-model descriptions
2024-05-31 10:38:27 +02:00
Fabiano Fidêncio
03a7cf4b02 ci: k8s: Skip empty dir tests also for TDX
Wainer noticed this is failing for the coco-qemu-dev case, and decided
to skip it, notifying me that he didn't fully understand why it was not
failing on TDX.

Turns out, though, this is also failing on TDX, and we need to skip it
there as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-31 09:59:46 +02:00
Fabiano Fidêncio
72a71ff2bf Merge pull request #9737 from zvonkok/kata-deploy-no-sudo
ci: kata-deploy no sudo
2024-05-31 09:55:24 +02:00
Zvonko Kaiser
dd89d35b75 Merge pull request #9747 from zvonkok/remove-git-config
ci: Remove all git config safe.directory
2024-05-31 07:25:28 +02:00
Leonard Cohnen
1d1690e2a4 genpolicy: add ability to filter for runtimeClassName
Add the CLI flag --runtime-class-names, which is used during
policy generation. For resources that can define a
runtimeClassName (e.g., Pods, Deployments, ReplicaSets,...)
the value must have any of the --runtime-class-names as
prefix, otherwise the resource is ignored.

This allows to run genpolicy on larger yaml
files defining many different resources and only generating
a policy for resources which will be deployed in a
confidential context.

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2024-05-31 03:17:02 +02:00
Wainer dos Santos Moschetta
3333f8ddfd tests/k8s: enable policy tests for qemu-coco-dev
So qemu-coco-dev is on pair with the TEE configurations.

Fixes: #9753
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-30 21:51:15 -03:00
Wainer Moschetta
83fa813700 Merge pull request #9694 from wainersm/qemu_coco_dev-k8s-guest-pull
tests: enable guest-pull on all k8s tests for the qemu-coco-dev configuration
2024-05-30 21:48:11 -03:00
Wainer dos Santos Moschetta
55ae98eb28 tests/k8s: print logs on fail only (k8s-confidential-attestation.bats)
Use the variable BATS_TEST_COMPLETED which is defined by the bats framework
when the test finishes. `BATS_TEST_COMPLETED=` (empty) means the test failed,
so the node syslogs will be printed only at that condition.

Fixes: #9750
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-30 17:19:33 -03:00
Wainer Moschetta
66e3b88694 Merge pull request #9746 from wainersm/nydus_snapshotter_pin
ci: pin the nydus-snapshotter image version
2024-05-30 16:49:10 -03:00
Wainer dos Santos Moschetta
3e18fe7805 tests/k8s: skip file volume tests for qemu-coco-dev
This test fails with qemu-coco-dev configuration and guest-pull image pull.

Issue: #9667
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-30 14:50:59 -03:00
Zvonko Kaiser
063db516f2 ci: Remove all git config safe.directory
Now with the sudo less build we should be good
to remove those hacks.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-30 15:12:28 +00:00
Zvonko Kaiser
d8889684f0 ci: kata-deploy no sudo
Build/push/manage aritfacts without sudo

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-30 15:07:27 +00:00
Wainer dos Santos Moschetta
5faf9ca344 ci: pin the nydus-snapshotter image version
It's cloning the nydus-snapshotter repo from the version specified in
versions.yaml, however, the deployment files are set to pull in the
latest version of the snapshotter image. With this version we are
pinning the image version too.

This is a temporary fix as it should be better worked out at nydus-snapshotter
project side.

Fixes: #9742
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-30 11:21:16 -03:00
Greg Kurz
b3cb19b6a7 Merge pull request #9639 from emanuellima1/rng-impl
runtime-rs: Add RNG to QEMU cmdline
2024-05-30 12:00:11 +02:00
Zvonko Kaiser
7cc0ebe75e Merge pull request #9743 from zvonkok/tools-fix
ci: Fix tools builder images
2024-05-30 11:53:34 +02:00
Zvonko Kaiser
02a7f8c852 ci: Fix tools builder images
We weren't considering changes of the tools script dir
adding a fourth hash to accomodate this

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-30 08:10:42 +00:00
Fabiano Fidêncio
97806dbdaa Merge pull request #9732 from zvonkok/shim-v2-no-sudo
ci: shim-v2 no sudo
2024-05-30 07:01:04 +02:00
Wainer dos Santos Moschetta
37894923c1 tests/k8s: skip empty dir volumes tests for qemu-coco-dev
This test fails with qemu-coco-dev configuration and guest-pull image pull.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta
79a8b31ec5 tests/k8s: skip shared volume tests for qemu-coco-dev
This test fails with qemu-coco-dev configuration and guest-pull image pull.

Issue: #9668
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta
aa1a37081e tests/k8s: skip sysctls tests for qemu-coco-dev
This test fails with qemu-coco-dev configuration and guest-pull image pull.

Issue: #9666
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta
0e81ced9f1 tests/k8s: skip kill-all-process tests for qemu-coco-dev
This test fails with qemu-coco-dev configuration and guest-pull image pull.

Issue: #9664
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta
18896efa3c tests/k8s: skip seccomp tests for qemu-coco-dev
This test fails with qemu-coco-dev configuration and guest-pull image pull.
Unlike other tests that I've seen failing on this scenario, k8s-seccomp.bats
fails after a couple of consecutive executions, so it's that kind of failure
that happens once in a while.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta
b62ad71c43 tests/k8s: add runtime handler annotation for qemu-coco-dev
This will enable the k8s tests to leverage guest pulling when
PULL_TYPE=guest-pull for qemu-coco-dev runtimeclass.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta
089c7ad84a tests/k8s: add runtime handler annotation only for guest-pull
The runtime handler annotation is required for Kubernetes <= 1.28 and
guest-pull pull type. So leverage $PULL_TYPE (which is exported by CI jobs)
to conditionally apply the annotation.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-29 18:37:24 -03:00
GabyCT
0eddfdc74f Merge pull request #9731 from zvonkok/pause-no-sudo
ci: pause-image no sudo
2024-05-29 11:48:41 -06:00
Zvonko Kaiser
7354c427f9 Merge pull request #9734 from zvonkok/virtiofsd-no-sudo
ci: virtiofsd no sudo
2024-05-29 19:31:25 +02:00
GabyCT
3c91aa0475 Merge pull request #9739 from zvonkok/initramfs-no-sudo
ci: initramfs no sudo
2024-05-29 11:28:59 -06:00
Hyounggyu Choi
40d2306f95 Merge pull request #9729 from zvonkok/agent-no-sudo-build
ci: build agent without sudo
2024-05-29 19:27:56 +02:00
GabyCT
03be220482 Merge pull request #9730 from zvonkok/kernel-no-sudo
ci: kernel no sudo
2024-05-29 10:23:31 -06:00
GabyCT
a32058913a Merge pull request #9679 from amshinde/kata-manager-install-cni
kata-manager: Copy cni files under /opt/cni
2024-05-29 10:20:34 -06:00
GabyCT
a5808a556d Merge pull request #9733 from zvonkok/tools-no-sudo
ci: tools no sudo
2024-05-29 10:19:17 -06:00
GabyCT
e94b09839d Merge pull request #9736 from zvonkok/qemu-no-sudo
ci: qemu no sudo
2024-05-29 10:18:34 -06:00
GabyCT
6d58fce4a9 Merge pull request #9677 from GabyCT/topic/memoryusags
metrics: Improve variable definition in memory usage script
2024-05-29 10:16:56 -06:00
Emanuel Lima
138d985c64 runtime-rs: Add RNG to QEMU cmdline
It creates this line, as the Golang runtime does:
-object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0

Signed-off-by: Emanuel Lima <emlima@redhat.com>
2024-05-29 13:11:00 -03:00
Hyounggyu Choi
6ba2461404 Merge pull request #9728 from zvonkok/coco-guest-comp-no-sudo
ci: guest-components without sudo
2024-05-29 17:55:43 +02:00
Gabriela Cervantes
09c3e08f6a tests: Fix indentation in static checks script
This PR fixes the indentation in the static checks script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-29 15:43:44 +00:00
Xuewei Niu
c297a7891c Merge pull request #9723 from zvonkok/hotunplug-fix
vfio: Fix hot-unplug
2024-05-29 22:02:05 +08:00
Zvonko Kaiser
25c784c568 ci: shim-v2 no sudo
Build shim-v2 without sudo docker this is not needed. This is part 6 of N

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-29 09:24:54 +00:00
Zvonko Kaiser
84a9773cec ci: initramfs no sudo
BUild initramfs  without sudo docker this is not needed. This is part 10 of N

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-29 09:20:39 +00:00
Zvonko Kaiser
7dc47c8150 ci: qemu no sudo
Build qemu without sudo docker this is not needed. This is part 9 of N

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 16:12:06 +00:00
Zvonko Kaiser
4a455bf24a ci: virtiofsd no sudo
build virtiofsd without sudo docker this is not needed. This is part 8 of N

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 14:19:58 +00:00
Wainer Moschetta
9896f69827 Merge pull request #9414 from ldoktor/ci-bisection
ci.ocp: Document openshift pipeline and manual bisection
2024-05-28 11:17:09 -03:00
Zvonko Kaiser
dd04d26cb0 ci: tools no sudo
Build tools without sudo docker this is not needed. This is part 7 of N

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 13:57:20 +00:00
Zvonko Kaiser
6c9c0306ac ci: pause-image no sudo
Build pause-image without sudo docker this is not needed. This is part 5 of N

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 11:31:59 +00:00
Hyounggyu Choi
e8c06301d7 Merge pull request #9727 from zvonkok/ovmf-no-sudo
ci: ovmf without sudo
2024-05-28 13:29:00 +02:00
Zvonko Kaiser
c95ae5a502 ci: kernel no sudo
Build kernel without sudo docker this is not needed. This is part 4 of N

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 11:19:08 +00:00
Zvonko Kaiser
8fab5dd584 ci: build agent without sudo
Build agent without sudo docker this is not needed. This is part 3 of N

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 09:55:32 +00:00
Zvonko Kaiser
1e4cbc4fcd ci: guest-components wihout sudo
Build guest-components without sudo docker this is not needed. This is part 2 of N

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 09:03:14 +00:00
Zvonko Kaiser
b76938b922 ci: ovmf without sudo
Build ovmf without sudo docker this is not needed. This is part 1 of N

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 08:25:27 +00:00
Zvonko Kaiser
c6c20ac253 docs: Format the threat-model to 80 chars
Truncate long lines to reasonable 80 characters

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 07:39:26 +00:00
Zvonko Kaiser
d4832b3b74 vfio: Fix hotpunplug
We need to remove the device from the tracking map, a container
restart will increment the bus index and we will get out of root-ports
and crash the machine.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 07:37:30 +00:00
Zvonko Kaiser
a7931115a0 Merge pull request #8861 from zvonkok/config-pcie-root-switch-port
gpu: reintroduce pcie_root_port and add pcie_switch_port
2024-05-27 13:17:57 +02:00
Fabiano Fidêncio
3276bb52b6 Merge pull request #9721 from fidencio/topic/ci-kata-deploy-improvements-and-fixes
kata-deploy / kata-cleanup / ci: Fixes and improvements to kata-deploy / kata-cleanup and its usage in the CI
2024-05-27 12:29:40 +02:00
Zvonko Kaiser
4c93bb2d61 qemu: Add CDI device handling for any container type
We need special handling for pod_sandbox, pod_container and
single_container how and when to inject CDI devices

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-27 10:13:01 +00:00
Zvonko Kaiser
c7b41361b2 gpu: reintroduce pcie_root_port and add pcie_switch_port
In Kubernetes we still do not have proper VM sizing
at sandbox creation level. This KEP tries to mitigates
that: kubernetes/enhancements#4113 but this can take
some time until Kube and containerd or other runtimes
have those changes rolled out.

Before we used a static config of VFIO ports, and we
introduced CDI support which needs a patched contianerd.
We want to eliminate the patched continerd in the GPU case
as well.

Fixes: #8860

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-27 10:13:01 +00:00
Fupan Li
6f6a164451 Merge pull request #9268 from zvonkok/kata-agent-createcontainer
kata-agent: CreateContainer Hook
2024-05-27 16:36:22 +08:00
Fabiano Fidêncio
e81e8a4527 tests: kata-deploy: Adjust timeout
10 minutes is waay too long.  Let's give it 4 minutes only.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 06:23:00 +02:00
Fabiano Fidêncio
fba5793c0d tests: kata-deploy: Run the tests from "${repo_root_dir}"
Let's see if it helps with issues like:
```
error: must build at directory: not a valid directory: evalsymlink
failure on
'"/home/runner/actions-runner/_work/kata-containers/kata-containers/tests/functional/kata-deploy/../../..//tools/packaging/kata-deploy/kata-cleanup/overlays/k0s"'
: lstat
/home/runner/actions-runner/_work/kata-containers/kata-containers/tests/functional/kata-deploy/":
no such file or directory
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 06:23:00 +02:00
Fabiano Fidêncio
8a8a7ea0e5 tests: kata-deploy: Show more logs in the setup()
This will also help us to better understand possible failures with the
CI.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 05:05:06 +02:00
Fabiano Fidêncio
47d9589e9b tests: kata-deploy: Show output of passing tests
This will help us to debug failures and compare passing and failures
outputs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 05:05:06 +02:00
Fabiano Fidêncio
dbd0d4a090 gha: Only do preventive cleanups for baremetal
This takes a few minutes that could be saved, so let's avoid doing this
on all the platforms, but simply do this when it's needed (the baremetal
use case).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 05:05:06 +02:00
Fabiano Fidêncio
ee2ef0641c tests: k8s: Allow passing "all" to run all the tests
Currently only "baremetal" runs all the tests, but we could easily run
"all" locally or using the github provided runners, even when not using
a "baremetal" system.

The reason I'd like to have a differentiation between "all" and
"baremetal" is because "baremetal" may require some cleanup, which "all"
can simply skip if testing against a fresh created VM.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 05:05:06 +02:00
Fabiano Fidêncio
556227cb51 tests: Add the possibility to deploy k0s / rke2
For now we've only exposed the option to deploy kata-deploy for k3s and
vanilla kubernetes when using containerd.

However, I do need to also deploy k0s and rke2 for an internal CI, and
having those exposed here do not hurt, and allow us to easily expand the
CI at any time in the future.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 05:05:06 +02:00
Fabiano Fidêncio
e3c2f0b0f1 kata-cleanup: Add k0s kustomization
k0s was added to kata-deploy, but it's kata-cleanup counterpart was
never added.  Let's fix it.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 05:05:06 +02:00
Fabiano Fidêncio
f15d40f8fb kata-deploy: Fix k0s deployment
k0s deployment has been broken since we moved to using `tomlq` in our
scripts.  The reason is that before using `tomlq` our script would,
involuntarily, end up creating the file.

Now, in order to fix the situation, we need to explicitly create the
file and let `tomlq` add the needed content.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 05:05:06 +02:00
Alex Lyn
713c929a64 Merge pull request #9656 from pmores/document-qemu-rs-conventions
runtime-rs: document architecture & implementation conventions in qem…
2024-05-27 10:38:58 +08:00
Xuewei Niu
bb7a1c56e9 Merge pull request #9693 from sidneychang/9690/Adjust-indentation 2024-05-27 00:20:34 +08:00
Alex Lyn
55dbf6121a Merge pull request #9604 from Apokleos/qmp-cmdline01
runtime-rs: add QMP support for Qemu(part I)
2024-05-26 20:22:59 +08:00
Alex Lyn
028b10ce7a Merge pull request #9687 from l8huang/vfio-pci-gk
agent: collect PCI address mapping for both vfio-pci-gk and vfio-pci device
2024-05-26 17:48:25 +08:00
Steve Horsman
b89c3e35dd Merge pull request #9583 from cncal/update_check_error_message
runtime: make kata-runtime check error more understandable when /dev/kvm doesn't exist
2024-05-24 17:49:43 +01:00
Alex Lyn
41fb7aeb89 runtime-rs: add QMP params suppport in cmdline
Fixes: #9603

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-05-24 22:16:24 +08:00
Alex Lyn
7ed6c6896b runtime-rs: add an option dbg_monitor_socket for HMP support
This option allows to add a debug monitor socket when
`enable_debug = true` to control QEMU within debugging case.

Fixes: #9603

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-05-24 22:16:17 +08:00
Lei Huang
3624573b12 agent: collect PCI address mapping for both vfio-pci-gk and vfio-pci device
The `update_env_pci()` function need the PCI address mapping to
translate the host PCI address to guest PCI address in below
environment variables:
- PCIDEVICE_<prefix>_<resource-name>_INFO
- PCIDEVICE_<prefix>_<resource-name>

So collect PCI address mapping for both vfio-pci-gk and
vfio-pci devices.

Fixes #9614

Signed-off-by: Lei Huang <leih@nvidia.com>
2024-05-23 21:20:01 -07:00
Fupan Li
d73876252e Merge pull request #9690 from justxuewei/agent-timeout
runtime-rs: Remove obsoleted dial_timeout config
2024-05-24 10:31:12 +08:00
Zvonko Kaiser
3affd83e14 Merge pull request #9605 from l8huang/skip-env
kata-agent: update env PCIDEVICE_<prefix>_<resource-name>_INFO
2024-05-23 18:45:00 +02:00
Fabiano Fidêncio
44d6cb7791 Merge pull request #9698 from wainersm/k8s_tests_disable_fail_fast
tests/k8s: disable "fail-fast" behavior by default
2024-05-23 18:28:00 +02:00
Fabiano Fidêncio
d83cf39ba1 Merge pull request #9680 from kata-containers/dependabot/go_modules/src/runtime/go_modules-5e29427af7
build(deps): bump golang.org/x/net from 0.24.0 to 0.25.0 in /src/runtime in the go_modules group across 1 directory
2024-05-23 12:55:29 +02:00
Fabiano Fidêncio
d9ee950d8f Merge pull request #9696 from wainersm/skip_custom_dns_test
tests/k8s: skip custom DNS tests on confidential jobs
2024-05-22 23:57:21 +02:00
GabyCT
e08ad8d1b7 Merge pull request #9686 from GabyCT/topic/fixbootclh
metrics: Fix minvalue for boot time
2024-05-22 15:46:50 -06:00
Wainer dos Santos Moschetta
76735df427 tests/k8s: disable "fail-fast" behavior by default
The k8s test suite halts on the first failure, i.e., failing-fast. This
isn't the behavior that we used to see when running tests on Jenkins and it
seems that running the entire test suite is still the most productive way. So
this disable fail-fast by default.

However, if you still wish to run on fail-fast mode then just export
K8S_TEST_FAIL_FAST=yes in your environment.

Fixes: #9697
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-22 18:27:44 -03:00
Fabiano Fidêncio
8eb061cd5b Merge pull request #9681 from GabyCT/topic/etdx
gha: Enable install kbs and coco components for TDX, but still skip the CDH test
2024-05-22 23:18:42 +02:00
Wainer dos Santos Moschetta
43766cdb96 tests/k8s: skip custom DNS tests on confidential jobs
This test has failed in confidential runtime jobs. Skip it
until we don't have a fix.

Fixes: #9663
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-22 17:08:22 -03:00
Fabiano Fidêncio
904370ecd6 tests: attestation: tdx: Skip test for now
Skipping the test will allow us to have the TDX CI running while we
debug the test.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-22 20:04:13 +02:00
Fabiano Fidêncio
414d716eef tests: kbs: Enable cli installation also on CentOS
One of our machines is running CentOS 9 Stream, and we could easily
verify that we can build and install the kbs client there, thus we're
expanding the installation script to also support CentOS 9 Stream.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-22 20:01:57 +02:00
Fabiano Fidêncio
27d7f4c5b8 tests: kbs: Fix rust installation
`externals.coco-kbs.toolchain` is not defined, get the rust_version from
`externals.coco-trustee.toolchain` instead.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-22 20:01:57 +02:00
Fabiano Fidêncio
fa8b5c76b8 tests: kbs: Add more info for the TDX deployment
Ditto in the commit shortlog.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-22 20:01:57 +02:00
Fabiano Fidêncio
6ffd7b8425 versions: trustee: Bump version to 6adb8383309cbb7
We're bumping the version in order to bring in the customisation needed
for setting up a custom pccs, which is needed for the KBS integration
tests with Kata Containers + TDX.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-22 20:01:57 +02:00
Fabiano Fidêncio
dbd1fa51cd tests: kbs: Don't assume /tmp/trustee exists in the machine
Instead, check if the directory exists before pushd'ing into it.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-22 20:01:57 +02:00
Gabriela Cervantes
f698caccc0 gha: Enable install kbs and coco components for TDX
This PR enables the installation and unistallation of the kbs client
as well as general coco components needed for the TDX GHA CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-22 20:01:57 +02:00
GabyCT
eaaab19763 Merge pull request #9685 from GabyCT/topic/fixic
tests: Fix indentation in confidential common script
2024-05-22 11:53:33 -06:00
Gabriela Cervantes
29a10f1373 metrics: Fix minvalue for boot time
This PR fixes the minvalue for boot time to avoid the random failures
of the GHA CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-22 17:52:51 +00:00
GabyCT
0b32360ab4 Merge pull request #9684 from stevenhorsman/add-arch-to-component-cache-tags
ci: cache: Add arch suffix to all cache tags
2024-05-22 09:24:28 -06:00
Fabiano Fidêncio
0e33ecf7fc Merge pull request #9653 from JakubLedworowski/fixes-9497-ensure-quote-generation-service-is-added-to-qemu-cmd-2
runtime: Enable connection to Quote Generation Service (QGS)
2024-05-22 15:49:23 +02:00
sidneychang
8938f35627 runtime-rs: Adjust indentation in ifneq statements within Makefile.
Replace tab indentation with spaces for the three lines within the ifneq statements, aligning them with the surrounding code.

Fixes:#9692

Signed-off-by: sidneychang <2190206983@qq.com>
2024-05-22 20:24:35 +08:00
Fabiano Fidêncio
94f7bbf253 Merge pull request #9682 from fidencio/topic/allow-increasing-cpus-and-memory-via-annotation-for-tdx
runtime: tdx: Allow default_{cpu,memory} annotations
2024-05-22 12:07:28 +02:00
Xuewei Niu
d31616cec3 runtime-rs: Remove obsoleted dial_timeout config
The `dial_timeout` works fine for Runtime-go, but is obsoleted in
Runtime-rs.

When the pod cannot connect to the Agent upon starting, we need to adjust
the `reconnect_timeout_ms` to increase the number of connection attempts to
the Agent.

Fixes: #9688

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-05-22 17:57:05 +08:00
Jakub Ledworowski
fc680139e5 runtime: Enable connection to Quote Generation Service (QGS)
For the TD attestation to work the connection to QGS on the host is needed.
By default QGS runs on vsock port 4050, but can be modified by the host owner.
Format of the qemu object follows the SocketAddress structure, so it needs to be provided in the JSON format, as in the example below:
-object '{"qom-type":"tdx-guest","id":"tdx","quote-generation-socket":{"type":"vsock","cid":"2","port":"4050"}}'

Fixes: #9497
Signed-off-by: Jakub Ledworowski <jakub.ledworowski@intel.com>
2024-05-22 11:16:24 +02:00
Alex Lyn
0331859740 Merge pull request #9642 from gkurz/drop-unused-knobs-qemu-rs
runtime-rs: Drop some useless QEMU arguments
2024-05-22 16:13:14 +08:00
Alex Lyn
ce030d1804 Merge pull request #9641 from cmaf/runtime-resize-mem-1
runtime: Add missing check in ResizeMemory for CH
2024-05-22 14:05:30 +08:00
Alex Lyn
b7af00be2a Merge pull request #9624 from cncal/bugfix_duplicated_devices
runtime: fix duplicated devices requested to the agent
2024-05-22 12:45:46 +08:00
Steve Horsman
f41f642b90 Merge pull request #9635 from kata-containers/dependabot/go_modules/src/runtime/go_modules-f0df977846
build(deps): bump github.com/containerd/containerd from 1.7.11 to 1.7.16 in /src/runtime in the go_modules group across 1 directory
2024-05-21 21:19:32 +01:00
Steve Horsman
9b0ed3dfa7 Merge pull request #9657 from ajaypvictor/remote-hyp-annotations
runtime: Disable number of cpu comparison on remote hypervisor scenario
2024-05-21 21:19:12 +01:00
Hyounggyu Choi
92101fc61f Merge pull request #9658 from BbolroC/migrate-vfio-ap-test
CI: Migrate vfio-ap test files from tests repo
2024-05-21 20:21:09 +02:00
Lei Huang
b0a91b0d13 kata-agent: update env PCIDEVICE_<prefix>_<resource-name>_INFO
The new version of sriov-network-device-plugin adds an env
`PCIDEVICE_<prefix>_<resource-name>_INFO`, which has a json
value; kata-agent can't parse it as env
`PCIDEVICE_<prefix>_<resource-name>` which has value in format
"DDDD:BB:SS.F".

This change updates env `PCIDEVICE_<prefix>_<resource-name>_INFO`.

Signed-off-by: Lei Huang <leih@nvidia.com>
2024-05-21 10:46:41 -07:00
stevenhorsman
db4818fe1d ci: cache: Enforce tag length limit
Container tags can be a maximum of 128 characters long
so calculate the length of the arch suffix and then restrict
the tag to this length subtracted from 128

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-21 18:03:45 +01:00
Gabriela Cervantes
c9e91db16f tests: Fix indentation in confidential common script
This PR fixes the indentation in the confidential common script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-21 16:33:46 +00:00
stevenhorsman
d6afd77eae ci: cache: Update agent cache to use the full commit hash
- Previously I copied the logic that abbreviated the commit hash
from the versioning, but looking at our versions.yaml the clear pattern
is that when pointing at commits of dependencies we use the full
commit hash, not the abbreviated one, so for consistency I think we should
do the same with the components that we make available

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-21 16:51:16 +01:00
stevenhorsman
d46b6a3879 ci: cache: Add arch suffix to all cache tags
As we have multi-arch builds for nearly all components, we want to ensure
that all the cache tags we set have the architecture suffix, not just the
`TARGET_BRANCH` one.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-21 11:25:07 +01:00
stevenhorsman
865fa9da15 runtime: Resolve go static-checks failure
Remove `rand.Seed` call to resolve the following failure:
```
rand.Seed is deprecated: As of Go 1.20 there is no reason to call Seed with a random value.
```

The go rand.Seed docs: https://pkg.go.dev/math/rand@go1.20#Seed
back this up and states:
> If Seed is not called, the generator is seeded randomly at program startup.
so I believe we can just delete the call.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-21 11:08:59 +01:00
Fabiano Fidêncio
abf52420a4 runtime: tdx: Allow default_{cpu,memory} annotations
For now, let's allow the users to set the default_cpu and default_memory
when using TDX, as they may hit issues related to the size of the
container image that must be pulled and unpacked inside the guest,

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-21 10:26:39 +02:00
stevenhorsman
75a201389d runtime: update go version in go.mod
- Make due to us bumping the golang version used in our CI
but `make vendor` fails without the go version in the runtime go.mod
being increased, so update this and run go mod tidy

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-21 09:11:46 +01:00
dependabot[bot]
735185b15c build(deps): bump github.com/containerd/containerd
Bumps the go_modules group with 1 update in the /src/runtime directory: [github.com/containerd/containerd](https://github.com/containerd/containerd).


Updates `github.com/containerd/containerd` from 1.7.11 to 1.7.16
- [Release notes](https://github.com/containerd/containerd/releases)
- [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md)
- [Commits](https://github.com/containerd/containerd/compare/v1.7.11...v1.7.16)

---
updated-dependencies:
- dependency-name: github.com/containerd/containerd
  dependency-type: direct:production
  dependency-group: go_modules
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-21 09:11:46 +01:00
Ajay Victor
abe607b0c7 runtime: Disable number of cpu comparison on remote hypervisor scenario
Fixes https://github.com/kata-containers/kata-containers/issues/9238

Signed-off-by: Ajay Victor <ajvictor@in.ibm.com>
2024-05-21 13:34:21 +05:30
dependabot[bot]
01868b2849 ---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
  dependency-group: go_modules
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-20 22:06:41 +00:00
Fabiano Fidêncio
8879e3bc45 Merge pull request #9452 from GabyCT/topic/tdxcoco
gha: Add support to install KBS to k8s TDX GHA workflow
2024-05-20 23:28:52 +02:00
Fabiano Fidêncio
072b929b6f Merge pull request #9660 from malt3/fix/genpolicy/namespace_empty_string
genpolicy: detect empty string in ns as default
2024-05-20 21:34:13 +02:00
Gabriela Cervantes
cfdef7ed5f tests/k8s: Use custom intel DCAP configuration
This PR adds the use of custom Intel DCAP configuration when
deploying the KBS.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-20 18:44:57 +00:00
Gabriela Cervantes
cace2fd340 metrics: Improve variable definition in memory usage script
This PR improves general format like variable definition to have
uniformity across the memory usage script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-20 16:14:59 +00:00
Fabiano Fidêncio
97056b017d Merge pull request #9675 from stevenhorsman/release-build-tarballs-inherit-secrets
gha: release: Set inherit secrets on tarball builds
2024-05-20 18:06:38 +02:00
Fabiano Fidêncio
b8b3bcc492 Merge pull request #9671 from bikesheddev/fix/kata-deploy-unbound-variable
fix: kata-deploy.sh VERSION_ID unbound-variable
2024-05-20 17:22:55 +02:00
Fabiano Fidêncio
94cff3f74e Merge pull request #9315 from fidencio/topic/adapt-TEEs-for-shared_fs-none
TEEs: Use `shared_fs=none` for TDX
2024-05-20 17:17:36 +02:00
Fabiano Fidêncio
cffeb0ffb8 Merge pull request #9673 from fidencio/topic/revert-aks-workaround
Revert "ci: azure: Workaround azure cli installation script"
2024-05-20 16:16:55 +02:00
stevenhorsman
f271983aeb gha: release: Set inherit secrets on tarball builds
Now we have updated the release builds to push
artefacts to
our registry for the release, so we can cache the images, we need to
set `secrets: inherit` for all architecture's tarball builds
so that we can log into quay.io and ghcr in those steps

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-20 14:19:17 +01:00
Fabiano Fidêncio
25c9cf32ff Revert "ci: azure: Workaround azure cli installation script"
This reverts commit 5ff53e4d1c, as the
script was fixed by MSFT, at least according to:
https://github.com/Azure/azure-cli/issues/28984

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-20 14:38:46 +02:00
vac (Brendan)
d812007b99 kata-deploy: Fix unbound VERSION_ID
VERSION_ID is not guaranteed to be specified in os-release, this
makes kaka-deploy breaks in rolling distros like arch linux and void
linux.

Note that operating system vendors may choose not to provide
version information, for example to accommodate for rolling releases.
In this case, VERSION and VERSION_ID may be unset.
Applications should not rely on these fields to be set.

Signed-off-by: vac <dot.fun@protonmail.com>
2024-05-20 19:48:31 +08:00
Tim Zhang
857d2bbc8e agent: Fix ctr exec stuck problem
Fixes: #9532

Close stdin when write_stdin receives data of length 0.

Stop call notify_term_close() in close_stdin, because it could
discard stdout unexpectedly.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2024-05-20 14:52:14 +08:00
Fabiano Fidêncio
e8ebe18868 tests: k8s: tdx: Skip liveness probe test
This test doesn't fail with the guest image pulling, but it for sure
should. :-)

We can see in the bats logs, something like:
```
Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  31s               default-scheduler  Successfully assigned kata-containers-k8s-tests/liveness-exec to 984fee00bd70.jf.intel.com
  Normal   Pulled     23s               kubelet            Successfully pulled image "quay.io/prometheus/busybox:latest" in 345ms (345ms including waiting)
  Normal   Started    21s               kubelet            Started container liveness
  Warning  Unhealthy  7s (x3 over 13s)  kubelet            Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
  Normal   Killing    7s                kubelet            Container liveness failed liveness probe, will be restarted
  Normal   Pulled     7s                kubelet            Successfully pulled image "quay.io/prometheus/busybox:latest" in 389ms (389ms including waiting)
  Warning  Failed     5s                kubelet            Error: failed to create containerd task: failed to create shim task: the file /bin/sh was not found: unknown
  Normal   Pulling    5s (x3 over 23s)  kubelet            Pulling image "quay.io/prometheus/busybox:latest"
  Normal   Pulled     4s                kubelet            Successfully pulled image "quay.io/prometheus/busybox:latest" in 342ms (342ms including waiting)
  Normal   Created    4s (x3 over 23s)  kubelet            Created container liveness
  Warning  Failed     3s                kubelet            Error: failed to create containerd task: failed to create shim task: failed to mount /run/kata-containers/f0ec86fb156a578964007f7773a3ccbdaf60023106634fe030f039e2e154cd11/rootfs to /run/kata-containers/liveness/rootfs, with error: ENOENT: No such file or directory: unknown
  Warning  BackOff    1s (x3 over 3s)   kubelet            Back-off restarting failed container liveness in pod liveness-exec_kata-containers-k8s-tests(b1a980bf-a5b3-479d-97c2-ebdb45773eff)
```

Let's skip it for now as we have an issue opened to track it down:
https://github.com/kata-containers/kata-containers/issues/9665

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 21:59:29 +02:00
Fabiano Fidêncio
a2c70222a8 tests: k8s: tdx: Skip initContainerd shared vol test
This is another one that is related to initContainers not being properly
handled with the guest image pulling.

Let's skip it for now as we have
https://github.com/kata-containers/kata-containers/issues/9668 to track
it down.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 20:58:45 +02:00
Fabiano Fidêncio
9d56145499 tests: k8s: tdx: Skip volume related tests
Similarly to firecracker, which doesn't have support for virtio-fs /
virtio-9p, TDX used with `shared_fs=none` will face the very same
limitations.

The tests affected are:
* k8s-credentials-secrets.bats
* k8s-file-volume.bats
* k8s-inotify.bats
* k8s-nested-configmap-secret.bats
* k8s-projected-volume.bats
* k8s-volume.bats

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 19:38:49 +02:00
Fabiano Fidêncio
606a62a0a7 tests: k8s: tdx: Skip "Setting sysctl" test
This test fails when using `shared_fs=none` with the nydus-snapshotter,
and we're tracking the issue here:
https://github.com/kata-containers/kata-containers/issues/9666

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 19:38:38 +02:00
Fabiano Fidêncio
937b2d5806 tests: k8s: tdx: Skip "Kill all processes in container" test
This test fails when using `shared_fs=none` with the nydus snapshotter,
and we're tracking the issue here:
https://github.com/kata-containers/kata-containers/issues/9664

For now, let's have it skipped.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:51:14 +02:00
Fabiano Fidêncio
03ce41b743 tests: k8s: tdx: Skip "Check custom dns" test
The test has been failing on TDX for a while, and an issue has been
created to track it down, see:
https://github.com/kata-containers/kata-containers/issues/9663

For now, let's have it skipped.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:51:14 +02:00
Fabiano Fidêncio
1a8a4d046d tests: k8s: setup: Improve / Fix logs
Let's make sure the logs will print the correct annotation and its
value, instead of always mentioning "kernel" and "initrd".

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:51:14 +02:00
Fabiano Fidêncio
3f38309c39 tests: k8s: tdx: Stop running k8s-guest-pull-image.bats
We're doing that as all tests are going to be running with
`shared_fs=none`, meaning that we don't need any specific test for this
case anymore.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:51:00 +02:00
Fabiano Fidêncio
e84619d54b tests: k8s: tdx: Add add_runtime_handler_annotations function
This function will set the needed annotation for enforcing that the
image pull will be handled by the snapshotter set for the runtime
handler, instead of using the default one.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:49:07 +02:00
Fabiano Fidêncio
f2de259387 runtime: tdx: Use shared_fs=none
We shouldn't be using 9p, at all, with TEEs, as off right now we have no
way to ensure the channels are encrypted.  The way to work this around
for now is using guest pull, either with containerd + nydus snapshotter
or with CRI-O; or even tardev snapshotter for pulling on the host (which
is the approach used by MSFT).

This is only done for TDX for now, leaving the generic, AMD, and IBM
related stuff for the folks working on those to switch and debug
possible issues on their environment.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:47:09 +02:00
Fabiano Fidêncio
5b257685d9 Merge pull request #9662 from dborquez/fix_launchtimes_timestamp_generation
Fix launch times timestamp generation.
2024-05-18 21:11:09 +02:00
Fabiano Fidêncio
94786dc939 Merge pull request #9659 from stevenhorsman/remove-non-printable-tag-characters
ci: cache: Filter out non-printable characters from tag
2024-05-18 14:47:07 +02:00
Fabiano Fidêncio
874cda0e51 Merge pull request #9655 from BbolroC/add-arch-to-initramfs
CI: Append arch type to initramfs-cryptsetup image
2024-05-18 14:31:57 +02:00
Malte Poll
babdab9078 genpolicy: detect empty string in ns as default
In Kubernetes, the following values for namespace are equivalent and all refer to the default namespace:

- ` ` (namespace field missing)
- `namespace: ""` (namespace field is the empty string)
- `namespace: "default"`(namespace field has the explicit value `default`)

Genpolicy currently does not handle the empty string case correctly.

Signed-Off-By: Malte Poll <1780588+malt3@users.noreply.github.com>
2024-05-18 12:44:59 +02:00
Fabiano Fidêncio
cbfdc70a55 Merge pull request #9613 from fidencio/topic/skip-pull-image-tests-on-tees-part-II
tests: pull-image: Only skip tests for TEEs
2024-05-18 03:31:38 +02:00
Archana Shinde
0e28e904e0 kata-manager: Install cni for containerd
When just containerd is installed without installing nerdctl,
cni plugins are missing from the installation.
containerd tarball does not include cni plugin files.
Hence install cni plugins separately for containerd.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-05-18 00:19:57 +00:00
Archana Shinde
d23d58a484 kata-manager: Copy cni files under /opt/cni
nerdctl requires cni plugins to be installed in /opt/cni/bin
Without bridge plugin installed, it is not possible to run a
container with nerdctl.
The downloaded nerdctl tarball contains cni plugin files, but are
extracted under /usr/local/libexec.
Copy extracted tarball cni files under /usr/local/libexec
to /opt/cni/bin

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-05-18 00:16:48 +00:00
David Esparza
938d3dc430 metrics: fix timestamps generation from launch times test.
Use `eval` to process the `date` command along with its parameters,
thus avoiding misinterpreting the parameters as commands.

Fixes: #9661

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-05-17 14:44:41 -06:00
David Esparza
bae377b42a metrics: determine the realpath of kata-shim component.
Determine the realpath of kata-shim avoiding the check fails
in case the kata-shim is not a symlink, as was happening prior
to this commit.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-05-17 14:40:02 -06:00
Fabiano Fidêncio
5ff53e4d1c ci: azure: Workaround azure cli installation script
This is done in order to work around
https://github.com/Azure/azure-cli/issues/28984, following a suggestion
on the very same issue.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 20:28:24 +02:00
stevenhorsman
42fddb5530 ci: cache: Filter out non-printable characters from tag
- The tags have a trailing non-printable character, which results
in our cache tags having a trailing underscore e.g. `ghcr.io/kata-containers/cached-artefacts/agent:ce24e9835_`
For ease of use of these cached components, we should strip off the trailing underscore.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-17 14:16:40 +01:00
Hyounggyu Choi
961735a181 CI: Migrate vfio-ap test files from tests repo
An e2e test for `vfio-ap` has been conducted internally in IBM
due to the lack of publicly available test machines equipped
with a required crypto device.
The test is performed by the `tests` repository:
(i.e. 772105b560/Makefile (L144))

The community is working to integrate all tests into the `kata-containers`
repository, so the `vfio-ap` test should be part of that effort.

This commit moves a test script and Dockerfile for a test image from
the `tests` repository. We do not rename the script to `gha-run.sh`
because it is not executed by Github Actions' workflow.

You can check the test results from the s390x nightly test with the migrated files here:
https://github.com/kata-containers/kata-containers/actions/runs/9123170010/job/25100026025

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-05-17 14:59:16 +02:00
stevenhorsman
a92defdffe tests: pull-image: Remove skips
Given that we think the containerd -> snapshotter image cache
problems have been resolved by bumping to nydus-snapshotter v0.3.13
we can try removing the skips to test this out

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-17 12:39:57 +02:00
stevenhorsman
7ac302e2d8 tests: Slacken guest pull rootfs count assert
- We previously have an expectation for the pause rootfs
to be pull on the host when we did a guest pull. We weren't
really clear why, but it is plausible related to the issues we had
with containerd and nydus caching. Now that is fixed we can begin
to address this with setting shared_fs=none, but let's start with
updating the rootfs host check to be not higher than expected

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-17 12:39:56 +02:00
Fabiano Fidêncio
67ff58251d tests: confidential_common: Remove unneeded ensure_yq call
This test is called from `tests/integration/run_kuberentes_tests.sh`,
which already ensures that yq is installed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
Fabiano Fidêncio
cc874ad5e1 tests: confidential: Ensure those only run on TEEs
Running those with the non-TEE runtime classes will simply fail.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
Fabiano Fidêncio
2bc5b1bba2 tests: pull-image: Only skip tests for TEEs
On 1423420, I've mistakenly disabled the tests entirely, for both
non-TEEs and TEEs.

This happened as I didn't realise that `confidential_setup` would take
non-TEEs into consideration. :-/

Now, let me follow-up on that and make sure that the tests will be
running on non-TEEs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
Fabiano Fidêncio
d875f89fa2 tests: Add is_confidential_hardware()
This function is a helper to check whether the KATA_HYPERVISOR being
used is a confidential hardware (TEE) or not, and we can use it to
skip or only run tests on those platforms when needed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
Fabiano Fidêncio
4a04a1f2ae tests: Re-work confidential_setup()
Let's rename it to `is_confidential_runtime_class`, and adapt all the
places where it's called.

The new name provides a better description, leading to a better
understanding of what the function really does.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
Pavel Mores
b9febc4458 runtime-rs: document architecture & implementation conventions in qemu-rs
Implementation of QemuCmdLine has a fairly uniform and repetitive structure
that's guided by a set of conventions.  These conventions have however been
mostly implicit so far, leading to a superfluous and annoying
request/force-push churn during qemu-rs PR reviews.

This commit aims to make things explicit so that contributors can take them
into account before an initial PR submission.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-05-17 12:21:44 +02:00
Hyounggyu Choi
3917930a76 CI: Append arch type to initramfs-cryptsetup image
This commit is to append an arch type to the initramfs-cryptsetup image
to prevent a wrong arch image from being pulled on a different arch host.

Fixes: #9654

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-05-17 11:42:49 +02:00
Steve Horsman
9a6d8d8330 Merge pull request #9650 from stevenhorsman/caching-tagging-update-partIII
Caching tagging update part iii
2024-05-17 09:09:15 +01:00
stevenhorsman
ce24e98358 ci: cache: Add tag character filtering
- Container image tags can only contain alphanumeric, period,
hyphen and underscore characters, so convert characters outside
of these to be underscores, to avoid having invalid tag failures

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-16 21:38:07 +01:00
stevenhorsman
a98b1e3afb ci: cache: Integrate tagging updates with recent changes
Recently the extra gpu caching was added, unfortunately when I
rebased I ended up with both the new tagging logic and old logic.
Let's try and integrate them properly to avoid doing the push twice.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-16 21:38:07 +01:00
Lukáš Doktor
f994f79078 ci.ocp: Add steps to reproduce/bisect CI runs
in case the upstream CI fails it's useful to pin-point the PR that
caused the regression. Currently openshift-ci does not allow doing that
from their setup but we can mimic the setup on our infrastructure and
use the available kata-deploy-ci images to find the first failing one.
To help with that add a few helper scripts and a howto.

Fixes: #9228

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-05-16 20:20:05 +02:00
Lukáš Doktor
a556ad7e01 ci.ocp: Document how to run openshift-tests with kata
document the ocp pipeline.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-05-16 20:15:32 +02:00
Lukáš Doktor
ea081bd882 ci.ocp: Add webhook cleanup
cleanup the webhook resources as well.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-05-16 20:15:31 +02:00
David Esparza
029a6de52b Merge pull request #9615 from GabyCT/topic/fixlaunchtime
metrics: Update launch times script
2024-05-16 11:28:44 -06:00
Steve Horsman
33e6b241ba Merge pull request #9647 from stevenhorsman/fix-artefact-tags-unbound-variable
ci: cache: Fix unbound variable
2024-05-16 16:22:47 +01:00
stevenhorsman
9d9487b17f ci: cache: Fix unbound variable
Now we have the workflow updated and can test the changes in caching
we've hit an error:
```
line 1180: artefact_tag: unbound variable
```
so we need to fix that up. Sorry for missing this before.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-16 14:30:32 +01:00
Steve Horsman
03c08583c3 Merge pull request #9644 from stevenhorsman/fix-broken-workflow
workflow: Remove if from env conditional
2024-05-16 14:13:25 +01:00
stevenhorsman
f7fd2f9a5d workflow: Fix problems with build-asset workflows
- It appears like the `if` isn't required when setting env as a
conditional
- `inputs.stage` over input.stage
- Swap matrix.component to matrix.asset

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-16 11:51:46 +01:00
Steve Horsman
d8468cb178 Merge pull request #9550 from stevenhorsman/tag-component-caches
Tag component caches
2024-05-16 11:05:18 +01:00
Steve Horsman
b31ff09b8d Merge pull request #9617 from zvonkok/artefact-repository
deploy: Add artefact repository
2024-05-16 10:41:23 +01:00
Fabiano Fidêncio
4d073c837d Merge pull request #9636 from ChengyuZhu6/snapshotter
version: Bump nydus snapshotter to v0.13.13
2024-05-16 02:54:53 +02:00
GabyCT
05cc8fae5e Merge pull request #9610 from GabyCT/topic/fixrwfio
metrics: Fix random write value for FIO
2024-05-15 17:44:41 -06:00
Gabriela Cervantes
793a02600a metrics: Fix random write value for clh for FIO
This PR decreases the random write value for clh for FIO.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-15 22:13:10 +00:00
Chelsea Mafrica
5d2af555da runtime: Add missing check in ResizeMemory for CH
ResizeMemory for Cloud Hypervisor is missing a check for the new
requested memory being greater than the max hotplug size after
alignment. Add the check, and since an earlier check for this
setsrequested memory to the max hotplug size, do the same in the
post-alignment check.

Fixes #9640

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2024-05-15 11:29:18 -07:00
GabyCT
d752f0aa4f Merge pull request #9627 from GabyCT/topic/ghacomk8s
gha: Fix indentation in gha run k8s common
2024-05-15 11:55:14 -06:00
Greg Kurz
bd6420e0cc runtime-rs: Drop some useless QEMU arguments
All these settings are hardcoded as `false` and result in
no extra options on the QEMU command line, like the go
runtime does. There actually not needed :
- we're never going to ask QEMU to survive a guest shutdown
- we're never going to run QEMU daemonized since it prevents
  log collection
- we're never going to ask QEMU to start with the guest stopped

No need to keep this code around then.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-05-15 18:33:43 +02:00
stevenhorsman
7f41329010 ci: cache: Optional tag components with tags
- CoCo wants to use the agent and coco-guest-components cached artifacts
so tag them with a helpful version, so make these easier to get

Signed-off-by: stevenhorsman <steven@uk.ibm.com>

 No commands remaining.
2024-05-15 16:56:40 +01:00
stevenhorsman
9999971656 release: Move component's don't ship logic
- We don't want to ship certain components (agent, coco-guest-components)
as part of the release, but for other consumers it's useful to be able to pull in the components
from oras, so rather than not building them, just don't upload it as part of the release.
- Also make the archs all consistent on not shipping the agent

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-15 16:55:55 +01:00
stevenhorsman
040e6cdf12 gha: release: Set RELEASE env
- Set RELEASE env to 'yes', or 'no', based on if the stage
passed in was 'release', so we can use it in the build scripts

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-15 16:55:55 +01:00
stevenhorsman
d93156d84d gha: release: Push artifacts to registry on release
For other projects (e.g. CoCo projects) being able to
access the released versions of components is helpful,
so push these during the release process

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-15 16:55:55 +01:00
Steve Horsman
19ca1a6656 Merge pull request #9638 from BbolroC/use-fixed-len-git-hash-explicitly
CI: Use `--abbrev=9` explicitly for abbreviated commit hash
2024-05-15 16:55:07 +01:00
GabyCT
64b915b86e Merge pull request #9438 from GabyCT/topic/addnegativetest
tests: Add k8s negative policy test
2024-05-15 08:52:57 -06:00
Hyounggyu Choi
e075150fbe CI: Use --abbrev=9 explicitly for abbreviated commit hash
A length of the result of `git log -1 --pretty=format:%h` could vary
over different CI systems, highly likely messing up their caching
mechanisms.

This commit is to use an option `--abbrev=9` to standardize the length
to 9 characters for CI.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-05-15 14:22:07 +02:00
Zvonko Kaiser
117e2f2ecc Merge pull request #9618 from zvonkok/nvidia-rootfs-#1
gpu: Add build targets for GPU rootfs initrd/image
2024-05-15 13:30:42 +02:00
ChengyuZhu6
d48c7ec979 version: Bump nydus snapshotter to v0.13.13
Bump nydus snapshotter to v0.13.13 to fix the gap when switching
different snapshotters in guest pull.

Fixes: #8407

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-05-15 12:21:01 +08:00
Gabriela Cervantes
f20a44bba3 gha: Fix indentation in gha run k8s common
This PR fixes the indentation in gha run k8s common script
to have uniformity across the script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-13 20:07:47 +00:00
cncal
232db2d906 runtime: fix duplicated devices requested to the agent
By default, when a container is created with the `--privileged` flag,
all devices in `/dev` from the host are mounted into the guest. If
there is a block device(e.g. `/dev/dm`) followed by a generic
device(e.g. `/dev/null`),two identical block devices(`/dev/dm`)
would be requested to the kata agent causing the agent to exit with error:

> Conflicting device updates for /dev/dm-2

As the generic device type does not hit any cases defined in `switch`,
the variable `kataDevice` which is defined outside of the loop is still
the value of the previous block device rather than `nil`. Defining `kataDevice`
in the loop fixes this bug.

Signed-off-by: cncal <flycalvin@qq.com>
2024-05-12 16:38:37 +08:00
Zvonko Kaiser
693e307f72 deploy: Add artefact repository
New env var so everyone can test the PUSH_TO_REGISTRY feature

export PUSH_TO_REGISTRY=yes
export ARTEFACT_REGISTRY=quay.io
export ARTEFACT_REPOSITORY=my-fancy-kata-containers
export ARTEFACT_REGISTRY_USERNAME=zvonkok
export ARTEFACT_REGISTRY_PASSWORD=<super-secret>

make ...-tarball

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-10 16:41:52 +00:00
Zvonko Kaiser
85374f55d2 gpu: Add build targets for GPU rootfs initrd/image
Preparation for complete GPU rootfs build step #1/#N

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-10 09:47:21 +00:00
Zvonko Kaiser
8ec2cc9c0d threat-model: Add VFIO, ACPI and KVM/VMM threat-model descriptions
We're missing several topics in the current threat model lets update.

Fixes: #8943

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-10 07:18:44 +00:00
Gabriela Cervantes
80e551ea74 metrics: Update launch times script
This PR updates the launch times scripts by improving the variable
definition as well as trying to use the same format across all the script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-09 21:29:32 +00:00
Gabriela Cervantes
2fb406ed3a metrics: Fix random write value for FIO
This PR fixes the random write value for FIO for qemu by decreasing it
to avoid the random failures of the GHA CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-08 18:54:41 +00:00
Gabriela Cervantes
b54dc26073 gha: Enable uninstall kbs client function for coco gha workflow
This PR enables the uninstall kbs client function for coco gha tdx
workflow.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-06 15:55:24 +00:00
Gabriela Cervantes
aaf9b54d97 gha: Add support to install KBS to k8s TDX GHA workflow
This PR adds support to install KBS to k8s TDX GHA workflow in
order to run confidential attestation tests.

Fixes #9451

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-06 15:42:17 +00:00
Gabriela Cervantes
506e17a60d tests: Add k8s negative policy test
This PR adds a k8s negative policy test to the confidential attestation
bats test.

Fixes #9437

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-06 15:28:54 +00:00
cncal
9caa7beb1f runtime: make kata-runtime check error more understandable
If device /dev/kvm does not exist, kata-runtime check would fail with
an ambiguous error messae 'no such file or directory'. I added a little
more details to make it understandable and it will belike:

```
ERRO[0000] cannot open kvm device: no such file or directory  arch=arm64 check-type=full device=/dev/kvm name=kata-runtime pid=2849085 source=runtime
ERRO[0000] no such file or directory                          arch=arm64 name=kata-runtime pid=2849085 source=runtime
no such file or directory
```

Signed-off-by: cncal <flycalvin@qq.com>
2024-05-03 08:29:08 +08:00
Zvonko Kaiser
63dff9a9f2 kata-agent: CreateContainer Hook
Fixes: #9267

The doc states we have support for all lifecycle hooks. There are still some missing.
This is the first issue regarding the CreateContainer hook which is run before pivot_root but after prestart and createruntime

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-03-13 09:24:25 +00:00
524 changed files with 33068 additions and 9161 deletions

24
.github/actionlint.yaml vendored Normal file
View File

@@ -0,0 +1,24 @@
# Copyright (c) 2024 Red Hat
#
# SPDX-License-Identifier: Apache-2.0
#
# Configuration file with rules for the actionlint tool.
#
self-hosted-runner:
# Labels of self-hosted runner that linter should ignore
labels:
- arm64-builder
- garm-ubuntu-2004
- garm-ubuntu-2004-smaller
- garm-ubuntu-2204
- garm-ubuntu-2304
- garm-ubuntu-2304-smaller
- garm-ubuntu-2204-smaller
- k8s-ppc64le
- metrics
- ppc64le
- sev
- sev-snp
- s390x
- s390x-large
- tdx

View File

@@ -8,7 +8,7 @@
script_dir=$(dirname "$(readlink -f "$0")")
parent_dir=$(realpath "${script_dir}/../..")
cidir="${parent_dir}/ci"
source "${cidir}/lib.sh"
source "${cidir}/../tests/common.bash"
cargo_deny_file="${script_dir}/action.yaml"

View File

@@ -23,7 +23,7 @@ jobs:
matrix:
containerd_version: ['lts', 'active']
vmm: ['clh', 'dragonball', 'qemu', 'stratovirt', 'cloud-hypervisor', 'qemu-runtime-rs']
runs-on: garm-ubuntu-2204-smaller
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
GOPATH: ${{ github.workspace }}
@@ -62,7 +62,7 @@ jobs:
matrix:
containerd_version: ['lts', 'active']
vmm: ['clh', 'cloud-hypervisor', 'dragonball', 'qemu', 'stratovirt']
runs-on: garm-ubuntu-2204-smaller
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
GOPATH: ${{ github.workspace }}
@@ -104,7 +104,7 @@ jobs:
matrix:
containerd_version: ['lts', 'active']
vmm: ['clh', 'qemu', 'dragonball', 'stratovirt']
runs-on: garm-ubuntu-2204-smaller
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
GOPATH: ${{ github.workspace }}
@@ -138,7 +138,7 @@ jobs:
run: bash tests/integration/nydus/gha-run.sh run
run-runk:
runs-on: garm-ubuntu-2204-smaller
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: lts
steps:
@@ -168,6 +168,37 @@ jobs:
- name: Run runk tests
timeout-minutes: 10
run: bash tests/integration/runk/gha-run.sh run
run-stdio:
runs-on: garm-ubuntu-2204-smaller
env:
CONTAINERD_VERSION: lts
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/stdio/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/stdio/gha-run.sh install-kata kata-artifacts
- name: Run stdio tests
timeout-minutes: 10
run: bash tests/integration/stdio/gha-run.sh
run-tracing:
strategy:
@@ -176,6 +207,9 @@ jobs:
vmm:
- clh # cloud-hypervisor
- qemu
# TODO: enable me when https://github.com/kata-containers/kata-containers/issues/9763 is fixed
# TODO: Transition to free runner (see #9940).
if: false
runs-on: garm-ubuntu-2204-smaller
env:
KATA_HYPERVISOR: ${{ matrix.vmm }}
@@ -211,7 +245,13 @@ jobs:
strategy:
fail-fast: false
matrix:
vmm: ['clh', 'qemu']
vmm:
- clh
- qemu
# TODO: enable with clh when https://github.com/kata-containers/kata-containers/issues/9764 is fixed
# TODO: enable with qemu when https://github.com/kata-containers/kata-containers/issues/9851 is fixed
# TODO: Transition to free runner (see #9940).
if: false
runs-on: garm-ubuntu-2304
env:
GOPATH: ${{ github.workspace }}
@@ -251,7 +291,7 @@ jobs:
vmm:
- clh
- qemu
runs-on: garm-ubuntu-2304-smaller
runs-on: ubuntu-22.04
env:
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
@@ -294,7 +334,10 @@ jobs:
- dragonball
- qemu
- cloud-hypervisor
runs-on: garm-ubuntu-2304-smaller
# TODO: enable with clh when https://github.com/kata-containers/kata-containers/issues/9852 is fixed
exclude:
- vmm: clh
runs-on: ubuntu-22.04
env:
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
@@ -326,7 +369,9 @@ jobs:
run: bash tests/integration/nerdctl/gha-run.sh run
- name: Collect artifacts ${{ matrix.vmm }}
if: always()
run: bash tests/integration/nerdctl/gha-run.sh collect-artifacts
continue-on-error: true
- name: Archive artifacts ${{ matrix.vmm }}
uses: actions/upload-artifact@v4

View File

@@ -111,3 +111,4 @@ jobs:
${{ matrix.command }}
env:
RUST_BACKTRACE: "1"
SKIP_GO_VERSION_CHECK: "1"

View File

@@ -60,14 +60,8 @@ jobs:
stage:
- ${{ inputs.stage }}
exclude:
- asset: agent
stage: release
- asset: cloud-hypervisor-glibc
stage: release
- asset: pause-image
stage: release
- asset: coco-guest-components
stage: release
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
@@ -93,7 +87,7 @@ jobs:
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
@@ -102,8 +96,10 @@ jobs:
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
if: ${{ matrix.stage != 'release' || (matrix.asset != 'agent' && matrix.asset != 'coco-guest-components' && matrix.asset != 'pause-image') }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-amd64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

View File

@@ -39,13 +39,7 @@ jobs:
- rootfs-initrd
- shim-v2
- virtiofsd
stage:
- ${{ inputs.stage }}
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
@@ -70,7 +64,7 @@ jobs:
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
@@ -79,8 +73,10 @@ jobs:
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
if: ${{ inputs.stage != 'release' || matrix.asset != 'agent' }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-arm64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

View File

@@ -36,16 +36,11 @@ jobs:
stage:
- ${{ inputs.stage }}
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- name: Prepare the self-hosted runner
run: |
${HOME}/scripts/prepare_runner.sh
sudo rm -rf $GITHUB_WORKSPACE/*
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
@@ -70,8 +65,7 @@ jobs:
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
sudo chown -R $(id -u):$(id -g) "kata-build"
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
@@ -80,8 +74,10 @@ jobs:
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
if: ${{ inputs.stage != 'release' || matrix.asset != 'agent' }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-ppc64le-${{ matrix.asset }}${{ inputs.tarball-suffix }}

View File

@@ -39,13 +39,6 @@ jobs:
- rootfs-initrd-confidential
- shim-v2
- virtiofsd
stage:
- ${{ inputs.stage }}
exclude:
- asset: pause-image
stage: release
- asset: coco-guest-components
stage: release
steps:
- name: Take a pre-action for self-hosted runner
run: ${HOME}/script/pre_action.sh ubuntu-2204
@@ -74,8 +67,7 @@ jobs:
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
sudo chown -R $(id -u):$(id -g) "kata-build"
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
@@ -84,8 +76,10 @@ jobs:
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
if: ${{ inputs.stage != 'release' || (matrix.asset != 'agent' && matrix.asset != 'coco-guest-components' && matrix.asset != 'pause-image') }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-s390x-${{ matrix.asset }}${{ inputs.tarball-suffix }}

View File

@@ -43,7 +43,7 @@ jobs:
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
build-kata-static-tarball-ppc64le:
uses: ./.github/workflows/build-kata-static-tarball-ppc64le.yaml
with:
@@ -62,7 +62,7 @@ jobs:
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
publish-kata-deploy-payload-ppc64le:
needs: build-kata-static-tarball-ppc64le
uses: ./.github/workflows/publish-kata-deploy-payload-ppc64le.yaml
@@ -113,6 +113,8 @@ jobs:
file: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/Dockerfile
run-kata-deploy-tests-on-aks:
# TODO: Reenable when Azure CI budget is secured (see #9939).
if: false
needs: publish-kata-deploy-payload-amd64
uses: ./.github/workflows/run-kata-deploy-tests-on-aks.yaml
with:
@@ -125,6 +127,8 @@ jobs:
secrets: inherit
run-kata-deploy-tests-on-garm:
# TODO: Transition to free runner (see #9940).
if: false
needs: publish-kata-deploy-payload-amd64
uses: ./.github/workflows/run-kata-deploy-tests-on-garm.yaml
with:
@@ -203,7 +207,8 @@ jobs:
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
run-k8s-tests-on-ppc64le:
needs: publish-kata-deploy-payload-ppc64le
uses: ./.github/workflows/run-k8s-tests-on-ppc64le.yaml

View File

@@ -0,0 +1,31 @@
name: Cleanup dangling Azure resources
on:
schedule:
- cron: "0 */6 * * *"
workflow_dispatch:
jobs:
cleanup-resources:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Log into Azure
env:
AZ_APPID: ${{ secrets.AZ_APPID }}
AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
run: bash tests/integration/kubernetes/gha-run.sh login-azure
- name: Install Python dependencies
run: |
pip3 install --user --upgrade \
azure-identity==1.16.0 \
azure-mgmt-resource==23.0.1
- name: Cleanup resources
env:
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
CLEANUP_AFTER_HOURS: 6 # Clean up resources created more than this many hours ago.
run: python3 tests/cleanup_resources.py

View File

@@ -26,11 +26,6 @@ jobs:
with:
fetch-depth: 0
path: ./src/github.com/${{ github.repository }}
- name: Setup
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/setup.sh
env:
GOPATH: ${{ runner.workspace }}/kata-containers
# docs url alive check
- name: Docs URL Alive Check
run: |

View File

@@ -10,7 +10,9 @@ jobs:
build-kata-static-tarball-amd64:
uses: ./.github/workflows/build-kata-static-tarball-amd64.yaml
with:
push-to-registry: yes
stage: release
secrets: inherit
kata-deploy:
needs: build-kata-static-tarball-amd64

View File

@@ -10,7 +10,9 @@ jobs:
build-kata-static-tarball-arm64:
uses: ./.github/workflows/build-kata-static-tarball-arm64.yaml
with:
push-to-registry: yes
stage: release
secrets: inherit
kata-deploy:
needs: build-kata-static-tarball-arm64

View File

@@ -10,7 +10,9 @@ jobs:
build-kata-static-tarball-ppc64le:
uses: ./.github/workflows/build-kata-static-tarball-ppc64le.yaml
with:
push-to-registry: yes
stage: release
secrets: inherit
kata-deploy:
needs: build-kata-static-tarball-ppc64le

View File

@@ -10,6 +10,7 @@ jobs:
build-kata-static-tarball-s390x:
uses: ./.github/workflows/build-kata-static-tarball-s390x.yaml
with:
push-to-registry: yes
stage: release
secrets: inherit

View File

@@ -36,6 +36,7 @@ jobs:
- clh
- dragonball
- qemu
- qemu-runtime-rs
- stratovirt
- cloud-hypervisor
instance-type:

View File

@@ -86,7 +86,9 @@ jobs:
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Collect artifacts ${{ matrix.vmm }}
if: always()
run: bash tests/integration/kubernetes/gha-run.sh collect-artifacts
continue-on-error: true
- name: Archive artifacts ${{ matrix.vmm }}
uses: actions/upload-artifact@v4

View File

@@ -27,8 +27,6 @@ jobs:
strategy:
fail-fast: false
matrix:
vmm:
- qemu
snapshotter:
- devmapper
- nydus
@@ -39,10 +37,12 @@ jobs:
pull-type: default
using-nfd: true
deploy-cmd: configure-snapshotter
vmm: qemu
- snapshotter: nydus
pull-type: guest-pull
using-nfd: false
deploy-cmd: deploy-snapshotter
vmm: qemu-coco-dev
runs-on: s390x-large
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
@@ -57,9 +57,12 @@ jobs:
SNAPSHOTTER: ${{ matrix.snapshotter }}
USING_NFD: ${{ matrix.using-nfd }}
TARGET_ARCH: "s390x"
AUTHENTICATED_IMAGE_USER: ${{ secrets.AUTHENTICATED_IMAGE_USER }}
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
steps:
- name: Take a pre-action for self-hosted runner
run: ${HOME}/script/pre_action.sh ubuntu-2204
run: |
"${HOME}/script/pre_action.sh" ubuntu-2204
- uses: actions/checkout@v4
with:
@@ -90,4 +93,4 @@ jobs:
if: always()
run: |
bash tests/integration/kubernetes/gha-run.sh cleanup-zvsi || true
${HOME}/script/post_action.sh ubuntu-2204
"${HOME}/script/post_action.sh" ubuntu-2204

View File

@@ -40,11 +40,15 @@ jobs:
DOCKER_TAG: ${{ inputs.tag }}
PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: "k3s"
KUBERNETES: "vanilla"
USING_NFD: "true"
KBS: "true"
K8S_TEST_HOST_TYPE: "baremetal"
KBS_INGRESS: "nodeport"
SNAPSHOTTER: ${{ matrix.snapshotter }}
PULL_TYPE: ${{ matrix.pull-type }}
AUTHENTICATED_IMAGE_USER: ${{ secrets.AUTHENTICATED_IMAGE_USER }}
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
steps:
- uses: actions/checkout@v4
with:
@@ -65,8 +69,20 @@ jobs:
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-tdx
- name: Uninstall previous `kbs-client`
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh uninstall-kbs-client
- name: Deploy CoCo KBS
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs
- name: Install `kbs-client`
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client
- name: Run tests
timeout-minutes: 30
timeout-minutes: 50
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete kata-deploy
@@ -77,6 +93,10 @@ jobs:
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-snapshotter
- name: Delete CoCo KBS
if: always()
run: bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs
run-k8s-tests-on-sev:
strategy:
fail-fast: false
@@ -100,6 +120,8 @@ jobs:
K8S_TEST_HOST_TYPE: "baremetal"
SNAPSHOTTER: ${{ matrix.snapshotter }}
PULL_TYPE: ${{ matrix.pull-type }}
AUTHENTICATED_IMAGE_USER: ${{ secrets.AUTHENTICATED_IMAGE_USER }}
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
steps:
- uses: actions/checkout@v4
with:
@@ -152,9 +174,13 @@ jobs:
KUBECONFIG: /home/kata/.kube/config
KUBERNETES: "vanilla"
USING_NFD: "false"
KBS: "true"
KBS_INGRESS: "nodeport"
K8S_TEST_HOST_TYPE: "baremetal"
SNAPSHOTTER: ${{ matrix.snapshotter }}
PULL_TYPE: ${{ matrix.pull-type }}
AUTHENTICATED_IMAGE_USER: ${{ secrets.AUTHENTICATED_IMAGE_USER }}
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
steps:
- uses: actions/checkout@v4
with:
@@ -175,6 +201,18 @@ jobs:
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-snp
- name: Uninstall previous `kbs-client`
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh uninstall-kbs-client
- name: Deploy CoCo KBS
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs
- name: Install `kbs-client`
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client
- name: Run tests
timeout-minutes: 30
run: bash tests/integration/kubernetes/gha-run.sh run-tests
@@ -187,6 +225,10 @@ jobs:
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-snapshotter
- name: Delete CoCo KBS
if: always()
run: bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs
# Generate jobs for testing CoCo on non-TEE environments
run-k8s-tests-coco-nontee:
strategy:
@@ -212,6 +254,8 @@ jobs:
KBS_INGRESS: "aks"
KUBERNETES: "vanilla"
PULL_TYPE: ${{ matrix.pull-type }}
AUTHENTICATED_IMAGE_USER: ${{ secrets.AUTHENTICATED_IMAGE_USER }}
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
SNAPSHOTTER: ${{ matrix.snapshotter }}
USING_NFD: "false"
steps:

View File

@@ -33,6 +33,7 @@ jobs:
- clh
- dragonball
- qemu
- qemu-runtime-rs
include:
- host_os: cbl-mariner
vmm: clh

View File

@@ -34,6 +34,10 @@ jobs:
- k0s
- k3s
- rke2
# TODO: There are a couple of vmm/k8s combination failing (https://github.com/kata-containers/kata-containers/issues/9854)
# and we will put the entire kata-deploy-tests on GARM on maintenance.
# TODO: Transition to free runner (see #9940).
if: false
runs-on: garm-ubuntu-2004-smaller
env:
DOCKER_REGISTRY: ${{ inputs.registry }}

View File

@@ -15,6 +15,8 @@ on:
jobs:
run-monitor:
# TODO: Transition to free runner (see #9940).
if: false
strategy:
fail-fast: false
matrix:
@@ -23,13 +25,18 @@ jobs:
container_engine:
- crio
- containerd
include:
# TODO: enable when https://github.com/kata-containers/kata-containers/issues/9853 is fixed
#include:
# - container_engine: containerd
# containerd_version: lts
exclude:
# TODO: enable with containerd when https://github.com/kata-containers/kata-containers/issues/9761 is fixed
- container_engine: containerd
containerd_version: lts
vmm: qemu
runs-on: garm-ubuntu-2204-smaller
env:
CONTAINER_ENGINE: ${{ matrix.container_engine }}
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
#CONTAINERD_VERSION: ${{ matrix.containerd_version }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4

View File

@@ -15,6 +15,8 @@ on:
jobs:
run-runk:
# TODO: Transition to free runner (see #9940).
if: false
runs-on: garm-ubuntu-2204-smaller
env:
CONTAINERD_VERSION: lts

View File

@@ -40,6 +40,8 @@ jobs:
instance: ubuntu-20.04
build-checks-depending-on-kvm:
# TODO: Transition to free runner (see #9940).
if: false
runs-on: garm-ubuntu-2004-smaller
strategy:
fail-fast: false

View File

@@ -140,6 +140,7 @@ The table below lists the remaining parts of the project:
| [`trace-forwarder`](src/tools/trace-forwarder) | utility | Agent tracing helper. |
| [`runk`](src/tools/runk) | utility | Standard OCI container runtime based on the agent. |
| [`ci`](.github/workflows) | CI | Continuous Integration configuration files and scripts. |
| [`ocp-ci`](ci/openshift-ci/README.md) | CI | Continuous Integration configuration for the OpenShift pipelines. |
| [`katacontainers.io`](https://github.com/kata-containers/www.katacontainers.io) | Source for the [`katacontainers.io`](https://www.katacontainers.io) site. |
| [`Webhook`](tools/testing/kata-webhook/README.md) | utility | Example of a simple admission controller webhook to annotate pods with the Kata runtime class |

View File

@@ -1 +1 @@
3.5.0
3.7.0

View File

@@ -7,6 +7,6 @@
set -e
cidir=$(dirname "$0")
source "${cidir}/lib.sh"
source "${cidir}/../tests/common.bash"
run_docs_url_alive_check

View File

@@ -1,22 +0,0 @@
#!/usr/bin/env bash
#
# Copyright (c) 2019 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
set -e
cidir=$(dirname "$0")
source "${cidir}/lib.sh"
clone_tests_repo
new_goroot=/usr/local/go
pushd "${tests_repo_dir}"
# Force overwrite the current version of golang
[ -z "${GOROOT}" ] || rm -rf "${GOROOT}"
.ci/install_go.sh -p -f -d "$(dirname ${new_goroot})"
[ -z "${GOROOT}" ] || sudo ln -sf "${new_goroot}" "${GOROOT}"
go version
popd

View File

@@ -23,11 +23,11 @@ workdir="$(mktemp -d --tmpdir build-libseccomp.XXXXX)"
# Variables for libseccomp
libseccomp_version="${LIBSECCOMP_VERSION:-""}"
if [ -z "${libseccomp_version}" ]; then
libseccomp_version=$(get_from_kata_deps "externals.libseccomp.version")
libseccomp_version=$(get_from_kata_deps ".externals.libseccomp.version")
fi
libseccomp_url="${LIBSECCOMP_URL:-""}"
if [ -z "${libseccomp_url}" ]; then
libseccomp_url=$(get_from_kata_deps "externals.libseccomp.url")
libseccomp_url=$(get_from_kata_deps ".externals.libseccomp.url")
fi
libseccomp_tarball="libseccomp-${libseccomp_version}.tar.gz"
libseccomp_tarball_url="${libseccomp_url}/releases/download/v${libseccomp_version}/${libseccomp_tarball}"
@@ -36,11 +36,11 @@ cflags="-O2"
# Variables for gperf
gperf_version="${GPERF_VERSION:-""}"
if [ -z "${gperf_version}" ]; then
gperf_version=$(get_from_kata_deps "externals.gperf.version")
gperf_version=$(get_from_kata_deps ".externals.gperf.version")
fi
gperf_url="${GPERF_URL:-""}"
if [ -z "${gperf_url}" ]; then
gperf_url=$(get_from_kata_deps "externals.gperf.url")
gperf_url=$(get_from_kata_deps ".externals.gperf.url")
fi
gperf_tarball="gperf-${gperf_version}.tar.gz"
gperf_tarball_url="${gperf_url}/${gperf_tarball}"

View File

@@ -1,16 +0,0 @@
#!/usr/bin/env bash
# Copyright (c) 2019 Ant Financial
#
# SPDX-License-Identifier: Apache-2.0
#
set -e
cidir=$(dirname "$0")
source "${cidir}/lib.sh"
clone_tests_repo
pushd ${tests_repo_dir}
.ci/install_rust.sh ${1:-}
popd

View File

@@ -1,19 +0,0 @@
#!/usr/bin/env bash
#
# Copyright (c) 2018 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
set -e
cidir=$(dirname "$0")
vcdir="${cidir}/../src/runtime/virtcontainers/"
source "${cidir}/lib.sh"
export CI_JOB="${CI_JOB:-default}"
clone_tests_repo
if [ "${CI_JOB}" != "PODMAN" ]; then
echo "Install virtcontainers"
make -C "${vcdir}" && chronic sudo make -C "${vcdir}" install
fi

View File

@@ -5,6 +5,8 @@
# SPDX-License-Identifier: Apache-2.0
#
[ -n "$DEBUG" ] && set -o xtrace
# If we fail for any reason a message will be displayed
die() {
msg="$*"
@@ -16,7 +18,7 @@ die() {
# Install via binary download, as we may not have golang installed at this point
function install_yq() {
local yq_pkg="github.com/mikefarah/yq"
local yq_version=3.4.1
local yq_version=v4.40.7
local precmd=""
INSTALL_IN_GOPATH=${INSTALL_IN_GOPATH:-true}
@@ -36,7 +38,7 @@ function install_yq() {
fi
fi
fi
[ -x "${yq_path}" ] && [ "`${yq_path} --version`"X == "yq version ${yq_version}"X ] && return
[ -x "${yq_path}" ] && [ "`${yq_path} --version`"X == "yq (https://github.com/mikefarah/yq/) version ${yq_version}"X ] && return
read -r -a sysInfo <<< "$(uname -sm)"

View File

@@ -1,87 +0,0 @@
#
# Copyright (c) 2018 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
set -o nounset
GOPATH=${GOPATH:-${HOME}/go}
export kata_repo="github.com/kata-containers/kata-containers"
export kata_repo_dir="$GOPATH/src/$kata_repo"
export tests_repo="${tests_repo:-github.com/kata-containers/tests}"
export tests_repo_dir="$GOPATH/src/$tests_repo"
export branch="${target_branch:-main}"
# Clones the tests repository and checkout to the branch pointed out by
# the global $branch variable.
# If the clone exists and `CI` is exported then it does nothing. Otherwise
# it will clone the repository or `git pull` the latest code.
#
clone_tests_repo()
{
if [ -d "$tests_repo_dir" ]; then
[ -n "${CI:-}" ] && return
# git config --global --add safe.directory will always append
# the target to .gitconfig without checking the existence of
# the target, so it's better to check it before adding the target repo.
local sd="$(git config --global --get safe.directory ${tests_repo_dir} || true)"
if [ -z "${sd}" ]; then
git config --global --add safe.directory ${tests_repo_dir}
fi
pushd "${tests_repo_dir}"
git checkout "${branch}"
git pull
popd
else
git clone -q "https://${tests_repo}" "$tests_repo_dir"
pushd "${tests_repo_dir}"
git checkout "${branch}"
popd
fi
}
run_static_checks()
{
# Make sure we have the targeting branch
git remote set-branches --add origin "${branch}"
git fetch -a
bash "$kata_repo_dir/tests/static-checks.sh" "$@"
}
run_docs_url_alive_check()
{
# Make sure we have the targeting branch
git remote set-branches --add origin "${branch}"
git fetch -a
bash "$kata_repo_dir/tests/static-checks.sh" --docs --all "$kata_repo"
}
run_get_pr_changed_file_details()
{
# Make sure we have the targeting branch
git remote set-branches --add origin "${branch}"
git fetch -a
source "$kata_repo_dir/tests/common.bash"
get_pr_changed_file_details
}
# Check if the 1st argument version is greater than and equal to 2nd one
# Version format: [0-9]+ separated by period (e.g. 2.4.6, 1.11.3 and etc.)
#
# Parameters:
# $1 - a version to be tested
# $2 - a target version
#
# Return:
# 0 if $1 is greater than and equal to $2
# 1 otherwise
version_greater_than_equal() {
local current_version=$1
local target_version=$2
smaller_version=$(echo -e "$current_version\n$target_version" | sort -V | head -1)
if [ "${smaller_version}" = "${target_version}" ]; then
return 0
else
return 1
fi
}

149
ci/openshift-ci/README.md Normal file
View File

@@ -0,0 +1,149 @@
OpenShift CI
============
This directory contains scripts used by
[the OpenShift CI](https://github.com/openshift/release/tree/master/ci-operator/config/kata-containers/kata-containers)
pipelines to monitor selected functional tests on OpenShift.
There are 2 pipelines, history and logs can be accessed here:
* [main - currently supported OCP](https://prow.ci.openshift.org/job-history/gs/origin-ci-test/logs/periodic-ci-kata-containers-kata-containers-main-e2e-tests)
* [next - currently under development OCP](https://prow.ci.openshift.org/job-history/gs/origin-ci-test/logs/periodic-ci-kata-containers-kata-containers-main-next-e2e-tests)
Running openshift-tests on OCP with kata-containers manually
============================================================
To run openshift-tests (or other suites) with kata-containers one can use
the kata-webhook. To deploy everything you can mimic the CI pipeline by:
```bash
#!/bin/bash -e
# Setup your kubectl and check it's accessible by
kubectl nodes
# Deploy kata (set KATA_DEPLOY_IMAGE to override the default kata-deploy-ci:latest image)
./test.sh
# Deploy the webhook
KATA_RUNTIME=kata-qemu cluster/deploy_webhook.sh
```
This should ensure kata-containers as well as kata-webhook are installed and
working. Before running the openshift-tests it's (currently) recommended to
ignore some security features by:
```bash
#!/bin/bash -e
oc adm policy add-scc-to-group privileged system:authenticated system:serviceaccounts
oc adm policy add-scc-to-group anyuid system:authenticated system:serviceaccounts
oc label --overwrite ns default pod-security.kubernetes.io/enforce=privileged pod-security.kubernetes.io/warn=baseline pod-security.kubernetes.io/audit=baseline
```
Now you should be ready to run the openshift-tests. Our CI only uses a subset
of tests, to get the current ``TEST_SKIPS`` see
[the pipeline config](https://github.com/openshift/release/tree/master/ci-operator/config/kata-containers/kata-containers).
Following steps require the [openshift tests](https://github.com/openshift/origin)
being cloned and built in the current directory:
```bash
#!/bin/bash -e
# Define tests to be skipped (see the pipeline config for the current version)
TEST_SKIPS="\[sig-node\] Security Context should support seccomp runtime/default\|\[sig-node\] Variable Expansion should allow substituting values in a volume subpath\|\[k8s.io\] Probing container should be restarted with a docker exec liveness probe with timeout\|\[sig-node\] Pods Extended Pod Container lifecycle evicted pods should be terminal\|\[sig-node\] PodOSRejection \[NodeConformance\] Kubelet should reject pod when the node OS doesn't match pod's OS\|\[sig-network\].*for evicted pods\|\[sig-network\].*HAProxy router should override the route\|\[sig-network\].*HAProxy router should serve a route\|\[sig-network\].*HAProxy router should serve the correct\|\[sig-network\].*HAProxy router should run\|\[sig-network\].*when FIPS.*the HAProxy router\|\[sig-network\].*bond\|\[sig-network\].*all sysctl on whitelist\|\[sig-network\].*sysctls should not affect\|\[sig-network\] pods should successfully create sandboxes by adding pod to network"
# Get the list of tests to be executed
TESTS="$(./openshift-tests run --dry-run --provider "${TEST_PROVIDER}" "${TEST_SUITE}")"
# Store the list of tests in /tmp/tsts file
echo "${TESTS}" | grep -v "$TEST_SKIPS" > /tmp/tsts
# Remove previously-existing temporarily files as well as previous results
OUT=RESULTS/tmp
rm -Rf /tmp/*test* /tmp/e2e-*
rm -R $OUT
mkdir -p $OUT
# Run the tests ignoring the monitor health checks
./openshift-tests run --provider azure -o "$OUT/job.log" --junit-dir "$OUT" --file /tmp/tsts --max-parallel-tests 5 --cluster-stability Disruptive --run '^\[sig-node\].*|^\[sig-network\]'
```
[!NOTE]
Note we are ignoring the cluster stability checks because our public cloud is
not that stable and running with VMs instead of containers results in minor
stability issues. Some of the old monitor stability tests do not reflect
the ``--cluster-stability`` setting, one should simply ignore these. If you
get a message like ``invariant was violated`` or ``error: failed due to a
MonitorTest failure``, it's usually an indication that only those kind of
tests failed but the real tests passed. See
[wrapped-openshift-tests.sh](https://github.com/openshift/release/blob/master/ci-operator/config/kata-containers/kata-containers/wrapped-openshift-tests.sh)
for details how our pipeline deals with that.
[!TIP]
To compare multiple results locally one can use
[junit2html](https://github.com/inorton/junit2html) tool.
Best-effort kata-containers cleanup
===================================
If you need to cleanup the cluster after testing, you can use the
``cleanup.sh`` script from the current directory. It tries to delete all
resources created by ``test.sh`` as well as ``cluster/deploy_webhook.sh``
ignoring all failures. The primary purpose of this script is to allow
soft-cleanup after deployment to test different versions without
re-provisioning everything.
[!WARNING]
Do not rely on this script in production, return codes are not checked!**
Bisecting e2e tests failures
============================
Let's say the OCP pipeline passed running with
``quay.io/kata-containers/kata-deploy-ci:kata-containers-d7afd31fd40e37a675b25c53618904ab57e74ccd-amd64``
but failed running with
``quay.io/kata-containers/kata-deploy-ci:kata-containers-9f512c016e75599a4a921bd84ea47559fe610057-amd64``
and you'd like to know which PR caused the regression. You can either run with
all the 60 tags between or you can utilize the [bisecter](https://github.com/ldoktor/bisecter)
to optimize the number of steps in between.
Before running the bisection you need a reproducer script. Sample one called
``sample-test-reproducer.sh`` is provided in this directory but you might
want to copy and modify it, especially:
* ``OCP_DIR`` - directory where your openshift/release is located (can be exported)
* ``E2E_TEST`` - openshift-test(s) to be executed (can be exported)
* behaviour of SETUP (returning 125 skips the current image tag, returning
>=128 interrupts the execution, everything else reports the tag as failure
* what should be executed (perhaps running the setup is enough for you or
you might want to be looking for specific failures...)
* use ``timeout`` to interrupt execution in case you know things should be faster
Executing that script with the GOOD commit should pass
``./sample-test-reproducer.sh quay.io/kata-containers/kata-deploy-ci:kata-containers-d7afd31fd40e37a675b25c53618904ab57e74ccd-amd64``
and fail when executed with the BAD commit
``./sample-test-reproducer.sh quay.io/kata-containers/kata-deploy-ci:kata-containers-9f512c016e75599a4a921bd84ea47559fe610057-amd64``.
To get the list of all tags in between those two PRs you can use the
``bisect-range.sh`` script
```bash
./bisect-range.sh d7afd31fd40e37a675b25c53618904ab57e74ccd 9f512c016e75599a4a921bd84ea47559fe610057
```
[!NOTE]
The tagged images are only built per PR, not for individual commits. See
[kata-deploy-ci](https://quay.io/kata-containers/kata-deploy-ci) to see the
available images.
To find out which PR caused this regression, you can either manually try the
individual commits or you can simply execute:
```bash
bisecter start "$(./bisect-range.sh d7afd31fd40 9f512c016)"
OCP_DIR=/path/to/openshift/release bisecter run ./sample-test-reproducer.sh
```
[!NOTE]
If you use ``KATA_WITH_SYSTEM_QEMU=yes`` you might want to deploy once with
it and skip it for the cleanup. That way you might (in most cases) test
all images with a single MCP update instead of per-image MCP update.
[!TIP]
You can check the bisection progress during/after execution by running
``bisecter log`` from the current directory. Before starting a new
bisection you need to execute ``bisecter reset``.

24
ci/openshift-ci/bisect-range.sh Executable file
View File

@@ -0,0 +1,24 @@
#!/bin/bash
# Copyright (c) 2024 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
if [ "$#" -gt 2 ] || [ "$#" -lt 1 ] ; then
echo "Usage: $0 GOOD [BAD]"
echo "Prints list of available kata-deploy-ci tags between GOOD and BAD commits (by default BAD is the latest available tag)"
exit 255
fi
GOOD="$1"
[ -n "$2" ] && BAD="$2"
ARCH=amd64
REPO="quay.io/kata-containers/kata-deploy-ci"
TAGS=$(skopeo list-tags "docker://$REPO")
# Only amd64
TAGS=$(echo "$TAGS" | jq '.Tags' | jq "map(select(endswith(\"$ARCH\")))" | jq -r '.[]')
# Tags since $GOOD
TAGS=$(echo "$TAGS" | sed -n -e "/$GOOD/,$$p")
# Tags up to $BAD
[ -n "$BAD" ] && TAGS=$(echo "$TAGS" | sed "/$BAD/q")
# Comma separated tags with repo
echo "$TAGS" | sed -e "s@^@$REPO:@" | paste -s -d, -

View File

@@ -25,6 +25,10 @@ WORKAROUND_9206_CRIO=${WORKAROUND_9206_CRIO:-no}
# Ignore errors as we want best-effort-approach here
trap - ERR
# Delete webhook resources
oc delete -f "${scripts_dir}/../../tools/testing/kata-webhook/deploy"
oc delete -f "${scripts_dir}/cluster/deployments/configmap_kata-webhook.yaml.in"
# Delete potential smoke-test resources
oc delete -f "${scripts_dir}/smoke/service.yaml"
oc delete -f "${scripts_dir}/smoke/service_kubernetes.yaml"

View File

@@ -4,7 +4,7 @@
#
# This is the build root image for Kata Containers on OpenShift CI.
#
FROM quay.io/centos/centos:stream8
FROM quay.io/centos/centos:stream9
RUN yum -y update && \
yum -y install \

View File

@@ -15,7 +15,9 @@ pod='http-server'
# Create a pod.
#
info "Creating the ${pod} pod"
oc apply -f ${script_dir}/smoke/${pod}.yaml || \
[ -z "$KATA_RUNTIME" ] && die "Please set the KATA_RUNTIME first"
envsubst < "${script_dir}/smoke/${pod}.yaml.in" | \
oc apply -f - || \
die "failed to create ${pod} pod"
# Check it eventually goes to 'running'

View File

@@ -0,0 +1,50 @@
#!/bin/bash
# Copyright (c) 2024 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# A sample script to deploy, configure, run E2E_TEST and soft-cleanup
# afterwards OCP cluster using kata-containers primarily created for use
# with https://github.com/ldoktor/bisecter
[ "$#" -ne 1 ] && echo "Provide image as the first and only argument" && exit 255
export KATA_DEPLOY_IMAGE="$1"
OCP_DIR="${OCP_DIR:-/path/to/your/openshift/release/}"
E2E_TEST="${E2E_TEST:-'"[sig-node] Container Runtime blackbox test on terminated container should report termination message as empty when pod succeeds and TerminationMessagePolicy FallbackToLogsOnError is set [NodeConformance] [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]"'}"
KATA_CI_DIR="${KATA_CI_DIR:-$(pwd)}"
export KATA_RUNTIME="${KATA_RUNTIME:-kata-qemu}"
## SETUP
# Deploy kata
SETUP=0
pushd "$KATA_CI_DIR" || { echo "Failed to cd to '$KATA_CI_DIR'"; exit 255; }
./test.sh || SETUP=125
cluster/deploy_webhook.sh || SETUP=125
if [ $SETUP != 0 ]; then
./cleanup.sh
exit "$SETUP"
fi
popd || true
# Disable security
oc adm policy add-scc-to-group privileged system:authenticated system:serviceaccounts
oc adm policy add-scc-to-group anyuid system:authenticated system:serviceaccounts
oc label --overwrite ns default pod-security.kubernetes.io/enforce=privileged pod-security.kubernetes.io/warn=baseline pod-security.kubernetes.io/audit=baseline
## TEST EXECUTION
# Run the testing
pushd "$OCP_DIR" || { echo "Failed to cd to '$OCP_DIR'"; exit 255; }
echo "$E2E_TEST" > /tmp/tsts
# Remove previously-existing temporarily files as well as previous results
OUT=RESULTS/tmp
rm -Rf /tmp/*test* /tmp/e2e-*
rm -R $OUT
mkdir -p $OUT
# Run the tests ignoring the monitor health checks
./openshift-tests run --provider azure -o "$OUT/job.log" --junit-dir "$OUT" --file /tmp/tsts --max-parallel-tests 5 --cluster-stability Disruptive
RET=$?
popd || true
## CLEANUP
./cleanup.sh
exit "$RET"

View File

@@ -27,4 +27,4 @@ spec:
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
runtimeClassName: kata-qemu
runtimeClassName: ${KATA_RUNTIME}

View File

@@ -5,6 +5,9 @@
# SPDX-License-Identifier: Apache-2.0
#
# The kata shim to be used
export KATA_RUNTIME=${KATA_RUNTIME:-kata-qemu}
script_dir=$(dirname $0)
source ${script_dir}/lib.sh

View File

@@ -1,21 +0,0 @@
#!/usr/bin/env bash
#
# Copyright (c) 2019 Ant Financial
#
# SPDX-License-Identifier: Apache-2.0
#
set -e
cidir=$(dirname "$0")
source "${cidir}/lib.sh"
export CI_JOB="${CI_JOB:-}"
clone_tests_repo
pushd ${tests_repo_dir}
.ci/run.sh
# temporary fix, see https://github.com/kata-containers/tests/issues/3878
if [ "$(uname -m)" != "s390x" ] && [ "$CI_JOB" == "CRI_CONTAINERD_K8S_MINIMAL" ]; then
tracing/test-agent-shutdown.sh
fi
popd

View File

@@ -1,16 +0,0 @@
#!/usr/bin/env bash
#
# Copyright (c) 2018 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
set -e
cidir=$(dirname "$0")
source "${cidir}/lib.sh"
clone_tests_repo
pushd "${tests_repo_dir}"
.ci/setup.sh
popd

View File

@@ -7,6 +7,6 @@
set -e
cidir=$(dirname "$0")
source "${cidir}/lib.sh"
source "${cidir}/../tests/common.bash"
run_static_checks "${@:-github.com/kata-containers/kata-containers}"

View File

@@ -0,0 +1,86 @@
# Blog Post Contributor Guide
This section describes the guidelines for contributing new blog posts to the
Kata Containers website.
## Share your stories on the Kata Containers website
Are you experimenting with Kata Containers or have it deployed in production and
would like to share your story as a case study? Do you have a use case that
Kata Containers can make more secure, but the world doesn't know it yet? Do you
have features in the runtime that you like and would like to highlight? Do you
have a Kata Containers demo that you would like to draw attention to?
Share your Kata Containers story on the [Kata Containers blog](https://www.katacontainers.io/blog/)!
You are only a few steps away...
### Kata Containers website source
Like the rest of the Kata Containers artifacts, the projects website code and
content are stored in a [GitHub repository](https://github.com/kata-containers/www.katacontainers.io).
The blog posts are written using markdown language that is mainly plain text
with a few easy formatting conventions to create lists, add images or code blocks,
or format the text.
You can find many [cheat sheets](https://www.markdownguide.org/cheat-sheet/)
floating on the web to get in terms of the basic syntax. You can also check the
[source files of the already existing blog posts](https://github.com/kata-containers/www.katacontainers.io/tree/main/src/pages/blog),
where you will find examples of all the basic items that you will need for your
new entry.
### Create a new blog post
When you create a new blog post, you need to create a new file in the
[`src/pages/blog/` folder](https://github.com/kata-containers/www.katacontainers.io/tree/main/src/pages/blog)
with a `.md` extension.
The markdown file has a few formatting conventions in its header to capture the
title, author, publishing date and category of your blog post.
The header looks like the following:
```
---
templateKey: blog-post
title: The Title of Your Amazing Blog Post
author: Your Name
date: 2021-01-28T16:23:52.741Z
category:
- value: category-6-wjkXzEM2
label: Features & Updates
---
```
The categories give the possibility to filter on the web page and see only the
blog posts that fall under one of the options. You can choose from the
following options:
* News & Announcements
* Features & Updates
The `Annual Report` category is reserved for the Kata Containers chapter in the
Open Infrastructure Annual report that we are also re-posting on the Kata
Containers website.
Once you filled out the above fields in the header and got your one-liner all
set, you can go ahead and type up the contents of your blog post using the
conventional markdown formatting.
If you have an image file to add, you need to place the file in the
`static/img` folder.
You can then insert the image into your blog post by using the following line:
```
![alt text](/img/the-file-name-of-your-image.jpg)
```
Once you are done with formatting your blog post and happy with the content, you
need to upload it to GitHub and create a pull request. You can do that by using
git commands on your laptop or you can also use the GitHub web interface to add
files to the repository and create a pull request when you are ready.
If you have an idea for a blog post and would like to get feedback from the
community about it or have any questions about the process, please reach out
on one of the community's [communication channels](https://katacontainers.io/community/).

View File

@@ -461,7 +461,7 @@ and repository utilized can be found by looking at the [versions file](../versio
Find the correct version of QEMU from the versions file:
```bash
$ source kata-containers/tools/packaging/scripts/lib.sh
$ qemu_version="$(get_from_kata_deps "assets.hypervisor.qemu.version")"
$ qemu_version="$(get_from_kata_deps ".assets.hypervisor.qemu.version")"
$ echo "${qemu_version}"
```
Get source from the matching branch of QEMU:

View File

@@ -50,6 +50,7 @@ Documents that help to understand and contribute to Kata Containers.
* [Developer Guide](Developer-Guide.md): Setup the Kata Containers developing environments
* [How to contribute to Kata Containers](https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md)
* [Code of Conduct](../CODE_OF_CONDUCT.md)
* [How to submit a blog post](Blog-Post-Submission-Guide.md)
## Help Writing a Code PR

View File

@@ -32,7 +32,7 @@ For virtio-fs, the [runtime](README.md#runtime) starts one `virtiofsd` daemon
## Devicemapper
The
[devicemapper `snapshotter`](https://github.com/containerd/containerd/tree/main/snapshots/devmapper)
[devicemapper `snapshotter`](https://github.com/containerd/containerd/blob/main/docs/snapshotters/devmapper.md)
is a special case. The `snapshotter` uses dedicated block devices
rather than formatted filesystems, and operates at the block level
rather than the file level. This knowledge is used to directly use the

View File

@@ -40,7 +40,7 @@ use `RuntimeClass` instead of the deprecated annotations.
### Containerd Runtime V2 API: Shim V2 API
The [`containerd-shim-kata-v2` (short as `shimv2` in this documentation)](../../src/runtime/cmd/containerd-shim-kata-v2/)
implements the [Containerd Runtime V2 (Shim API)](https://github.com/containerd/containerd/tree/main/runtime/v2) for Kata.
implements the [Containerd Runtime V2 (Shim API)](https://github.com/containerd/containerd/tree/main/core/runtime/v2) for Kata.
With `shimv2`, Kubernetes can launch Pod and OCI-compatible containers with one shim per Pod. Prior to `shimv2`, `2N+1`
shims (i.e. a `containerd-shim` and a `kata-shim` for each container and the Pod sandbox itself) and no standalone `kata-proxy`
process were used, even with VSOCK not available.
@@ -62,7 +62,7 @@ Follow the instructions to [install Kata Containers](../install/README.md).
> You do not need to install `cri` if you have containerd 1.1 or above. Just remove the `cri` plugin from the list of
> `disabled_plugins` in the containerd configuration file (`/etc/containerd/config.toml`).
Follow the instructions from the [CRI installation guide](https://github.com/containerd/containerd/blob/main/docs/cri/installation.md).
Follow the instructions from the [CRI installation guide](https://github.com/containerd/containerd/blob/main/docs/cri/crictl.md#install-crictl).
Then, check if `containerd` is now available:
@@ -132,9 +132,9 @@ The `RuntimeClass` is suggested.
The following configuration includes two runtime classes:
- `plugins.cri.containerd.runtimes.runc`: the runc, and it is the default runtime.
- `plugins.cri.containerd.runtimes.kata`: The function in containerd (reference [the document here](https://github.com/containerd/containerd/tree/main/runtime/v2#binary-naming))
- `plugins.cri.containerd.runtimes.kata`: The function in containerd (reference [the document here](https://github.com/containerd/containerd/tree/main/core/runtime/v2))
where the dot-connected string `io.containerd.kata.v2` is translated to `containerd-shim-kata-v2` (i.e. the
binary name of the Kata implementation of [Containerd Runtime V2 (Shim API)](https://github.com/containerd/containerd/tree/main/runtime/v2)).
binary name of the Kata implementation of [Containerd Runtime V2 (Shim API)](https://github.com/containerd/containerd/tree/main/core/runtime/v2)).
```toml
[plugins.cri.containerd]

View File

@@ -35,27 +35,23 @@ $ git clone -b "${nydus_snapshotter_version}" "${nydus_snapshotter_url}" "${nydu
2. Configure DaemonSet file
```bash
$ pushd "$nydus_snapshotter_install_dir"
$ yq write -i \
> misc/snapshotter/base/nydus-snapshotter.yaml \
> 'data.FS_DRIVER' \
> "proxy" --style=double
$ yq -i \
> '.data.FS_DRIVER = "proxy"' -P \
> misc/snapshotter/base/nydus-snapshotter.yaml
# Disable to read snapshotter config from configmap
$ yq write -i \
> misc/snapshotter/base/nydus-snapshotter.yaml \
> 'data.ENABLE_CONFIG_FROM_VOLUME' \
> "false" --style=double
$ yq -i \
> 'data.ENABLE_CONFIG_FROM_VOLUME = "false"' -P \
> misc/snapshotter/base/nydus-snapshotter.yaml
# Enable to run snapshotter as a systemd service
# (skip if you want to run nydus snapshotter as a standalone process)
$ yq write -i \
> misc/snapshotter/base/nydus-snapshotter.yaml \
> 'data.ENABLE_SYSTEMD_SERVICE' \
> "true" --style=double
$ yq -i \
> 'data.ENABLE_SYSTEMD_SERVICE = "true"' -P \
> misc/snapshotter/base/nydus-snapshotter.yaml
# Enable "runtime specific snapshotter" feature in containerd when configuring containerd for snapshotter
# (skip if you want to configure nydus snapshotter as a global snapshotter in containerd)
$ yq write -i \
> misc/snapshotter/base/nydus-snapshotter.yaml \
> 'data.ENABLE_RUNTIME_SPECIFIC_SNAPSHOTTER' \
> "true" --style=double
$ yq -i \
> 'data.ENABLE_RUNTIME_SPECIFIC_SNAPSHOTTER = "true"' -P \
> misc/snapshotter/base/nydus-snapshotter.yaml
```
3. Install `nydus snapshotter` as a DaemonSet

View File

@@ -44,8 +44,8 @@ $ popd
- Build a custom QEMU
```bash
$ source kata-containers/tools/packaging/scripts/lib.sh
$ qemu_url="$(get_from_kata_deps "assets.hypervisor.qemu-snp-experimental.url")"
$ qemu_tag="$(get_from_kata_deps "assets.hypervisor.qemu-snp-experimental.tag")"
$ qemu_url="$(get_from_kata_deps ".assets.hypervisor.qemu-snp-experimental.url")"
$ qemu_tag="$(get_from_kata_deps ".assets.hypervisor.qemu-snp-experimental.tag")"
$ git clone "${qemu_url}"
$ pushd qemu
$ git checkout "${qemu_tag}"
@@ -53,7 +53,14 @@ $ ./configure --enable-virtfs --target-list=x86_64-softmmu --enable-debug
$ make -j "$(nproc)"
$ popd
```
- Create cert-chain for SNP attestation ( using [snphost](https://github.com/virtee/snphost/blob/main/docs/snphost.1.adoc) )
```bash
$ git clone https://github.com/virtee/snphost.git && cd snphost/
$ cargo build
$ mkdir /tmp/certs
$ ./target/debug/snphost fetch vcek der /tmp/certs
$ ./target/debug/snphost import /tmp/certs /opt/snp/cert_chain.cert
```
### Kata Containers Configuration for SNP
The configuration file located at `/etc/kata-containers/configuration.toml` must be adapted as follows to support SNP-VMs:
@@ -100,6 +107,10 @@ sev_snp_guest = true
- Configure an OVMF (add path)
```toml
firmware = "/path/to/kata-containers/tools/packaging/static-build/ovmf/opt/kata/share/ovmf/OVMF.fd"
```
- SNP attestation (add cert-chain to default path or add the path with cert-chain)
```toml
snp_certs_path = "/path/to/cert-chain"
```
## Test Kata Containers with Containerd

View File

@@ -202,11 +202,6 @@ attributes of each environment (local and CI):
- The hardware architecture.
- Number (and spec) of the CPUs.
## Gotchas (part 3)
If in doubt, look at the
["test artifacts" attached to the failing CI test](http://jenkins.katacontainers.io).
## Before raising a PR
- Remember to check that the test runs locally:

View File

@@ -1,95 +1,157 @@
# Kata Containers threat model
This document discusses threat models associated with the Kata Containers project.
Kata was designed to provide additional isolation of container workloads, protecting
the host infrastructure from potentially malicious container users or workloads. Since
Kata Containers adds a level of isolation on top of traditional containers, the focus
is on the additional layer provided, not on traditional container security.
This document discusses threat models associated with the Kata Containers
project. Kata was designed to provide additional isolation of container
workloads, protecting the host infrastructure from potentially malicious
container users or workloads. Since Kata Containers adds a level of isolation on
top of traditional containers, the focus is on the additional layer provided,
not on traditional container security.
This document provides a brief background on containers and layered security, describes
the interface to Kata from CRI runtimes, a review of utilized virtual machine interfaces, and then
a review of threats.
This document provides a brief background on containers and layered security,
describes the interface to Kata from CRI runtimes, a review of utilized virtual
machine interfaces, and then a review of threats.
## Kata security objective
Kata seeks to prevent an untrusted container workload or user of that container workload to gain
control of, obtain information from, or tamper with the host infrastructure.
Kata seeks to prevent an untrusted container workload or user of that container
workload to gain control of, obtain information from, or tamper with the host
infrastructure.
In our scenario, an asset is anything on the host system, or elsewhere in the cluster
infrastructure. The attacker is assumed to be either a malicious user or the workload itself
running within the container. The goal of Kata is to prevent attacks which would allow
any access to the defined assets.
In our scenario, an asset is anything on the host system, or elsewhere in the
cluster infrastructure. The attacker is assumed to be either a malicious user or
the workload itself running within the container. The goal of Kata is to prevent
attacks which would allow any access to the defined assets.
## Background on containers, layered security
Traditional containers leverage several key Linux kernel features to provide isolation and
a view that the container workload is the only entity running on the host. Key features include
`Namespaces`, `cgroups`, `capablities`, `SELinux` and `seccomp`. The canonical runtime for creating such
a container is `runc`. In the remainder of the document, the term `traditional-container` will be used
to describe a container workload created by runc.
Traditional containers leverage several key Linux kernel features to provide
isolation and a view that the container workload is the only entity running on
the host. Key features include `Namespaces`, `cgroups`, `capablities`, `SELinux`
and `seccomp`. The canonical runtime for creating such a container is `runc`. In
the remainder of the document, the term `traditional-container` will be used to
describe a container workload created by runc.
Kata Containers provides a second layer of isolation on top of those provided by traditional-containers.
The hardware virtualization interface is the basis of this additional layer. Kata launches a lightweight
virtual machine, and uses the guests Linux kernel to create a container workload, or workloads in the case
of multi-container pods. In Kubernetes and in the Kata implementation, the sandbox is carried out at the
pod level. In Kata, this sandbox is created using a virtual machine.
Kata Containers provides a second layer of isolation on top of those provided by
traditional-containers. The hardware virtualization interface is the basis of
this additional layer. Kata launches a lightweight virtual machine, and uses the
guests Linux kernel to create a container workload, or workloads in the case of
multi-container pods. In Kubernetes and in the Kata implementation, the sandbox
is carried out at the pod level. In Kata, this sandbox is created using a
virtual machine.
## Interface to Kata Containers: CRI, v2-shim, OCI
A typical Kata Containers deployment uses Kubernetes with a CRI implementation.
On every node, Kubelet will interact with a CRI implementor, which will in turn interface with
an OCI based runtime, such as Kata Containers. Typical CRI implementors are `cri-o` and `containerd`.
On every node, Kubelet will interact with a CRI implementor, which will in turn
interface with an OCI based runtime, such as Kata Containers. Typical CRI
implementors are `cri-o` and `containerd`.
The CRI API, as defined at the Kubernetes [CRI-API repo](https://github.com/kubernetes/cri-api/),
results in a few constructs being supported by the CRI implementation, and ultimately in the OCI
runtime creating the workloads.
The CRI API, as defined at the Kubernetes [CRI-API
repo](https://github.com/kubernetes/cri-api/), results in a few constructs being
supported by the CRI implementation, and ultimately in the OCI runtime creating
the workloads.
In order to run a container inside of the Kata sandbox, several virtual machine devices and interfaces
are required. Kata translates sandbox and container definitions to underlying virtualization technologies provided
by a set of virtual machine monitors (VMMs) and hypervisors. These devices and their underlying
implementations are discussed in detail in the following section.
In order to run a container inside of the Kata sandbox, several virtual machine
devices and interfaces are required. Kata translates sandbox and container
definitions to underlying virtualization technologies provided by a set of
virtual machine monitors (VMMs) and hypervisors. These devices and their
underlying implementations are discussed in detail in the following section.
## Interface to the Kata sandbox/virtual machine
In case of Kata, today the devices which we need in the guest are:
- Storage: In the current design of Kata Containers, we are reliant on the CRI implementor to
assist in image handling and volume management on the host. As a result, we need to support a way of passing to the sandbox the container rootfs, volumes requested
by the workload, and any other volumes created to facilitate sharing of secrets and `configmaps` with the containers. Depending on how these are managed, a block based device or file-system
sharing is required. Kata Containers does this by way of `virtio-blk` and/or `virtio-fs`.
- Networking: A method for enabling network connectivity with the workload is required. Typically this will be done providing a `TAP` device
to the VMM, and this will be exposed to the guest as a `virtio-net` device. It is feasible to pass in a NIC device directly, in which case `VFIO` is leveraged
and the device itself will be exposed to the guest.
- Control: In order to interact with the guest agent and retrieve `STDIO` from containers, a medium of communication is required.
This is available via `virtio-vsock`.
- Devices: `VFIO` is utilized when devices are passed directly to the virtual machine and exposed to the container.
- Dynamic Resource Management: `ACPI` is utilized to allow for dynamic VM resource management (for example: CPU, memory, device hotplug). This is required when containers are resized,
or more generally when containers are added to a pod.
- Storage: In the current design of Kata Containers, we are reliant on the CRI
implementor to assist in image handling and volume management on the host. As a
result, we need to support a way of passing to the sandbox the container
rootfs, volumes requested by the workload, and any other volumes created to
facilitate sharing of secrets and `configmaps` with the containers. Depending
on how these are managed, a block based device or file-system sharing is
required. Kata Containers does this by way of `virtio-blk` and/or `virtio-fs`.
- Networking: A method for enabling network connectivity with the workload is
required. Typically this will be done providing a `TAP` device to the VMM, and
this will be exposed to the guest as a `virtio-net` device. It is feasible to
pass in a NIC device directly, in which case `VFIO` is leveraged and the device
itself will be exposed to the guest.
- Control: In order to interact with the guest agent and retrieve `STDIO` from
containers, a medium of communication is required. This is available via
`virtio-vsock`.
- Devices: `VFIO` is utilized when devices are passed directly to the virtual
machine and exposed to the container.
- Dynamic Resource Management: `ACPI` is utilized to allow for dynamic VM
resource management (for example: CPU, memory, device hotplug). This is
required when containers are resized, or more generally when containers are
added to a pod.
How these devices are utilized varies depending on the VMM utilized. We clarify the default settings provided when integrating Kata
with the QEMU, Firecracker and Cloud Hypervisor VMMs in the following sections.
How these devices are utilized varies depending on the VMM utilized. We clarify
the default settings provided when integrating Kata with the QEMU, Dragonball,
Firecracker and Cloud Hypervisor VMMs in the following sections.
### Virtual Machine Monitor(s)
In a KVM/QEMU (any other VMM utilizing KVM) virtualization setup, all virtual
machines (VMs) share the same host kernel. This shared environment can lead to
scenarios where one VM could potentially impact the performance or stability of
other VMs, including the possibility of a Denial of Service attack.
- Kernel Vulnerabilities: Since all VMs rely on the host's kernel, a
vulnerability in the kernel could be exploited by a process running within one
VM to affect the entire system. This could lead to scenarios where the
compromised VM impacts other VMs or even takes down the host.
- Improper Isolation and Containment: If the virtualization environment is not
correctly configured, processes in one VM might impact other VMs. This could
occur through improper isolation of network traffic, shared file systems, or
other inter-VM communication channels.
- Hypervisor Vulnerabilities: Flaws in the KVM hypervisor or QEMU could be
exploited to cause information disclosure, data tampering, elevation of
privileges, denial of service, and others. Since KVM/QEMU leverages the host
kernel for its operation, any exploit at this level can have widespread impacts.
- Malicious or Flawed Guest Operating Systems: A guest operating system that is
maliciously designed or has serious flaws could engage in activities that
disrupt the normal operation of the host or other guests. This might include
aggressive network activity or interactions with the virtualization stack that
lead to instability.
- Resource Exhaustion: A VM could consume excessive shared resources such as
CPU, memory, or I/O bandwidth, leading to resource starvation for other VMs.
This could be due to misconfiguration, a runaway process, or a deliberate
denial of service attack from a compromised VM.
### Devices
Each virtio device is implemented by a backend, which may execute within userspace on the host (vhost-user), the VMM itself, or within the host kernel (vhost). While it may provide enhanced performance,
vhost devices are often seen as higher risk since an exploit would be already running within the kernel space. While VMM and vhost-user are both in userspace on the host, `vhost-user` generally allows for the back-end process to require less system calls and capabilities compared to a full VMM.
Each virtio device is implemented by a backend, which may execute within
userspace on the host (vhost-user), the VMM itself, or within the host kernel
(vhost). While it may provide enhanced performance, vhost devices are often seen
as higher risk since an exploit would be already running within the kernel
space. While VMM and vhost-user are both in userspace on the host, `vhost-user`
generally allows for the back-end process to require less system calls and
capabilities compared to a full VMM.
#### `virtio-blk` and `virtio-scsi`
The backend for `virtio-blk` and `virtio-scsi` are based in the VMM itself (ring3 in the context of x86) by default for Cloud Hypervisor, Firecracker and QEMU.
While `vhost` based back-ends are available for QEMU, it is not recommended. `vhost-user` back-ends are being added for Cloud Hypervisor, they are not utilized in Kata today.
The backend for `virtio-blk` and `virtio-scsi` are based in the VMM itself
(ring3 in the context of x86) by default for Cloud Hypervisor, Firecracker and
QEMU. While `vhost` based back-ends are available for QEMU, it is not
recommended. `vhost-user` back-ends are being added for Cloud Hypervisor, they
are not utilized in Kata today.
#### `virtio-fs`
`virtio-fs` is supported in Cloud Hypervisor and QEMU. `virtio-fs`'s interaction with the host filesystem is done through a vhost-user daemon, `virtiofsd`.
The `virtio-fs` client, running in the guest, will generate requests to access files. `virtiofsd` will receive requests, open the file, and request the VMM
to `mmap` it into the guest. When DAX is utilized, the guest will access the host's page cache, avoiding the need for copy and duplication. DAX is still an experimental feature,
and is not enabled by default.
`virtio-fs` is supported in Cloud Hypervisor and QEMU. `virtio-fs`'s interaction
with the host filesystem is done through a vhost-user daemon, `virtiofsd`. The
`virtio-fs` client, running in the guest, will generate requests to access
files. `virtiofsd` will receive requests, open the file, and request the VMM to
`mmap` it into the guest. When DAX is utilized, the guest will access the host's
page cache, avoiding the need for copy and duplication. DAX is still an
experimental feature, and is not enabled by default.
From the `virtiofsd` [documentation](https://gitlab.com/virtio-fs/virtiofsd/-/blob/main/README.md):
```This program must be run as the root user. Upon startup the program will switch into a new file system namespace with the shared directory tree as its root. This prevents “file system escapes” due to symlinks and other file system objects that might lead to files outside the shared directory. The program also sandboxes itself using seccomp(2) to prevent ptrace(2) and other vectors that could allow an attacker to compromise the system after gaining control of the virtiofsd process.```
DAX-less support for `virtio-fs` is available as of the 5.4 Linux kernel. QEMU VMM supports virtio-fs as of v4.2. Cloud Hypervisor
supports `virtio-fs`.
DAX-less support for `virtio-fs` is available as of the 5.4 Linux kernel. QEMU
VMM supports virtio-fs as of v4.2. Cloud Hypervisor supports `virtio-fs`.
#### `virtio-net`
@@ -97,9 +159,9 @@ supports `virtio-fs`.
##### QEMU networking
While QEMU has options for `vhost`, `virtio-net` and `vhost-user`, the `virtio-net` backend
for Kata defaults to `vhost-net` for performance reasons. The default configuration is being
reevaluated.
While QEMU has options for `vhost`, `virtio-net` and `vhost-user`, the
`virtio-net` backend for Kata defaults to `vhost-net` for performance reasons.
The default configuration is being reevaluated.
##### Firecracker networking
@@ -107,8 +169,14 @@ For Firecracker, the `virtio-net` backend is within Firecracker's VMM.
##### Cloud Hypervisor networking
For Cloud Hypervisor, the current backend default is within the VMM. `vhost-user-net` support
is being added (written in rust, Cloud Hypervisor specific).
For Cloud Hypervisor, the current backend default is within the VMM.
`vhost-user-net` support is being added (written in rust, Cloud Hypervisor
specific).
##### Dragonball networking
For Dragonball, the `virtio-net` backend default is within Dragonbasll's VMM.
#### virtio-vsock
@@ -116,22 +184,88 @@ is being added (written in rust, Cloud Hypervisor specific).
In QEMU, vsock is backed by `vhost_vsock`, which runs within the kernel itself.
##### Firecracker and Cloud Hypervisor
##### Dragonball, Firecracker and Cloud Hypervisor
In Firecracker and Cloud Hypervisor, vsock is backed by a unix-domain-socket in the hosts userspace.
In Dragonball, Firecracker and Cloud Hypervisor, vsock is backed by a unix-domain-socket in
the hosts userspace.
#### VFIO
Utilizing VFIO, devices can be passed through to the virtual machine. We will assess this separately. Exposure to
host is limited to gaps in device pass-through handling. This is supported in QEMU and Cloud Hypervisor, but not
Firecracker.
Utilizing VFIO, devices can be passed through to the virtual machine. Exposure
to the host is limited to gaps in device pass-through handling. This is
supported in QEMU and Cloud Hypervisor, but not Firecracker.
- Device Isolation Failure: One of the primary risks associated with VFIO is the
failure to isolate the physical device. If a VM can affect the operation of the
physical device in a way that impacts other VMs or the host system, it could
lead to security breaches or system instability.
- DMA Attacks: Direct Memory Access (DMA) attacks are a significant concern with
VFIO. Since the device has direct access to the system's memory, there's a risk
that a compromised VM could use its assigned device to read or write memory
outside of its allocated space, potentially accessing sensitive information or
affecting the host or other VMs.
- Firmware Vulnerabilities: Devices attached via VFIO rely on their firmware,
which can have vulnerabilities. A compromised device firmware could be exploited
to gain unauthorized access or to disrupt the system. Resource Starvation:
Improperly managed, a VM with direct access to hardware resources could
monopolize those resources, leading to performance degradation or denial of
service for other VMs or the host system.
- Escalation of Privileges: If a VM with VFIO access is compromised, it could
potentially be used to gain higher privileges than intended, especially if the
I/O devices have capabilities that are not adequately controlled or monitored.
- Improper Configuration and Management: Human errors in configuring VFIO, such
as incorrect group or user permissions, can expose the system to risks.
Additionally, inadequate monitoring and management of the VMs and their devices
can lead to security lapses.
- Software Vulnerabilities: Like any software, the components of VFIO (like the
kernel modules, device drivers, and management tools) can have vulnerabilities
that might be exploited by an attacker to compromise the security of the system.
Inter-VM Interference and Side-Channel Attacks: Even with device assignment,
there could be side-channel attacks where an attacker VM infers sensitive
information from the physical device's behavior or through shared resources like
cache.
#### ACPI (Dragonball uses Upcall)
ACPI is necessary for hotplugging of CPU, memory and devices. ACPI is available
in QEMU and Cloud Hypervisor. Device, CPU and memory hotplug are not available
in Firecracker.
- Hypervisor Vulnerabilities: In virtualized environments, the hypervisor
manages ACPI calls for virtual machines (VMs). If the hypervisor has
vulnerabilities in handling ACPI requests, it could lead to escalated privileges
or other security breaches.
- VM Escape: A sophisticated attack could exploit ACPI functionality to achieve
a VM escape, where malicious code in a VM breaks out to the host system or other
VMs. Firmware Attacks in a Virtualized Context: Similar to physical
environments, firmware-based attacks (including those targeting ACPI) in
virtualized systems can be persistent and difficult to detect. In a virtualized
environment, such attacks might not only compromise the host system but also all
the VMs running on it.
- Resource Starvation Attacks: ACPI functionality could be exploited to
manipulate power management features, causing denial of service through
resource starvation. For example, an attacker could force a VM into a low-power
state, degrading its performance or availability.
- Compromised VMs Affecting Host ACPI Settings: If a VM is compromised, it might
be used to alter ACPI settings on the host, affecting all VMs on that host. This
could lead to various impacts, from performance degradation to system
instability.
- Supply Chain Risks: As with non-virtualized environments, the firmware,
including ACPI firmware used in virtualized environments, could be compromised
during the supply chain process, leading to vulnerabilities that affect all VMs
running on the hardware.
#### ACPI
ACPI is necessary for hotplug of CPU, memory and devices. ACPI is available in QEMU and Cloud Hypervisor. Device, CPU and memory hotplug
are not available in Firecracker.
## Devices and threat model
![Threat model](threat-model-boundaries.svg "threat-model")

View File

@@ -279,8 +279,8 @@ $ export KERNEL_EXTRAVERSION=$(awk '/^EXTRAVERSION =/{print $NF}' $GOPATH/$LINUX
$ export KERNEL_ROOTFS_DIR=${KERNEL_MAJOR_VERSION}.${KERNEL_PATHLEVEL}.${KERNEL_SUBLEVEL}${KERNEL_EXTRAVERSION}
$ cd $QAT_SRC
$ KERNEL_SOURCE_ROOT=$GOPATH/$LINUX_VER ./configure --enable-icp-sriov=guest
$ sudo -E make all -j $($(nproc ${CI:+--ignore 1}))
$ sudo -E make INSTALL_MOD_PATH=$ROOTFS_DIR qat-driver-install -j $($(nproc ${CI:+--ignore 1}))
$ sudo -E make all -j $(nproc)
$ sudo -E make INSTALL_MOD_PATH=$ROOTFS_DIR qat-driver-install -j $(nproc)
```
The `usdm_drv` module also needs to be copied into the rootfs modules path and

3741
src/agent/Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@@ -21,8 +21,8 @@ scopeguard = "1.0.0"
thiserror = "1.0.26"
regex = "1.10.4"
serial_test = "0.5.1"
oci-distribution = "0.10.0"
url = "2.5.0"
derivative = "2.2.0"
kata-sys-util = { path = "../libs/kata-sys-util" }
kata-types = { path = "../libs/kata-types" }
safe-path = { path = "../libs/safe-path" }
@@ -34,8 +34,8 @@ async-recursion = "0.3.2"
futures = "0.3.30"
# Async runtime
tokio = { version = "1.28.1", features = ["full"] }
tokio-vsock = "0.3.1"
tokio = { version = "1.38.0", features = ["full"] }
tokio-vsock = "0.3.4"
netlink-sys = { version = "0.7.0", features = ["tokio_socket"] }
rtnetlink = "0.8.0"
@@ -57,12 +57,7 @@ cfg-if = "1.0.0"
prometheus = { version = "0.13.0", features = ["process"] }
procfs = "0.12.0"
# anyhow is currently locked at 1.0.58 because:
# - Versions between 1.0.59 - 1.0.76 have not been tested yet using Kata CI.
# However, those versions are passing "make test" for the Kata Agent.
# - Versions 1.0.77 or newer fail during "make test" - see
# https://github.com/kata-containers/kata-containers/issues/9538
anyhow = "=1.0.58"
anyhow = "1"
cgroups = { package = "cgroups-rs", version = "0.3.3" }
@@ -81,11 +76,13 @@ strum = "0.26.2"
strum_macros = "0.26.2"
# Image pull/decrypt
image-rs = { git = "https://github.com/confidential-containers/guest-components", rev = "ca6b438", default-features = true, optional = true }
openssl = { version = "0.10.54", features = ["vendored"], optional = true }
image-rs = { git = "https://github.com/confidential-containers/guest-components", rev = "2c5ac6b01aafcb0be3875f5743c77d654a548146", default-features = false, optional = true }
# Agent Policy
regorus = { version = "0.1.4", default-features = false, features = ["arc", "regex"], optional = true }
regorus = { version = "0.1.4", default-features = false, features = [
"arc",
"regex",
], optional = true }
[dev-dependencies]
tempfile = "3.1.0"
@@ -106,7 +103,7 @@ default-pull = ["guest-pull"]
seccomp = ["rustjail/seccomp"]
standard-oci-runtime = ["rustjail/standard-oci-runtime"]
agent-policy = ["regorus"]
guest-pull = ["image-rs", "openssl"]
guest-pull = ["image-rs/kata-cc-rustls-tls"]
[[bin]]
name = "kata-agent"

View File

@@ -15,7 +15,7 @@ PROJECT_COMPONENT = kata-agent
TARGET = $(PROJECT_COMPONENT)
VERSION_FILE := ./VERSION
VERSION := $(shell grep -v ^\# $(VERSION_FILE))
VERSION := $(shell grep -v ^\# $(VERSION_FILE) 2>/dev/null || echo "unknown")
COMMIT_NO := $(shell git rev-parse HEAD 2>/dev/null || true)
COMMIT := $(if $(shell git status --porcelain --untracked-files=no 2>/dev/null || true),${COMMIT_NO}-dirty,${COMMIT_NO})
COMMIT_MSG = $(if $(COMMIT),$(COMMIT),unknown)
@@ -159,7 +159,7 @@ vendor:
#TARGET test: run cargo tests
test: $(GENERATED_FILES)
@cargo test --all --target $(TRIPLE) $(EXTRA_RUSTFEATURES) -- --nocapture
@RUST_LIB_BACKTRACE=0 cargo test --all --target $(TRIPLE) $(EXTRA_RUSTFEATURES) -- --nocapture
##TARGET check: run test
check: $(GENERATED_FILES) standard_rust_check

View File

@@ -125,9 +125,11 @@ The kata agent has the ability to configure agent options in guest kernel comman
| `agent.debug_console` | Debug console flag | Allow to connect guest OS running inside hypervisor Connect using `kata-runtime exec <sandbox-id>` | boolean | `false` |
| `agent.debug_console_vport` | Debug console port | Allow to specify the `vsock` port to connect the debugging console | integer | `0` |
| `agent.devmode` | Developer mode | Allow the agent process to coredump | boolean | `false` |
| `agent.guest_components_rest_api` | `api-server-rest` configuration | Select the features that the API Server Rest attestation component will run with. Valid values are `all`, `attestation`, `resource` | string | `resource` |
| `agent.guest_components_procs` | guest-components processes | Attestation-related processes that should be spawned as children of the guest. Valid values are `none`, `attestation-agent`, `confidential-data-hub` (implies `attestation-agent`), `api-server-rest` (implies `attestation-agent` and `confidential-data-hub`) | string | `api-server-rest` |
| `agent.hotplug_timeout` | Hotplug timeout | Allow to configure hotplug timeout(seconds) of block devices | integer | `3` |
| `agent.guest_components_rest_api` | `api-server-rest` configuration | Select the features that the API Server Rest attestation component will run with. Valid values are `all`, `attestation`, `resource`, or `none` to not launch the `api-server-rest` component | string | `resource` |
| `agent.https_proxy` | HTTPS proxy | Allow to configure `https_proxy` in the guest | string | `""` |
| `agent.image_registry_auth` | Image registry credential URI | The URI to where image-rs can find the credentials for pulling images from private registries e.g. `file:///root/.docker/config.json` to read from a file in the guest image, or `kbs:///default/credentials/test` to get the file from the KBS| string | `""` |
| `agent.log` | Log level | Allow the agent log level to be changed (produces more or less output) | string | `"info"` |
| `agent.log_vport` | Log port | Allow to specify the `vsock` port to read logs | integer | `0` |
| `agent.no_proxy` | NO proxy | Allow to configure `no_proxy` in the guest | string | `""` |

View File

@@ -30,8 +30,8 @@ cgroups = { package = "cgroups-rs", version = "0.3.3" }
rlimit = "0.5.3"
cfg-if = "0.1.0"
tokio = { version = "1.28.1", features = ["sync", "io-util", "process", "time", "macros", "rt", "fs"] }
tokio-vsock = "0.3.1"
tokio = { version = "1.38.0", features = ["sync", "io-util", "process", "time", "macros", "rt", "fs"] }
tokio-vsock = "0.3.4"
futures = "0.3.17"
async-trait = "0.1.31"
inotify = "0.9.2"

View File

@@ -601,6 +601,22 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
unistd::close(mount_fd)?;
}
if init {
// CreateContainer Hooks:
// before pivot_root after prestart, createruntime
state.pid = std::process::id() as i32;
state.status = oci::ContainerState::Created;
if let Some(hooks) = spec.hooks.as_ref() {
log_child!(
cfd_log,
"create_container hooks {:?}",
hooks.create_container
);
let mut create_container_states = HookStates::new();
create_container_states.execute_hooks(&hooks.create_container, Some(state.clone()))?;
}
}
if to_new.contains(CloneFlags::CLONE_NEWNS) {
// unistd::chroot(rootfs)?;
if no_pivot {

View File

@@ -200,15 +200,8 @@ impl Process {
}
pub async fn close_stdin(&mut self) {
// stdin will be closed automatically in passfd-io senario
if self.proc_io.is_some() {
return;
}
close_process_stream!(self, term_master, TermMaster);
close_process_stream!(self, parent_stdin, ParentStdin);
self.notify_term_close();
}
pub fn cleanup_process_stream(&mut self) {

150
src/agent/src/cdh.rs Normal file
View File

@@ -0,0 +1,150 @@
// Copyright (c) 2023 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
// Confidential Data Hub client wrapper.
// Confidential Data Hub is a service running inside guest to provide resource related APIs.
// https://github.com/confidential-containers/guest-components/tree/main/confidential-data-hub
use anyhow::Result;
use derivative::Derivative;
use protocols::{
sealed_secret, sealed_secret_ttrpc_async, sealed_secret_ttrpc_async::SealedSecretServiceClient,
};
use crate::CDH_SOCKET_URI;
// Nanoseconds
const CDH_UNSEAL_TIMEOUT: i64 = 50 * 1000 * 1000 * 1000;
const SEALED_SECRET_PREFIX: &str = "sealed.";
#[derive(Derivative)]
#[derivative(Clone, Debug)]
pub struct CDHClient {
#[derivative(Debug = "ignore")]
sealed_secret_client: SealedSecretServiceClient,
}
impl CDHClient {
pub fn new() -> Result<Self> {
let client = ttrpc::asynchronous::Client::connect(CDH_SOCKET_URI)?;
let sealed_secret_client =
sealed_secret_ttrpc_async::SealedSecretServiceClient::new(client);
Ok(CDHClient {
sealed_secret_client,
})
}
pub async fn unseal_secret_async(&self, sealed_secret: &str) -> Result<Vec<u8>> {
let mut input = sealed_secret::UnsealSecretInput::new();
input.set_secret(sealed_secret.into());
let unsealed_secret = self
.sealed_secret_client
.unseal_secret(ttrpc::context::with_timeout(CDH_UNSEAL_TIMEOUT), &input)
.await?;
Ok(unsealed_secret.plaintext)
}
pub async fn unseal_env(&self, env: &str) -> Result<String> {
if let Some((key, value)) = env.split_once('=') {
if value.starts_with(SEALED_SECRET_PREFIX) {
let unsealed_value = self.unseal_secret_async(value).await?;
let unsealed_env = format!("{}={}", key, std::str::from_utf8(&unsealed_value)?);
return Ok(unsealed_env);
}
}
Ok((*env.to_owned()).to_string())
}
}
#[cfg(test)]
#[cfg(feature = "sealed-secret")]
mod tests {
use crate::cdh::CDHClient;
use crate::cdh::CDH_ADDR;
use anyhow::anyhow;
use async_trait::async_trait;
use protocols::{sealed_secret, sealed_secret_ttrpc_async};
use std::sync::Arc;
use test_utils::skip_if_not_root;
use tokio::signal::unix::{signal, SignalKind};
struct TestService;
#[async_trait]
impl sealed_secret_ttrpc_async::SealedSecretService for TestService {
async fn unseal_secret(
&self,
_ctx: &::ttrpc::asynchronous::TtrpcContext,
_req: sealed_secret::UnsealSecretInput,
) -> ttrpc::error::Result<sealed_secret::UnsealSecretOutput> {
let mut output = sealed_secret::UnsealSecretOutput::new();
output.set_plaintext("unsealed".into());
Ok(output)
}
}
fn remove_if_sock_exist(sock_addr: &str) -> std::io::Result<()> {
let path = sock_addr
.strip_prefix("unix://")
.expect("socket address does not have the expected format.");
if std::path::Path::new(path).exists() {
std::fs::remove_file(path)?;
}
Ok(())
}
fn start_ttrpc_server() {
tokio::spawn(async move {
let ss = Box::new(TestService {})
as Box<dyn sealed_secret_ttrpc_async::SealedSecretService + Send + Sync>;
let ss = Arc::new(ss);
let ss_service = sealed_secret_ttrpc_async::create_sealed_secret_service(ss);
remove_if_sock_exist(CDH_ADDR).unwrap();
let mut server = ttrpc::asynchronous::Server::new()
.bind(CDH_ADDR)
.unwrap()
.register_service(ss_service);
server.start().await.unwrap();
let mut interrupt = signal(SignalKind::interrupt()).unwrap();
tokio::select! {
_ = interrupt.recv() => {
server.shutdown().await.unwrap();
}
};
});
}
#[tokio::test]
async fn test_unseal_env() {
skip_if_not_root!();
let rt = tokio::runtime::Runtime::new().unwrap();
let _guard = rt.enter();
start_ttrpc_server();
std::thread::sleep(std::time::Duration::from_secs(2));
let cc = Some(CDHClient::new().unwrap());
let cdh_client = cc.as_ref().ok_or(anyhow!("get cdh_client failed")).unwrap();
let sealed_env = String::from("key=sealed.testdata");
let unsealed_env = cdh_client.unseal_env(&sealed_env).await.unwrap();
assert_eq!(unsealed_env, String::from("key=unsealed"));
let normal_env = String::from("key=testdata");
let unchanged_env = cdh_client.unseal_env(&normal_env).await.unwrap();
assert_eq!(unchanged_env, String::from("key=testdata"));
rt.shutdown_background();
std::thread::sleep(std::time::Duration::from_secs(2));
}
}

View File

@@ -28,6 +28,9 @@ const CONTAINER_PIPE_SIZE_OPTION: &str = "agent.container_pipe_size";
const UNIFIED_CGROUP_HIERARCHY_OPTION: &str = "systemd.unified_cgroup_hierarchy";
const CONFIG_FILE: &str = "agent.config_file";
const GUEST_COMPONENTS_REST_API_OPTION: &str = "agent.guest_components_rest_api";
const GUEST_COMPONENTS_PROCS_OPTION: &str = "agent.guest_components_procs";
#[cfg(feature = "guest-pull")]
const IMAGE_REGISTRY_AUTH_OPTION: &str = "agent.image_registry_auth";
// Configure the proxy settings for HTTPS requests in the guest,
// to solve the problem of not being able to access the specified image in some cases.
@@ -59,19 +62,34 @@ const ERR_INVALID_CONTAINER_PIPE_SIZE_PARAM: &str = "unable to parse container p
const ERR_INVALID_CONTAINER_PIPE_SIZE_KEY: &str = "invalid container pipe size key name";
const ERR_INVALID_CONTAINER_PIPE_NEGATIVE: &str = "container pipe size should not be negative";
const ERR_INVALID_GUEST_COMPONENTS_REST_API_VALUE: &str = "invalid guest components rest api feature given. Valid values are `all`, `attestation`, `resource`, or `none`";
const ERR_INVALID_GUEST_COMPONENTS_REST_API_VALUE: &str = "invalid guest components rest api feature given. Valid values are `all`, `attestation`, `resource`";
const ERR_INVALID_GUEST_COMPONENTS_PROCS_VALUE: &str = "invalid guest components process param given. Valid values are `attestation-agent`, `confidential-data-hub`, `api-server-rest`, or `none`";
#[derive(Clone, Copy, Debug, Default, Display, Deserialize, EnumString, PartialEq)]
// Features seem to typically be in kebab-case format, but we only have single words at the moment
#[strum(serialize_all = "kebab-case")]
#[serde(rename_all = "kebab-case")]
pub enum GuestComponentsFeatures {
All,
Attestation,
None,
#[default]
Resource,
}
#[derive(Clone, Copy, Debug, Default, Display, Deserialize, EnumString, PartialEq)]
/// Attestation-related processes that we want to spawn as children of the agent
#[strum(serialize_all = "kebab-case")]
#[serde(rename_all = "kebab-case")]
pub enum GuestComponentsProcs {
None,
/// ApiServerRest implies ConfidentialDataHub and AttestationAgent
#[default]
ApiServerRest,
AttestationAgent,
/// ConfidentialDataHub implies AttestationAgent
ConfidentialDataHub,
}
#[derive(Debug)]
pub struct AgentConfig {
pub debug_console: bool,
@@ -89,6 +107,9 @@ pub struct AgentConfig {
pub https_proxy: String,
pub no_proxy: String,
pub guest_components_rest_api: GuestComponentsFeatures,
pub guest_components_procs: GuestComponentsProcs,
#[cfg(feature = "guest-pull")]
pub image_registry_auth: String,
}
#[derive(Debug, Deserialize)]
@@ -107,6 +128,9 @@ pub struct AgentConfigBuilder {
pub https_proxy: Option<String>,
pub no_proxy: Option<String>,
pub guest_components_rest_api: Option<GuestComponentsFeatures>,
pub guest_components_procs: Option<GuestComponentsProcs>,
#[cfg(feature = "guest-pull")]
pub image_registry_auth: Option<String>,
}
macro_rules! config_override {
@@ -171,6 +195,9 @@ impl Default for AgentConfig {
https_proxy: String::from(""),
no_proxy: String::from(""),
guest_components_rest_api: GuestComponentsFeatures::default(),
guest_components_procs: GuestComponentsProcs::default(),
#[cfg(feature = "guest-pull")]
image_registry_auth: String::from(""),
}
}
}
@@ -207,6 +234,9 @@ impl FromStr for AgentConfig {
agent_config,
guest_components_rest_api
);
config_override!(agent_config_builder, agent_config, guest_components_procs);
#[cfg(feature = "guest-pull")]
config_override!(agent_config_builder, agent_config, image_registry_auth);
Ok(agent_config)
}
@@ -220,7 +250,10 @@ impl AgentConfig {
let config_position = args.iter().position(|a| a == "--config" || a == "-c");
if let Some(config_position) = config_position {
if let Some(config_file) = args.get(config_position + 1) {
return AgentConfig::from_config_file(config_file).context("AgentConfig from args");
let mut config =
AgentConfig::from_config_file(config_file).context("AgentConfig from args")?;
config.override_config_from_envs();
return Ok(config);
} else {
panic!("The config argument wasn't formed properly: {:?}", args);
}
@@ -293,7 +326,6 @@ impl AgentConfig {
get_vsock_port,
|port| port > 0
);
parse_cmdline_param!(
param,
CONTAINER_PIPE_SIZE_OPTION,
@@ -314,23 +346,22 @@ impl AgentConfig {
config.guest_components_rest_api,
get_guest_components_features_value
);
parse_cmdline_param!(
param,
GUEST_COMPONENTS_PROCS_OPTION,
config.guest_components_procs,
get_guest_components_procs_value
);
#[cfg(feature = "guest-pull")]
parse_cmdline_param!(
param,
IMAGE_REGISTRY_AUTH_OPTION,
config.image_registry_auth,
get_string_value
);
}
if let Ok(addr) = env::var(SERVER_ADDR_ENV_VAR) {
config.server_addr = addr;
}
if let Ok(addr) = env::var(LOG_LEVEL_ENV_VAR) {
if let Ok(level) = logrus_to_slog_level(&addr) {
config.log_level = level;
}
}
if let Ok(value) = env::var(TRACING_ENV_VAR) {
let name_value = format!("{}={}", TRACING_ENV_VAR, value);
config.tracing = get_bool_value(&name_value)?;
}
config.override_config_from_envs();
Ok(config)
}
@@ -341,6 +372,25 @@ impl AgentConfig {
.with_context(|| format!("Failed to read config file {}", file))?;
AgentConfig::from_str(&config)
}
#[instrument]
fn override_config_from_envs(&mut self) {
if let Ok(addr) = env::var(SERVER_ADDR_ENV_VAR) {
self.server_addr = addr;
}
if let Ok(addr) = env::var(LOG_LEVEL_ENV_VAR) {
if let Ok(level) = logrus_to_slog_level(&addr) {
self.log_level = level;
}
}
if let Ok(value) = env::var(TRACING_ENV_VAR) {
let name_value = format!("{}={}", TRACING_ENV_VAR, value);
self.tracing = get_bool_value(&name_value).unwrap_or(false);
}
}
}
#[instrument]
@@ -471,13 +521,24 @@ fn get_url_value(param: &str) -> Result<String> {
fn get_guest_components_features_value(param: &str) -> Result<GuestComponentsFeatures> {
let fields: Vec<&str> = param.split('=').collect();
ensure!(fields.len() >= 2, ERR_INVALID_GET_VALUE_PARAM);
// We need name (but the value can be blank)
ensure!(!fields[0].is_empty(), ERR_INVALID_GET_VALUE_NO_NAME);
let value = fields[1..].join("=");
GuestComponentsFeatures::from_str(&value)
.map_err(|_| anyhow!(ERR_INVALID_GUEST_COMPONENTS_REST_API_VALUE))
}
#[instrument]
fn get_guest_components_procs_value(param: &str) -> Result<GuestComponentsProcs> {
let fields: Vec<&str> = param.split('=').collect();
ensure!(fields.len() >= 2, ERR_INVALID_GET_VALUE_PARAM);
// We need name (but the value can be blank)
ensure!(!fields[0].is_empty(), ERR_INVALID_GET_VALUE_NO_NAME);
let value = fields[1..].join("=");
GuestComponentsFeatures::from_str(&value)
.map_err(|_| anyhow!(ERR_INVALID_GUEST_COMPONENTS_REST_API_VALUE))
GuestComponentsProcs::from_str(&value)
.map_err(|_| anyhow!(ERR_INVALID_GUEST_COMPONENTS_PROCS_VALUE))
}
#[cfg(test)]
@@ -486,6 +547,7 @@ mod tests {
use super::*;
use anyhow::anyhow;
use serial_test::serial;
use std::fs::File;
use std::io::Write;
use std::time;
@@ -501,6 +563,8 @@ mod tests {
}
#[test]
// Run in serial to stop the env set interfering with test_from_cmdline_with_args_overwrites
#[serial]
fn test_from_cmdline() {
const TEST_SERVER_ADDR: &str = "vsock://-1:1024";
@@ -519,6 +583,9 @@ mod tests {
https_proxy: &'a str,
no_proxy: &'a str,
guest_components_rest_api: GuestComponentsFeatures,
guest_components_procs: GuestComponentsProcs,
#[cfg(feature = "guest-pull")]
image_registry_auth: &'a str,
}
impl Default for TestData<'_> {
@@ -537,6 +604,9 @@ mod tests {
https_proxy: "",
no_proxy: "",
guest_components_rest_api: GuestComponentsFeatures::default(),
guest_components_procs: GuestComponentsProcs::default(),
#[cfg(feature = "guest-pull")]
image_registry_auth: "",
}
}
}
@@ -745,6 +815,13 @@ mod tests {
server_addr: "unix://@/tmp/foo.socket",
..Default::default()
},
// Test that env_var has precedence over the command line (which is the current behaviour)
TestData {
contents: "agent.server_addr=unix:///tmp/ignored.socket",
env_vars: vec!["KATA_AGENT_SERVER_ADDR=unix:///tmp/foo.socket"],
server_addr: "unix:///tmp/foo.socket",
..Default::default()
},
TestData {
contents: "",
env_vars: vec!["KATA_AGENT_LOG_LEVEL="],
@@ -942,8 +1019,35 @@ mod tests {
..Default::default()
},
TestData {
contents: "agent.guest_components_rest_api=none",
guest_components_rest_api: GuestComponentsFeatures::None,
contents: "agent.guest_components_procs=api-server-rest",
guest_components_procs: GuestComponentsProcs::ApiServerRest,
..Default::default()
},
TestData {
contents: "agent.guest_components_procs=confidential-data-hub",
guest_components_procs: GuestComponentsProcs::ConfidentialDataHub,
..Default::default()
},
TestData {
contents: "agent.guest_components_procs=attestation-agent",
guest_components_procs: GuestComponentsProcs::AttestationAgent,
..Default::default()
},
TestData {
contents: "agent.guest_components_procs=none",
guest_components_procs: GuestComponentsProcs::None,
..Default::default()
},
#[cfg(feature = "guest-pull")]
TestData {
contents: "agent.image_registry_auth=file:///root/.docker/config.json",
image_registry_auth: "file:///root/.docker/config.json",
..Default::default()
},
#[cfg(feature = "guest-pull")]
TestData {
contents: "agent.image_registry_auth=kbs:///default/credentials/test",
image_registry_auth: "kbs:///default/credentials/test",
..Default::default()
},
];
@@ -1000,6 +1104,13 @@ mod tests {
"{}",
msg
);
assert_eq!(
d.guest_components_procs, config.guest_components_procs,
"{}",
msg
);
#[cfg(feature = "guest-pull")]
assert_eq!(d.image_registry_auth, config.image_registry_auth, "{}", msg);
for v in vars_to_unset {
env::remove_var(v);
@@ -1008,15 +1119,17 @@ mod tests {
}
#[test]
// Run in serial to stop the env set interfering with test_from_cmdline
#[serial]
fn test_from_cmdline_with_args_overwrites() {
let expected = AgentConfig {
dev_mode: true,
server_addr: "unix://@/tmp/foo.socket".to_string(),
server_addr: "unix:///tmp/overwrite.socket".to_string(),
..Default::default()
};
let example_config_file_contents =
"dev_mode = true\nserver_addr = 'unix://@/tmp/foo.socket'";
"dev_mode = true\nserver_addr = 'unix:///tmp/ignored.socket'";
let dir = tempdir().expect("failed to create tmpdir");
let file_path = dir.path().join("config.toml");
let filename = file_path.to_str().expect("failed to create filename");
@@ -1024,10 +1137,15 @@ mod tests {
file.write_all(example_config_file_contents.as_bytes())
.unwrap_or_else(|_| panic!("failed to write file contents"));
// Ensure that the env has precedence over agent config file
env::set_var("KATA_AGENT_SERVER_ADDR", "unix:///tmp/overwrite.socket");
let config =
AgentConfig::from_cmdline("", vec!["--config".to_string(), filename.to_string()])
.expect("Failed to parse command line");
env::remove_var("KATA_AGENT_SERVER_ADDR");
assert_eq!(expected.debug_console, config.debug_console);
assert_eq!(expected.dev_mode, config.dev_mode);
assert_eq!(
@@ -1500,10 +1618,6 @@ Caused by:
param: "x=attestation",
result: Ok(GuestComponentsFeatures::Attestation),
},
TestData {
param: "x=none",
result: Ok(GuestComponentsFeatures::None),
},
TestData {
param: "x=resource",
result: Ok(GuestComponentsFeatures::Resource),
@@ -1533,12 +1647,76 @@ Caused by:
}
}
#[test]
fn test_get_guest_components_procs_value() {
#[derive(Debug)]
struct TestData<'a> {
param: &'a str,
result: Result<GuestComponentsProcs>,
}
let tests = &[
TestData {
param: "",
result: Err(anyhow!(ERR_INVALID_GET_VALUE_PARAM)),
},
TestData {
param: "=",
result: Err(anyhow!(ERR_INVALID_GET_VALUE_NO_NAME)),
},
TestData {
param: "==",
result: Err(anyhow!(ERR_INVALID_GET_VALUE_NO_NAME)),
},
TestData {
param: "x=attestation-agent",
result: Ok(GuestComponentsProcs::AttestationAgent),
},
TestData {
param: "x=confidential-data-hub",
result: Ok(GuestComponentsProcs::ConfidentialDataHub),
},
TestData {
param: "x=none",
result: Ok(GuestComponentsProcs::None),
},
TestData {
param: "x=api-server-rest",
result: Ok(GuestComponentsProcs::ApiServerRest),
},
TestData {
param: "x===",
result: Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_PROCS_VALUE)),
},
TestData {
param: "x==x",
result: Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_PROCS_VALUE)),
},
TestData {
param: "x=x",
result: Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_PROCS_VALUE)),
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = get_guest_components_procs_value(d.param);
let msg = format!("{}: result: {:?}", msg, result);
assert_result!(d.result, result, msg);
}
}
#[test]
fn test_config_builder_from_string() {
let config = AgentConfig::from_str(
r#"
dev_mode = true
server_addr = 'vsock://8:2048'
guest_components_procs = "api-server-rest"
guest_components_rest_api = "all"
"#,
)
.unwrap();
@@ -1546,6 +1724,14 @@ Caused by:
// Verify that the override worked
assert!(config.dev_mode);
assert_eq!(config.server_addr, "vsock://8:2048");
assert_eq!(
config.guest_components_procs,
GuestComponentsProcs::ApiServerRest
);
assert_eq!(
config.guest_components_rest_api,
GuestComponentsFeatures::All
);
// Verify that the default values are valid
assert_eq!(config.hotplug_timeout, DEFAULT_HOTPLUG_TIMEOUT);

View File

@@ -710,7 +710,14 @@ pub fn update_env_pci(
env: &mut [String],
pcimap: &HashMap<pci::Address, pci::Address>,
) -> Result<()> {
for envvar in env {
// SR-IOV device plugin may add two environment variables for one resource:
// - PCIDEVICE_<prefix>_<resource-name>: a list of PCI device ids separated by comma
// - PCIDEVICE_<prefix>_<resource-name>_INFO: detailed info in JSON for above PCI devices
// Both environment variables hold information about the same set of PCI devices.
// Below code updates both of them in two passes:
// - 1st pass updates PCIDEVICE_<prefix>_<resource-name> and collects host to guest PCI address mapping
let mut pci_dev_map: HashMap<String, HashMap<String, String>> = HashMap::new();
for envvar in env.iter_mut() {
let eqpos = envvar
.find('=')
.ok_or_else(|| anyhow!("Malformed OCI env entry {:?}", envvar))?;
@@ -718,24 +725,46 @@ pub fn update_env_pci(
let (name, eqval) = envvar.split_at(eqpos);
let val = &eqval[1..];
if !name.starts_with("PCIDEVICE_") {
if !name.starts_with("PCIDEVICE_") || name.ends_with("_INFO") {
continue;
}
let mut addr_map: HashMap<String, String> = HashMap::new();
let mut guest_addrs = Vec::<String>::new();
for host_addr in val.split(',') {
let host_addr = pci::Address::from_str(host_addr)
for host_addr_str in val.split(',') {
let host_addr = pci::Address::from_str(host_addr_str)
.with_context(|| format!("Can't parse {} environment variable", name))?;
let guest_addr = pcimap
.get(&host_addr)
.ok_or_else(|| anyhow!("Unable to translate host PCI address {}", host_addr))?;
guest_addrs.push(format!("{}", guest_addr));
addr_map.insert(host_addr_str.to_string(), format!("{}", guest_addr));
}
pci_dev_map.insert(format!("{}_INFO", name), addr_map);
envvar.replace_range(eqpos + 1.., guest_addrs.join(",").as_str());
}
// - 2nd pass update PCIDEVICE_<prefix>_<resource-name>_INFO if it exists
for envvar in env.iter_mut() {
let eqpos = envvar
.find('=')
.ok_or_else(|| anyhow!("Malformed OCI env entry {:?}", envvar))?;
let (name, _) = envvar.split_at(eqpos);
if !(name.starts_with("PCIDEVICE_") && name.ends_with("_INFO")) {
continue;
}
if let Some(addr_map) = pci_dev_map.get(name) {
for (host_addr, guest_addr) in addr_map {
*envvar = envvar.replace(host_addr, guest_addr);
}
}
}
Ok(())
}
@@ -867,9 +896,10 @@ async fn vfio_pci_device_handler(
}
group = Some(devgroup);
pci_fixups.push((host, guestdev));
}
// collect PCI address mapping for both vfio-pci-gk and vfio-pci device
pci_fixups.push((host, guestdev));
}
let dev_update = if vfio_in_guest {
@@ -903,7 +933,11 @@ async fn vfio_ap_device_handler(
for apqn in device.options.iter() {
wait_for_ap_device(sandbox, ap::Address::from_str(apqn)?).await?;
}
Ok(Default::default())
let dev_update = Some(DevUpdate::new(Z9_CRYPT_DEV_PATH, Z9_CRYPT_DEV_PATH)?);
Ok(SpecUpdate {
dev: dev_update,
pci: Vec::new(),
})
}
#[cfg(not(target_arch = "s390x"))]
@@ -932,22 +966,22 @@ pub async fn add_devices(
));
}
let mut sb = sandbox.lock().await;
for (host, guest) in update.pci {
if let Some(other_guest) = sb.pcimap.insert(host, guest) {
return Err(anyhow!(
"Conflicting guest address for host device {} ({} versus {})",
host,
guest,
other_guest
));
}
}
// Update cgroup to allow all devices added to guest.
insert_devices_cgroup_rule(spec, &dev_update.info, true, "rwm")
.context("Update device cgroup")?;
}
let mut sb = sandbox.lock().await;
for (host, guest) in update.pci {
if let Some(other_guest) = sb.pcimap.insert(host, guest) {
return Err(anyhow!(
"Conflicting guest address for host device {} ({} versus {})",
host,
guest,
other_guest
));
}
}
}
if let Some(process) = spec.process.as_mut() {
@@ -1382,8 +1416,11 @@ mod tests {
("0000:01:01.0", "ffff:02:1f.7"),
];
let pci_dev_info_original = r#"PCIDEVICE_x_INFO={"0000:1a:01.0":{"generic":{"deviceID":"0000:1a:01.0"}},"0000:1b:02.0":{"generic":{"deviceID":"0000:1b:02.0"}}}"#;
let pci_dev_info_expected = r#"PCIDEVICE_x_INFO={"0000:01:01.0":{"generic":{"deviceID":"0000:01:01.0"}},"0000:01:02.0":{"generic":{"deviceID":"0000:01:02.0"}}}"#;
let mut env = vec![
"PCIDEVICE_x=0000:1a:01.0,0000:1b:02.0".to_string(),
pci_dev_info_original.to_string(),
"PCIDEVICE_y=0000:01:01.0".to_string(),
"NOTAPCIDEVICE_blah=abcd:ef:01.0".to_string(),
];
@@ -1399,11 +1436,12 @@ mod tests {
.collect();
let res = update_env_pci(&mut env, &pci_fixups);
assert!(res.is_ok());
assert!(res.is_ok(), "error: {}", res.err().unwrap());
assert_eq!(env[0], "PCIDEVICE_x=0000:01:01.0,0000:01:02.0");
assert_eq!(env[1], "PCIDEVICE_y=ffff:02:1f.7");
assert_eq!(env[2], "NOTAPCIDEVICE_blah=abcd:ef:01.0");
assert_eq!(env[1], pci_dev_info_expected);
assert_eq!(env[2], "PCIDEVICE_y=ffff:02:1f.7");
assert_eq!(env[3], "NOTAPCIDEVICE_blah=abcd:ef:01.0");
}
#[test]

View File

@@ -20,8 +20,6 @@ use tokio::sync::Mutex;
use crate::rpc::CONTAINER_BASE;
use crate::AGENT_CONFIG;
// A marker to merge container spec for images pulled inside guest.
const ANNO_K8S_IMAGE_NAME: &str = "io.kubernetes.cri.image-name";
const KATA_IMAGE_WORK_DIR: &str = "/run/kata-containers/image/";
const CONFIG_JSON: &str = "config.json";
const KATA_PAUSE_BUNDLE: &str = "/pause_bundle";
@@ -52,23 +50,24 @@ fn copy_if_not_exists(src: &Path, dst: &Path) -> Result<()> {
pub struct ImageService {
image_client: ImageClient,
images: HashMap<String, String>,
}
impl ImageService {
pub fn new() -> Self {
Self {
image_client: ImageClient::new(PathBuf::from(KATA_IMAGE_WORK_DIR)),
images: HashMap::new(),
let mut image_client = ImageClient::new(PathBuf::from(KATA_IMAGE_WORK_DIR));
#[cfg(feature = "guest-pull")]
if !AGENT_CONFIG.image_registry_auth.is_empty() {
let registry_auth = &AGENT_CONFIG.image_registry_auth;
debug!(sl(), "Set registry auth file {:?}", registry_auth);
image_client.config.file_paths.auth_file = registry_auth.clone();
image_client.config.auth = true;
}
}
async fn add_image(&mut self, image: String, cid: String) {
self.images.insert(image, cid);
Self { image_client }
}
/// pause image is packaged in rootfs
fn unpack_pause_image(cid: &str, target_subpath: &str) -> Result<String> {
fn unpack_pause_image(cid: &str) -> Result<String> {
verify_id(cid).context("The guest pause image cid contains invalid characters.")?;
let guest_pause_bundle = Path::new(KATA_PAUSE_BUNDLE);
@@ -102,9 +101,7 @@ impl ImageService {
bail!("The number of args should be greater than or equal to one! Please check the pause image.");
}
let container_bundle = scoped_join(CONTAINER_BASE, cid)?;
fs::create_dir_all(&container_bundle)?;
let pause_bundle = scoped_join(&container_bundle, target_subpath)?;
let pause_bundle = scoped_join(CONTAINER_BASE, cid)?;
fs::create_dir_all(&pause_bundle)?;
let pause_rootfs = scoped_join(&pause_bundle, "rootfs")?;
fs::create_dir_all(&pause_rootfs)?;
@@ -125,7 +122,7 @@ impl ImageService {
/// - `cid`: Container id
/// - `image_metadata`: Annotations about the image (exp: "containerd.io/snapshot/cri.layer-digest": "sha256:24fb2886d6f6c5d16481dd7608b47e78a8e92a13d6e64d87d57cb16d5f766d63")
/// # Returns
/// - The image rootfs bundle path. (exp. /run/kata-containers/cb0b47276ea66ee9f44cc53afa94d7980b57a52c3f306f68cb034e58d9fbd3c6/images/rootfs)
/// - The image rootfs bundle path. (exp. /run/kata-containers/cb0b47276ea66ee9f44cc53afa94d7980b57a52c3f306f68cb034e58d9fbd3c6/rootfs)
pub async fn pull_image(
&mut self,
image: &str,
@@ -146,16 +143,13 @@ impl ImageService {
}
if is_sandbox {
let mount_path = Self::unpack_pause_image(cid, "pause")?;
self.add_image(String::from(image), String::from(cid)).await;
let mount_path = Self::unpack_pause_image(cid)?;
return Ok(mount_path);
}
// Image layers will store at KATA_IMAGE_WORK_DIR, generated bundles
// with rootfs and config.json will store under CONTAINER_BASE/cid/images.
let bundle_base_dir = scoped_join(CONTAINER_BASE, cid)?;
fs::create_dir_all(&bundle_base_dir)?;
let bundle_path = scoped_join(&bundle_base_dir, "images")?;
let bundle_path = scoped_join(CONTAINER_BASE, cid)?;
fs::create_dir_all(&bundle_path)?;
info!(sl(), "pull image {image:?}, bundle path {bundle_path:?}");
@@ -179,35 +173,9 @@ impl ImageService {
return Err(e);
}
};
self.add_image(String::from(image), String::from(cid)).await;
let image_bundle_path = scoped_join(&bundle_path, "rootfs")?;
Ok(image_bundle_path.as_path().display().to_string())
}
/// Partially merge an OCI process specification into another one.
fn merge_oci_process(&self, target: &mut oci::Process, source: &oci::Process) {
// Override the target args only when the target args is empty and source.args is not empty
if target.args.is_empty() && !source.args.is_empty() {
target.args.append(&mut source.args.clone());
}
// Override the target cwd only when the target cwd is blank and source.cwd is not blank
if target.cwd == "/" && source.cwd != "/" {
target.cwd = String::from(&source.cwd);
}
for source_env in &source.env {
if let Some((variable_name, variable_value)) = source_env.split_once('=') {
debug!(
sl(),
"source spec environment variable: {variable_name:?} : {variable_value:?}"
);
if !target.env.iter().any(|i| i.contains(variable_name)) {
target.env.push(source_env.to_string());
}
}
}
}
}
/// Set proxy environment from AGENT_CONFIG
@@ -237,55 +205,6 @@ pub async fn set_proxy_env_vars() {
};
}
/// When being passed an image name through a container annotation, merge its
/// corresponding bundle OCI specification into the passed container creation one.
pub async fn merge_bundle_oci(container_oci: &mut oci::Spec) -> Result<()> {
let image_service = IMAGE_SERVICE.clone();
let mut image_service = image_service.lock().await;
let image_service = image_service
.as_mut()
.expect("Image Service not initialized");
if let Some(image_name) = container_oci.annotations.get(ANNO_K8S_IMAGE_NAME) {
if let Some(container_id) = image_service.images.get(image_name) {
let image_oci_config_path = Path::new(CONTAINER_BASE)
.join(container_id)
.join(CONFIG_JSON);
debug!(
sl(),
"Image bundle config path: {:?}", image_oci_config_path
);
let image_oci = oci::Spec::load(image_oci_config_path.to_str().ok_or_else(|| {
anyhow!(
"Invalid container image OCI config path {:?}",
image_oci_config_path
)
})?)
.context("load image bundle")?;
if let (Some(container_root), Some(image_root)) =
(container_oci.root.as_mut(), image_oci.root.as_ref())
{
let root_path = Path::new(CONTAINER_BASE)
.join(container_id)
.join(image_root.path.clone());
container_root.path =
String::from(root_path.to_str().ok_or_else(|| {
anyhow!("Invalid container image root path {:?}", root_path)
})?);
}
if let (Some(container_process), Some(image_process)) =
(container_oci.process.as_mut(), image_oci.process.as_ref())
{
image_service.merge_oci_process(container_process, image_process);
}
}
}
Ok(())
}
/// Init the image service
pub async fn init_image_service() {
let image_service = ImageService::new();
@@ -305,71 +224,3 @@ pub async fn pull_image(
image_service.pull_image(image, cid, image_metadata).await
}
#[cfg(test)]
mod tests {
use super::ImageService;
use rstest::rstest;
#[rstest]
// TODO - how can we tell the user didn't specifically set it to `/` vs not setting at all? Is that scenario valid?
#[case::image_cwd_should_override_blank_container_cwd("/", "/imageDir", "/imageDir")]
#[case::container_cwd_should_override_image_cwd("/containerDir", "/imageDir", "/containerDir")]
#[case::container_cwd_should_override_blank_image_cwd("/containerDir", "/", "/containerDir")]
async fn test_merge_cwd(
#[case] container_process_cwd: &str,
#[case] image_process_cwd: &str,
#[case] expected: &str,
) {
let image_service = ImageService::new();
let mut container_process = oci::Process {
cwd: container_process_cwd.to_string(),
..Default::default()
};
let image_process = oci::Process {
cwd: image_process_cwd.to_string(),
..Default::default()
};
image_service.merge_oci_process(&mut container_process, &image_process);
assert_eq!(expected, container_process.cwd);
}
#[rstest]
#[case::pods_environment_overrides_images(
vec!["ISPRODUCTION=true".to_string()],
vec!["ISPRODUCTION=false".to_string()],
vec!["ISPRODUCTION=true".to_string()]
)]
#[case::multiple_environment_variables_can_be_overrided(
vec!["ISPRODUCTION=true".to_string(), "ISDEVELOPMENT=false".to_string()],
vec!["ISPRODUCTION=false".to_string(), "ISDEVELOPMENT=true".to_string()],
vec!["ISPRODUCTION=true".to_string(), "ISDEVELOPMENT=false".to_string()]
)]
#[case::not_override_them_when_none_of_variables_match(
vec!["ANOTHERENV=TEST".to_string()],
vec!["ISPRODUCTION=false".to_string(), "ISDEVELOPMENT=true".to_string()],
vec!["ANOTHERENV=TEST".to_string(), "ISPRODUCTION=false".to_string(), "ISDEVELOPMENT=true".to_string()]
)]
#[case::a_mix_of_both_overriding_and_not(
vec!["ANOTHERENV=TEST".to_string(), "ISPRODUCTION=true".to_string()],
vec!["ISPRODUCTION=false".to_string(), "ISDEVELOPMENT=true".to_string()],
vec!["ANOTHERENV=TEST".to_string(), "ISPRODUCTION=true".to_string(), "ISDEVELOPMENT=true".to_string()]
)]
async fn test_merge_env(
#[case] container_process_env: Vec<String>,
#[case] image_process_env: Vec<String>,
#[case] expected: Vec<String>,
) {
let image_service = ImageService::new();
let mut container_process = oci::Process {
env: container_process_env,
..Default::default()
};
let image_process = oci::Process {
env: image_process_env,
..Default::default()
};
image_service.merge_oci_process(&mut container_process, &image_process);
assert_eq!(expected, container_process.env);
}
}

View File

@@ -71,6 +71,7 @@ cfg_if! {
pub const CCW_ROOT_BUS_PATH: &str = "/devices/css0";
pub const AP_ROOT_BUS_PATH: &str = "/devices/ap";
pub const AP_SCANS_PATH: &str = "/sys/bus/ap/scans";
pub const Z9_CRYPT_DEV_PATH: &str = "/dev/z90crypt";
}
}

View File

@@ -38,6 +38,7 @@ use std::process::Command;
use std::sync::Arc;
use tracing::{instrument, span};
mod cdh;
mod config;
mod console;
mod device;
@@ -59,11 +60,12 @@ mod util;
mod version;
mod watcher;
use config::GuestComponentsFeatures;
use cdh::CDHClient;
use config::GuestComponentsProcs;
use mount::{cgroups_mount, general_mount};
use sandbox::Sandbox;
use signal::setup_signal_handler;
use slog::{error, info, o, warn, Logger};
use slog::{debug, error, info, o, warn, Logger};
use uevent::watch_uevents;
use futures::future::join_all;
@@ -104,9 +106,13 @@ const AA_ATTESTATION_URI: &str = concatcp!(UNIX_SOCKET_PREFIX, AA_ATTESTATION_SO
const CDH_PATH: &str = "/usr/local/bin/confidential-data-hub";
const CDH_SOCKET: &str = "/run/confidential-containers/cdh.sock";
const CDH_SOCKET_URI: &str = concatcp!(UNIX_SOCKET_PREFIX, CDH_SOCKET);
const API_SERVER_PATH: &str = "/usr/local/bin/api-server-rest";
/// Path of ocicrypt config file. This is used by image-rs when decrypting image.
const OCICRYPT_CONFIG_PATH: &str = "/tmp/ocicrypt_config.json";
const DEFAULT_LAUNCH_PROCESS_TIMEOUT: i32 = 6;
lazy_static! {
@@ -403,12 +409,28 @@ async fn start_sandbox(
let (tx, rx) = tokio::sync::oneshot::channel();
sandbox.lock().await.sender = Some(tx);
if Path::new(CDH_PATH).exists() && Path::new(AA_PATH).exists() {
init_attestation_components(logger, config)?;
let mut cdh_client = None;
let gc_procs = config.guest_components_procs;
if gc_procs != GuestComponentsProcs::None {
if !attestation_binaries_available(logger, &gc_procs) {
warn!(
logger,
"attestation binaries requested for launch not available"
);
} else {
cdh_client = init_attestation_components(logger, config)?;
}
}
// vsock:///dev/vsock, port
let mut server = rpc::start(sandbox.clone(), config.server_addr.as_str(), init_mode).await?;
let mut server = rpc::start(
sandbox.clone(),
config.server_addr.as_str(),
init_mode,
cdh_client,
)
.await?;
server.start().await?;
rx.await?;
@@ -417,9 +439,34 @@ async fn start_sandbox(
Ok(())
}
// Check if required attestation binaries are available on the rootfs.
fn attestation_binaries_available(logger: &Logger, procs: &GuestComponentsProcs) -> bool {
let binaries = match procs {
GuestComponentsProcs::AttestationAgent => vec![AA_PATH],
GuestComponentsProcs::ConfidentialDataHub => vec![AA_PATH, CDH_PATH],
GuestComponentsProcs::ApiServerRest => vec![AA_PATH, CDH_PATH, API_SERVER_PATH],
_ => vec![],
};
for binary in binaries.iter() {
if !Path::new(binary).exists() {
warn!(logger, "{} not found", binary);
return false;
}
}
true
}
// Start-up attestation-agent, CDH and api-server-rest if they are packaged in the rootfs
fn init_attestation_components(logger: &Logger, _config: &AgentConfig) -> Result<()> {
// The Attestation Agent will run for the duration of the guest.
// and the corresponding procs are enabled in the agent configuration. the process will be
// launched in the background and the function will return immediately.
// If the CDH is started, a CDH client will be instantiated and returned.
fn init_attestation_components(logger: &Logger, config: &AgentConfig) -> Result<Option<CDHClient>> {
// skip launch of any guest-component
if config.guest_components_procs == GuestComponentsProcs::None {
return Ok(None);
}
debug!(logger, "spawning attestation-agent process {}", AA_PATH);
launch_process(
logger,
AA_PATH,
@@ -429,33 +476,58 @@ fn init_attestation_components(logger: &Logger, _config: &AgentConfig) -> Result
)
.map_err(|e| anyhow!("launch_process {} failed: {:?}", AA_PATH, e))?;
if let Err(e) = launch_process(
// skip launch of confidential-data-hub and api-server-rest
if config.guest_components_procs == GuestComponentsProcs::AttestationAgent {
return Ok(None);
}
let ocicrypt_config = serde_json::json!({
"key-providers": {
"attestation-agent":{
"ttrpc":CDH_SOCKET_URI
}
}
});
fs::write(OCICRYPT_CONFIG_PATH, ocicrypt_config.to_string().as_bytes())?;
env::set_var("OCICRYPT_KEYPROVIDER_CONFIG", OCICRYPT_CONFIG_PATH);
debug!(
logger,
"spawning confidential-data-hub process {}", CDH_PATH
);
launch_process(
logger,
CDH_PATH,
&vec![],
CDH_SOCKET,
DEFAULT_LAUNCH_PROCESS_TIMEOUT,
) {
error!(logger, "launch_process {} failed: {:?}", CDH_PATH, e);
} else {
let features = _config.guest_components_rest_api;
match features {
GuestComponentsFeatures::None => {}
_ => {
if let Err(e) = launch_process(
logger,
API_SERVER_PATH,
&vec!["--features", &features.to_string()],
"",
0,
) {
error!(logger, "launch_process {} failed: {:?}", API_SERVER_PATH, e);
}
}
}
)
.map_err(|e| anyhow!("launch_process {} failed: {:?}", CDH_PATH, e))?;
let cdh_client = CDHClient::new().context("Failed to create CDH Client")?;
// skip launch of api-server-rest
if config.guest_components_procs == GuestComponentsProcs::ConfidentialDataHub {
return Ok(Some(cdh_client));
}
Ok(())
let features = config.guest_components_rest_api;
debug!(
logger,
"spawning api-server-rest process {} --features {}", API_SERVER_PATH, features
);
launch_process(
logger,
API_SERVER_PATH,
&vec!["--features", &features.to_string()],
"",
0,
)
.map_err(|e| anyhow!("launch_process {} failed: {:?}", API_SERVER_PATH, e))?;
Ok(Some(cdh_client))
}
fn wait_for_path_to_exist(logger: &Logger, path: &str, timeout_secs: i32) -> Result<()> {

View File

@@ -76,6 +76,8 @@ use crate::policy::{do_set_policy, is_allowed};
#[cfg(feature = "guest-pull")]
use crate::image;
use crate::cdh::CDHClient;
use opentelemetry::global;
use tracing::span;
use tracing_opentelemetry::OpenTelemetrySpanExt;
@@ -171,6 +173,7 @@ impl<T> OptionToTtrpcResult<T> for Option<T> {
pub struct AgentService {
sandbox: Arc<Mutex<Sandbox>>,
init_mode: bool,
cdh_client: Option<CDHClient>,
}
impl AgentService {
@@ -210,11 +213,6 @@ impl AgentService {
"receive createcontainer, storages: {:?}", &req.storages
);
// In case of pulling image inside guest, we need to merge the image bundle OCI spec
// into the container creation request OCI spec.
#[cfg(feature = "guest-pull")]
image::merge_bundle_oci(&mut oci).await?;
// Some devices need some extra processing (the ones invoked with
// --device for instance), and that's what this call is doing. It
// updates the devices listed in the OCI spec, so that they actually
@@ -222,6 +220,22 @@ impl AgentService {
// cannot predict everything from the caller.
add_devices(&req.devices, &mut oci, &self.sandbox).await?;
if let Some(cdh) = self.cdh_client.as_ref() {
let process = oci
.process
.as_mut()
.ok_or_else(|| anyhow!("Spec didn't contain process field"))?;
for env in process.env.iter_mut() {
match cdh.unseal_env(env).await {
Ok(unsealed_env) => *env = unsealed_env.to_string(),
Err(e) => {
warn!(sl(), "Failed to unseal secret: {}", e)
}
}
}
}
// Both rootfs and volumes (invoked with --volume for instance) will
// be processed the same way. The idea is to always mount any provided
// storage to the specified MountPoint, so that it will match what's
@@ -584,25 +598,32 @@ impl AgentService {
let cid = req.container_id;
let eid = req.exec_id;
let writer = {
let mut sandbox = self.sandbox.lock().await;
let p = sandbox.find_container_process(cid.as_str(), eid.as_str())?;
// use ptmx io
if p.term_master.is_some() {
p.get_writer(StreamType::TermMaster)
} else {
// use piped io
p.get_writer(StreamType::ParentStdin)
}
};
let writer = writer.ok_or_else(|| anyhow!(ERR_CANNOT_GET_WRITER))?;
writer.lock().await.write_all(req.data.as_slice()).await?;
let mut resp = WriteStreamResponse::new();
resp.set_len(req.data.len() as u32);
// EOF of stdin
if req.data.is_empty() {
let mut sandbox = self.sandbox.lock().await;
let p = sandbox.find_container_process(cid.as_str(), eid.as_str())?;
p.close_stdin().await;
} else {
let writer = {
let mut sandbox = self.sandbox.lock().await;
let p = sandbox.find_container_process(cid.as_str(), eid.as_str())?;
// use ptmx io
if p.term_master.is_some() {
p.get_writer(StreamType::TermMaster)
} else {
// use piped io
p.get_writer(StreamType::ParentStdin)
}
};
let writer = writer.ok_or_else(|| anyhow!(ERR_CANNOT_GET_WRITER))?;
writer.lock().await.write_all(req.data.as_slice()).await?;
}
Ok(resp)
}
@@ -645,6 +666,7 @@ impl AgentService {
biased;
v = read_stream(&reader, req.len as usize) => {
let vector = v?;
let mut resp = ReadStreamResponse::new();
resp.set_data(vector);
@@ -845,6 +867,9 @@ impl agent_ttrpc::AgentService for AgentService {
ctx: &TtrpcContext,
req: protocols::agent::CloseStdinRequest,
) -> ttrpc::Result<Empty> {
// The stdin will be closed when EOF is got in rpc `write_stdin`[runtime-rs]
// so this rpc will not be called anymore by runtime-rs.
trace_rpc_call!(ctx, "close_stdin", req);
is_allowed(&req).await?;
@@ -1601,10 +1626,12 @@ pub async fn start(
s: Arc<Mutex<Sandbox>>,
server_address: &str,
init_mode: bool,
cdh_client: Option<CDHClient>,
) -> Result<TtrpcServer> {
let agent_service = Box::new(AgentService {
sandbox: s,
init_mode,
cdh_client,
}) as Box<dyn agent_ttrpc::AgentService + Send + Sync>;
let aservice = agent_ttrpc::create_agent_service(Arc::new(agent_service));
@@ -1920,21 +1947,28 @@ pub fn setup_bundle(cid: &str, spec: &mut Spec) -> Result<PathBuf> {
return Err(anyhow!(nix::Error::EINVAL));
};
let spec_root_path = Path::new(&spec_root.path);
let bundle_path = Path::new(CONTAINER_BASE).join(cid);
let config_path = bundle_path.join("config.json");
let rootfs_path = bundle_path.join("rootfs");
let spec_root_path = Path::new(&spec_root.path);
fs::create_dir_all(&rootfs_path)?;
baremount(
spec_root_path,
&rootfs_path,
"bind",
MsFlags::MS_BIND,
"",
&sl(),
)?;
let rootfs_exists = Path::new(&rootfs_path).exists();
info!(
sl(),
"The rootfs_path is {:?} and exists: {}", rootfs_path, rootfs_exists
);
if !rootfs_exists {
fs::create_dir_all(&rootfs_path)?;
baremount(
spec_root_path,
&rootfs_path,
"bind",
MsFlags::MS_BIND,
"",
&sl(),
)?;
}
let rootfs_path_name = rootfs_path
.to_str()
@@ -2148,6 +2182,7 @@ mod tests {
let agent_service = Box::new(AgentService {
sandbox: Arc::new(Mutex::new(sandbox)),
init_mode: true,
cdh_client: None,
});
let req = protocols::agent::UpdateInterfaceRequest::default();
@@ -2162,10 +2197,10 @@ mod tests {
async fn test_update_routes() {
let logger = slog::Logger::root(slog::Discard, o!());
let sandbox = Sandbox::new(&logger).unwrap();
let agent_service = Box::new(AgentService {
sandbox: Arc::new(Mutex::new(sandbox)),
init_mode: true,
cdh_client: None,
});
let req = protocols::agent::UpdateRoutesRequest::default();
@@ -2180,10 +2215,10 @@ mod tests {
async fn test_add_arp_neighbors() {
let logger = slog::Logger::root(slog::Discard, o!());
let sandbox = Sandbox::new(&logger).unwrap();
let agent_service = Box::new(AgentService {
sandbox: Arc::new(Mutex::new(sandbox)),
init_mode: true,
cdh_client: None,
});
let req = protocols::agent::AddARPNeighborsRequest::default();
@@ -2322,6 +2357,7 @@ mod tests {
let agent_service = Box::new(AgentService {
sandbox: Arc::new(Mutex::new(sandbox)),
init_mode: true,
cdh_client: None,
});
let result = agent_service
@@ -2811,6 +2847,7 @@ OtherField:other
let agent_service = Box::new(AgentService {
sandbox: Arc::new(Mutex::new(sandbox)),
init_mode: true,
cdh_client: None,
});
let ctx = mk_ttrpc_context();

View File

@@ -3,6 +3,7 @@
// SPDX-License-Identifier: Apache-2.0
//
use super::new_device;
use crate::image;
use crate::storage::{StorageContext, StorageHandler};
use anyhow::{anyhow, Result};
@@ -12,8 +13,6 @@ use protocols::agent::Storage;
use std::sync::Arc;
use tracing::instrument;
use super::{common_storage_handler, new_device};
#[derive(Debug)]
pub struct ImagePullHandler {}
@@ -36,7 +35,7 @@ impl StorageHandler for ImagePullHandler {
#[instrument]
async fn create_device(
&self,
mut storage: Storage,
storage: Storage,
ctx: &mut StorageContext,
) -> Result<Arc<dyn StorageDevice>> {
//Currently the image metadata is not used to pulling image in the guest.
@@ -51,12 +50,7 @@ impl StorageHandler for ImagePullHandler {
.ok_or_else(|| anyhow!("failed to get container id"))?;
let bundle_path = image::pull_image(image_name, &cid, &image_pull_volume.metadata).await?;
storage.source = bundle_path;
storage.options = vec!["bind".to_string(), "ro".to_string()];
common_storage_handler(ctx.logger, &storage)?;
new_device(storage.mount_point)
new_device(bundle_path)
}
}

View File

@@ -644,10 +644,7 @@ impl AddressSpaceMgr {
guest_numa_node_id: u32,
vcpu_ids: &[u32],
) {
let node = self
.numa_nodes
.entry(guest_numa_node_id)
.or_insert_with(NumaNode::new);
let node = self.numa_nodes.entry(guest_numa_node_id).or_default();
node.add_info(&NumaNodeInfo {
base: region.start_addr(),
size: region.len(),

View File

@@ -281,6 +281,8 @@ pub enum VmmData {
MachineConfiguration(Box<VmConfigInfo>),
/// Prometheus Metrics represented by String.
HypervisorMetrics(String),
/// Return vfio device's slot number in guest.
VfioDeviceData(Option<u8>),
/// Sync Hotplug
SyncHotplug((Sender<Option<i32>>, Receiver<Option<i32>>)),
}
@@ -398,7 +400,9 @@ impl VmmService {
self.add_balloon_device(vmm, event_mgr, balloon_cfg)
}
#[cfg(feature = "host-device")]
VmmAction::InsertHostDevice(hostdev_cfg) => self.add_vfio_device(vmm, hostdev_cfg),
VmmAction::InsertHostDevice(mut hostdev_cfg) => {
self.add_vfio_device(vmm, &mut hostdev_cfg)
}
#[cfg(feature = "host-device")]
VmmAction::PrepareRemoveHostDevice(hostdev_id) => {
self.prepare_remove_vfio_device(vmm, &hostdev_id)
@@ -850,7 +854,7 @@ impl VmmService {
}
#[cfg(feature = "host-device")]
fn add_vfio_device(&self, vmm: &mut Vmm, config: HostDeviceConfig) -> VmmRequestResult {
fn add_vfio_device(&self, vmm: &mut Vmm, config: &mut HostDeviceConfig) -> VmmRequestResult {
let vm = vmm.get_vm_mut().ok_or(VmmActionError::HostDeviceConfig(
VfioDeviceError::InvalidVMID,
))?;
@@ -873,7 +877,8 @@ impl VmmService {
.unwrap()
.insert_device(&mut ctx, config)
.map_err(VmmActionError::HostDeviceConfig)?;
Ok(VmmData::Empty)
Ok(VmmData::VfioDeviceData(config.dev_config.guest_dev_id))
}
// using upcall to unplug the pci device in the guest

View File

@@ -424,7 +424,7 @@ impl DeviceOpContext {
fn generate_virtio_device_info(&self) -> Result<HashMap<(DeviceType, String), MMIODeviceInfo>> {
let mut dev_info = HashMap::new();
#[cfg(feature = "dbs-virtio-devices")]
for (_index, device) in self.virtio_devices.iter().enumerate() {
for device in self.virtio_devices.iter() {
let (mmio_base, mmio_size, irq) = DeviceManager::get_virtio_mmio_device_info(device)?;
let dev_type;
let device_id;
@@ -553,7 +553,7 @@ impl DeviceOpContext {
&self,
dev: &Arc<dyn DeviceIo>,
callback: Option<Box<dyn Fn(UpcallClientResponse) + Send>>,
) -> Result<()> {
) -> Result<u8> {
if !self.is_hotplug || !self.pci_hotplug_enabled {
return Err(DeviceMgrError::InvalidOperation);
}
@@ -561,7 +561,12 @@ impl DeviceOpContext {
let (busno, devfn) = DeviceManager::get_pci_device_info(dev)?;
let req = DevMgrRequest::AddPciDev(PciDevRequest { busno, devfn });
self.call_hotplug_device(req, callback)
self.call_hotplug_device(req, callback)?;
// Extract the slot number from devfn
// Right shift by 3 to remove function bits (2:0) and
// align slot bits (7:3) to the least significant position
Ok(devfn >> 3)
}
#[cfg(feature = "host-device")]

View File

@@ -255,7 +255,7 @@ impl VfioDeviceMgr {
pub fn insert_device(
&mut self,
ctx: &mut DeviceOpContext,
config: HostDeviceConfig,
config: &mut HostDeviceConfig,
) -> Result<()> {
if !cfg!(feature = "hotplug") && ctx.is_hotplug {
return Err(VfioDeviceError::UpdateNotAllowedPostBoot);
@@ -267,7 +267,7 @@ impl VfioDeviceMgr {
"hostdev_id" => &config.hostdev_id,
"bdf" => &config.dev_config.bus_slot_func,
);
let device_index = self.info_list.insert_or_update(&config)?;
let device_index = self.info_list.insert_or_update(config)?;
// Handle device hotplug case
if ctx.is_hotplug {
slog::info!(
@@ -277,7 +277,7 @@ impl VfioDeviceMgr {
"hostdev_id" => &config.hostdev_id,
"bdf" => &config.dev_config.bus_slot_func,
);
self.add_device(ctx, &config, device_index)?;
self.add_device(ctx, config, device_index)?;
}
Ok(())
@@ -438,7 +438,7 @@ impl VfioDeviceMgr {
fn add_device(
&mut self,
ctx: &mut DeviceOpContext,
cfg: &HostDeviceConfig,
cfg: &mut HostDeviceConfig,
idx: usize,
) -> Result<()> {
let dev = self.create_device(cfg, ctx, idx)?;
@@ -450,8 +450,13 @@ impl VfioDeviceMgr {
self.register_memory(vm_memory.deref())?;
}
ctx.insert_hotplug_pci_device(&dev, None)
.map_err(VfioDeviceError::VfioDeviceMgr)
let slot = ctx
.insert_hotplug_pci_device(&dev, None)
.map_err(VfioDeviceError::VfioDeviceMgr)?;
cfg.dev_config.guest_dev_id = Some(slot);
Ok(())
}
/// Gets the index of the device with the specified `hostdev_id` if it exists in the list.

318
src/libs/Cargo.lock generated
View File

@@ -2,6 +2,21 @@
# It is not intended for manual editing.
version = 3
[[package]]
name = "addr2line"
version = "0.22.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6e4503c46a5c0c7844e948c9a4d6acd9f50cccb4de1c48eb9e291ea17470c678"
dependencies = [
"gimli",
]
[[package]]
name = "adler"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f26201604c87b1e01bd3d98f8d5d9a8fcbb815e8cedb41ffccbeb4bf593a35fe"
[[package]]
name = "ahash"
version = "0.7.7"
@@ -48,7 +63,7 @@ checksum = "ed6aa3524a2dfcf9fe180c51eae2b58738348d819517ceadf95789c51fff7600"
dependencies = [
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -68,6 +83,21 @@ version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d468802bab17cbc0cc575e9b053f41e72aa36bfa6b7f55e3529ffa43161b97fa"
[[package]]
name = "backtrace"
version = "0.3.73"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5cc23269a4f8976d0a4d2e7109211a419fe30e8d88d677cd60b6bc79c5732e0a"
dependencies = [
"addr2line",
"cc",
"cfg-if",
"libc",
"miniz_oxide",
"object",
"rustc-demangle",
]
[[package]]
name = "base64"
version = "0.13.1"
@@ -87,7 +117,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fd9e32d7420c85055e8107e5b2463c4eeefeaac18b52359fe9f9c08a18f342b2"
dependencies = [
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -122,7 +152,7 @@ dependencies = [
"borsh-schema-derive-internal",
"proc-macro-crate",
"proc-macro2",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -133,7 +163,7 @@ checksum = "afb438156919598d2c7bad7e1c0adf3d26ed3840dbc010db1a882a65583ca2fb"
dependencies = [
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -144,7 +174,7 @@ checksum = "634205cc43f74a1b9046ef87c4540ebda95696ec0f315024860cad7c5b0f5ccd"
dependencies = [
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -183,7 +213,7 @@ checksum = "a7ec4c6f261935ad534c0c22dbef2201b45918860eb1c574b972bd213a76af61"
dependencies = [
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -198,6 +228,12 @@ version = "1.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a2bd12c1caf447e69cd4528f47f94d203fd2582878ecb9e9465484c4148a8223"
[[package]]
name = "cc"
version = "1.0.99"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "96c51067fd44124faa7f870b4b1c969379ad32b2ba805aa959430ceaa384f695"
[[package]]
name = "cfg-if"
version = "1.0.0"
@@ -328,7 +364,7 @@ dependencies = [
"ident_case",
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -339,7 +375,7 @@ checksum = "a4aab4dbc9f7611d8b55048a3a16d2d010c2c8334e46304b40ac1cc14bf3b48e"
dependencies = [
"darling_core",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -350,7 +386,7 @@ checksum = "3418329ca0ad70234b9735dc4ceed10af4df60eff9c8e7b06cb5e520d92c3535"
dependencies = [
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -474,7 +510,7 @@ checksum = "33c1e13800337f4d4d7a316bf45a567dbcb6ffe087f16424852d97e97a91f512"
dependencies = [
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -518,6 +554,12 @@ dependencies = [
"wasi 0.10.2+wasi-snapshot-preview1",
]
[[package]]
name = "gimli"
version = "0.29.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "40ecd4077b5ae9fd2e9e169b102c6c330d0605168eb0e8bf79952b256dbefffd"
[[package]]
name = "glob"
version = "0.3.0"
@@ -613,7 +655,7 @@ dependencies = [
"httpdate",
"itoa",
"pin-project-lite",
"socket2",
"socket2 0.4.7",
"tokio",
"tower-service",
"tracing",
@@ -748,9 +790,9 @@ checksum = "e2abad23fbc42b3700f2f279844dc832adb2b2eb069b2df918f455c4e18cc646"
[[package]]
name = "libc"
version = "0.2.151"
version = "0.2.155"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "302d7ab3130588088d277783b1e2d2e10c9e9e4a16dd9050e6ec93fb3e7048f4"
checksum = "97b3888a4aecf77e811145cadf6eef5901f4782c53886191b2f693f24761847c"
[[package]]
name = "lock_api"
@@ -811,26 +853,23 @@ dependencies = [
]
[[package]]
name = "mio"
version = "0.8.2"
name = "miniz_oxide"
version = "0.7.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "52da4364ffb0e4fe33a9841a98a3f3014fb964045ce4f7a45a398243c8d6b0c9"
checksum = "87dfd01fe195c66b572b37921ad8803d010623c0aca821bea2302239d155cdae"
dependencies = [
"libc",
"log",
"miow",
"ntapi 0.3.7",
"wasi 0.11.0+wasi-snapshot-preview1",
"winapi",
"adler",
]
[[package]]
name = "miow"
version = "0.3.7"
name = "mio"
version = "0.8.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b9f1c5b025cda876f66ef43a113f91ebc9f4ccef34843000e0adf6ebbab84e21"
checksum = "a4a650543ca06a924e8b371db273b2756685faae30f8487da1b56505a8f78b0c"
dependencies = [
"winapi",
"libc",
"wasi 0.11.0+wasi-snapshot-preview1",
"windows-sys 0.48.0",
]
[[package]]
@@ -876,15 +915,6 @@ dependencies = [
"pin-utils",
]
[[package]]
name = "ntapi"
version = "0.3.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c28774a7fd2fbb4f0babd8237ce554b73af68021b5f695a3cebd6c59bac0980f"
dependencies = [
"winapi",
]
[[package]]
name = "ntapi"
version = "0.4.1"
@@ -932,6 +962,15 @@ dependencies = [
"libc",
]
[[package]]
name = "object"
version = "0.36.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "576dfe1fc8f9df304abb159d767a29d0476f7750fbf8aa7ad07816004a207434"
dependencies = [
"memchr",
]
[[package]]
name = "oci"
version = "0.1.0"
@@ -944,9 +983,9 @@ dependencies = [
[[package]]
name = "once_cell"
version = "1.17.1"
version = "1.19.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b7e5500299e16ebb147ae15a00a942af264cf3688f47923b8fc2cd5858f23ad3"
checksum = "3fdb12b2476b595f9358c5161aa467c2438859caa136dec86c26fdd2efe17b92"
[[package]]
name = "parking_lot"
@@ -1000,14 +1039,14 @@ checksum = "069bdb1e05adc7a8990dce9cc75370895fbe4e3d58b9b73bf1aee56359344a55"
dependencies = [
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
name = "pin-project-lite"
version = "0.2.8"
version = "0.2.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e280fbe77cc62c91527259e9442153f4688736748d24660126286329742b4c6c"
checksum = "bda66fc9667c18cb2758a2ac84d1167245054bcf85d5d1aaa6923f45801bdd02"
[[package]]
name = "pin-utils"
@@ -1032,11 +1071,11 @@ dependencies = [
[[package]]
name = "proc-macro2"
version = "1.0.37"
version = "1.0.85"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ec757218438d5fda206afc041538b2f6d889286160d649a86a24d37e1235afd1"
checksum = "22244ce15aa966053a896d1accb3a6e68469b97c7f33f284b99f0d576879fc23"
dependencies = [
"unicode-xid",
"unicode-ident",
]
[[package]]
@@ -1077,7 +1116,7 @@ dependencies = [
"itertools",
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -1186,14 +1225,14 @@ checksum = "16b845dbfca988fa33db069c0e230574d15a3088f147a87b64c7589eb662c9ac"
dependencies = [
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
name = "quote"
version = "1.0.18"
version = "1.0.36"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a1feb54ed693b93a84e14094943b84b7c4eae204c512b7ccb95ab0c66d278ad1"
checksum = "0fa76aaf39101c457836aec0ce2316dbdc3ab723cdda1c6bd4e6ad4208acaca7"
dependencies = [
"proc-macro2",
]
@@ -1334,7 +1373,7 @@ checksum = "b5c462a1328c8e67e4d6dbad1eb0355dd43e8ab432c6e227a43657f16ade5033"
dependencies = [
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -1353,6 +1392,12 @@ dependencies = [
"serde_json",
]
[[package]]
name = "rustc-demangle"
version = "0.1.24"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "719b953e2095829ee67db738b3bfa9fa368c94900df327b3f07fe6e794d2fe1f"
[[package]]
name = "rustversion"
version = "1.0.12"
@@ -1402,7 +1447,7 @@ checksum = "6eb8ec7724e4e524b2492b510e66957fe1a2c76c26a6975ec80823f2439da685"
dependencies = [
"darling_core",
"serde-rename-rule",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -1415,7 +1460,7 @@ dependencies = [
"proc-macro2",
"quote",
"serde-attributes",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -1432,7 +1477,7 @@ checksum = "4f1d362ca8fc9c3e3a7484440752472d68a6caa98f1ab81d99b5dfe517cec852"
dependencies = [
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -1465,7 +1510,7 @@ checksum = "b2acd6defeddb41eb60bb468f8825d0cfd0c2a76bc03bfd235b6a1dc4f6a1ad5"
dependencies = [
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -1565,6 +1610,16 @@ dependencies = [
"winapi",
]
[[package]]
name = "socket2"
version = "0.5.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ce305eb0b4296696835b71df73eb912e0f1ffd2556a501fcede6e0c50349191c"
dependencies = [
"libc",
"windows-sys 0.52.0",
]
[[package]]
name = "subprocess"
version = "0.2.9"
@@ -1587,18 +1642,29 @@ dependencies = [
]
[[package]]
name = "sysinfo"
version = "0.29.11"
name = "syn"
version = "2.0.66"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cd727fc423c2060f6c92d9534cef765c65a6ed3f428a03d7def74a8c4348e666"
checksum = "c42f3f41a2de00b01c0aaad383c5a45241efc8b2d1eda5661812fda5f3cdcff5"
dependencies = [
"proc-macro2",
"quote",
"unicode-ident",
]
[[package]]
name = "sysinfo"
version = "0.30.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "732ffa00f53e6b2af46208fba5718d9662a421049204e156328b66791ffa15ae"
dependencies = [
"cfg-if",
"core-foundation-sys",
"libc",
"ntapi 0.4.1",
"ntapi",
"once_cell",
"rayon",
"winapi",
"windows",
]
[[package]]
@@ -1662,7 +1728,7 @@ checksum = "aa32fd3f627f367fe16f893e2597ae3c05020f8bba2666a4e6ea73d377e5714b"
dependencies = [
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
]
[[package]]
@@ -1730,30 +1796,30 @@ checksum = "1f3ccbac311fea05f86f61904b462b55fb3df8837a366dfc601a0161d0532f20"
[[package]]
name = "tokio"
version = "1.17.0"
version = "1.38.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2af73ac49756f3f7c01172e34a23e5d0216f6c32333757c2c61feb2bbff5a5ee"
checksum = "ba4f4a02a7a80d6f274636f0aa95c7e383b912d41fe721a31f29e29698585a4a"
dependencies = [
"backtrace",
"bytes",
"libc",
"memchr",
"mio",
"num_cpus",
"pin-project-lite",
"socket2",
"socket2 0.5.7",
"tokio-macros",
"winapi",
"windows-sys 0.48.0",
]
[[package]]
name = "tokio-macros"
version = "1.7.0"
version = "2.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b557f72f448c511a979e2564e55d74e6c4432fc96ff4f6241bc6bded342643b7"
checksum = "5f5ae998a069d4b5aba8ee9dad856af7d520c3699e6159b185c2acd48155d39a"
dependencies = [
"proc-macro2",
"quote",
"syn",
"syn 2.0.66",
]
[[package]]
@@ -1828,7 +1894,7 @@ dependencies = [
"thiserror",
"tokio",
"tokio-vsock",
"windows-sys",
"windows-sys 0.48.0",
]
[[package]]
@@ -1858,6 +1924,12 @@ dependencies = [
"tempfile",
]
[[package]]
name = "unicode-ident"
version = "1.0.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3354b9ac3fae1ff6755cb6db53683adb661634f67557942dea4facebec0fee4b"
[[package]]
name = "unicode-segmentation"
version = "1.9.0"
@@ -1941,7 +2013,7 @@ dependencies = [
"log",
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
"wasm-bindgen-shared",
]
@@ -1963,7 +2035,7 @@ checksum = "7d94ac45fcf608c1f45ef53e748d35660f168490c10b23704c7779ab8f5c3048"
dependencies = [
"proc-macro2",
"quote",
"syn",
"syn 1.0.91",
"wasm-bindgen-backend",
"wasm-bindgen-shared",
]
@@ -2007,13 +2079,41 @@ version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"
[[package]]
name = "windows"
version = "0.52.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e48a53791691ab099e5e2ad123536d0fff50652600abaf43bbf952894110d0be"
dependencies = [
"windows-core",
"windows-targets 0.52.5",
]
[[package]]
name = "windows-core"
version = "0.52.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "33ab640c8d7e35bf8ba19b884ba838ceb4fba93a4e8c65a9059d08afcfc683d9"
dependencies = [
"windows-targets 0.52.5",
]
[[package]]
name = "windows-sys"
version = "0.48.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "677d2418bec65e3338edb076e806bc1ec15693c5d0104683f2efe857f61056a9"
dependencies = [
"windows-targets",
"windows-targets 0.48.5",
]
[[package]]
name = "windows-sys"
version = "0.52.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "282be5f36a8ce781fad8c8ae18fa3f9beff57ec1b52cb3de0789201425d9a33d"
dependencies = [
"windows-targets 0.52.5",
]
[[package]]
@@ -2022,13 +2122,29 @@ version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9a2fa6e2155d7247be68c096456083145c183cbbbc2764150dda45a87197940c"
dependencies = [
"windows_aarch64_gnullvm",
"windows_aarch64_msvc",
"windows_i686_gnu",
"windows_i686_msvc",
"windows_x86_64_gnu",
"windows_x86_64_gnullvm",
"windows_x86_64_msvc",
"windows_aarch64_gnullvm 0.48.5",
"windows_aarch64_msvc 0.48.5",
"windows_i686_gnu 0.48.5",
"windows_i686_msvc 0.48.5",
"windows_x86_64_gnu 0.48.5",
"windows_x86_64_gnullvm 0.48.5",
"windows_x86_64_msvc 0.48.5",
]
[[package]]
name = "windows-targets"
version = "0.52.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6f0713a46559409d202e70e28227288446bf7841d3211583a4b53e3f6d96e7eb"
dependencies = [
"windows_aarch64_gnullvm 0.52.5",
"windows_aarch64_msvc 0.52.5",
"windows_i686_gnu 0.52.5",
"windows_i686_gnullvm",
"windows_i686_msvc 0.52.5",
"windows_x86_64_gnu 0.52.5",
"windows_x86_64_gnullvm 0.52.5",
"windows_x86_64_msvc 0.52.5",
]
[[package]]
@@ -2037,42 +2153,90 @@ version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2b38e32f0abccf9987a4e3079dfb67dcd799fb61361e53e2882c3cbaf0d905d8"
[[package]]
name = "windows_aarch64_gnullvm"
version = "0.52.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7088eed71e8b8dda258ecc8bac5fb1153c5cffaf2578fc8ff5d61e23578d3263"
[[package]]
name = "windows_aarch64_msvc"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dc35310971f3b2dbbf3f0690a219f40e2d9afcf64f9ab7cc1be722937c26b4bc"
[[package]]
name = "windows_aarch64_msvc"
version = "0.52.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9985fd1504e250c615ca5f281c3f7a6da76213ebd5ccc9561496568a2752afb6"
[[package]]
name = "windows_i686_gnu"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a75915e7def60c94dcef72200b9a8e58e5091744960da64ec734a6c6e9b3743e"
[[package]]
name = "windows_i686_gnu"
version = "0.52.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "88ba073cf16d5372720ec942a8ccbf61626074c6d4dd2e745299726ce8b89670"
[[package]]
name = "windows_i686_gnullvm"
version = "0.52.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "87f4261229030a858f36b459e748ae97545d6f1ec60e5e0d6a3d32e0dc232ee9"
[[package]]
name = "windows_i686_msvc"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8f55c233f70c4b27f66c523580f78f1004e8b5a8b659e05a4eb49d4166cca406"
[[package]]
name = "windows_i686_msvc"
version = "0.52.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "db3c2bf3d13d5b658be73463284eaf12830ac9a26a90c717b7f771dfe97487bf"
[[package]]
name = "windows_x86_64_gnu"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "53d40abd2583d23e4718fddf1ebec84dbff8381c07cae67ff7768bbf19c6718e"
[[package]]
name = "windows_x86_64_gnu"
version = "0.52.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4e4246f76bdeff09eb48875a0fd3e2af6aada79d409d33011886d3e1581517d9"
[[package]]
name = "windows_x86_64_gnullvm"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0b7b52767868a23d5bab768e390dc5f5c55825b6d30b86c844ff2dc7414044cc"
[[package]]
name = "windows_x86_64_gnullvm"
version = "0.52.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "852298e482cd67c356ddd9570386e2862b5673c85bd5f88df9ab6802b334c596"
[[package]]
name = "windows_x86_64_msvc"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ed94fce61571a4006852b7389a063ab983c02eb1bb37b47f8272ce92d06d9538"
[[package]]
name = "windows_x86_64_msvc"
version = "0.52.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bec47e5bfd1bff0eeaf6d8b485cc1074891a197ab4225d504cb7a1ab88b02bf0"
[[package]]
name = "wyz"
version = "0.5.1"

View File

@@ -127,7 +127,7 @@ mod tests {
assert!(cap.is_fs_sharing_supported());
// test set hybrid-vsock support
cap.set(CapabilityBits::HybridVsockSupport);
cap.add(CapabilityBits::HybridVsockSupport);
assert!(cap.is_hybrid_vsock_supported());
// test append capabilities
cap.add(CapabilityBits::GuestMemoryProbe);

View File

@@ -88,3 +88,13 @@ pub const DEFAULT_CH_PCI_BRIDGES: u32 = 2;
pub const MAX_CH_PCI_BRIDGES: u32 = 5;
pub const MAX_CH_VCPUS: u32 = 256;
pub const MIN_CH_MEMORY_SIZE_MB: u32 = 64;
//Default configuration for firecracker
pub const DEFAULT_FIRECRACKER_ENTROPY_SOURCE: &str = "/dev/urandom";
pub const DEFAULT_FIRECRACKER_MEMORY_SIZE_MB: u32 = 128;
pub const DEFAULT_FIRECRACKER_MEMORY_SLOTS: u32 = 128;
pub const DEFAULT_FIRECRACKER_VCPUS: u32 = 1;
pub const DEFAULT_FIRECRACKER_GUEST_KERNEL_IMAGE: &str = "vmlinux";
pub const DEFAULT_FIRECRACKER_GUEST_KERNEL_PARAMS: &str = "";
pub const MAX_FIRECRACKER_VCPUS: u32 = 32;
pub const MIN_FIRECRACKER_MEMORY_SIZE_MB: u32 = 128;

View File

@@ -0,0 +1,116 @@
// Copyright (c) 2019-2021 Alibaba Cloud
// Copyright (c) 2022-2023 Nubificus LTD
//
// SPDX-License-Identifier: Apache-2.0
//
use std::io::Result;
use std::path::Path;
use std::sync::Arc;
use super::{default, register_hypervisor_plugin};
use crate::config::default::MAX_FIRECRACKER_VCPUS;
use crate::config::default::MIN_FIRECRACKER_MEMORY_SIZE_MB;
use crate::config::{ConfigPlugin, TomlConfig};
use crate::{eother, validate_path};
/// Hypervisor name for firecracker, used to index `TomlConfig::hypervisor`.
pub const HYPERVISOR_NAME_FIRECRACKER: &str = "firecracker";
/// Configuration information for firecracker.
#[derive(Default, Debug)]
pub struct FirecrackerConfig {}
impl FirecrackerConfig {
/// Create a new instance of `FirecrackerConfig`.
pub fn new() -> Self {
FirecrackerConfig {}
}
/// Register the firecracker plugin.
pub fn register(self) {
let plugin = Arc::new(self);
register_hypervisor_plugin(HYPERVISOR_NAME_FIRECRACKER, plugin);
}
}
impl ConfigPlugin for FirecrackerConfig {
fn get_max_cpus(&self) -> u32 {
MAX_FIRECRACKER_VCPUS
}
fn get_min_memory(&self) -> u32 {
MIN_FIRECRACKER_MEMORY_SIZE_MB
}
fn name(&self) -> &str {
HYPERVISOR_NAME_FIRECRACKER
}
/// Adjust the configuration information after loading from configuration file.
fn adjust_config(&self, conf: &mut TomlConfig) -> Result<()> {
if let Some(firecracker) = conf.hypervisor.get_mut(HYPERVISOR_NAME_FIRECRACKER) {
if firecracker.boot_info.kernel.is_empty() {
firecracker.boot_info.kernel =
default::DEFAULT_FIRECRACKER_GUEST_KERNEL_IMAGE.to_string();
}
if firecracker.boot_info.kernel_params.is_empty() {
firecracker.boot_info.kernel_params =
default::DEFAULT_FIRECRACKER_GUEST_KERNEL_PARAMS.to_string();
}
if firecracker.machine_info.entropy_source.is_empty() {
firecracker.machine_info.entropy_source =
default::DEFAULT_FIRECRACKER_ENTROPY_SOURCE.to_string();
}
if firecracker.memory_info.default_memory == 0 {
firecracker.memory_info.default_memory =
default::DEFAULT_FIRECRACKER_MEMORY_SIZE_MB;
}
}
Ok(())
}
/// Validate the configuration information.
fn validate(&self, conf: &TomlConfig) -> Result<()> {
if let Some(firecracker) = conf.hypervisor.get(HYPERVISOR_NAME_FIRECRACKER) {
if firecracker.path.is_empty() {
return Err(eother!("Firecracker path is empty"));
}
validate_path!(
firecracker.path,
"FIRECRACKER binary path `{}` is invalid: {}"
)?;
if firecracker.boot_info.kernel.is_empty() {
return Err(eother!("Guest kernel image for firecracker is empty"));
}
if firecracker.boot_info.image.is_empty() {
return Err(eother!(
"Both guest boot image and initrd for firecracker are empty"
));
}
if (firecracker.cpu_info.default_vcpus > 0
&& firecracker.cpu_info.default_vcpus as u32 > default::MAX_FIRECRACKER_VCPUS)
|| firecracker.cpu_info.default_maxvcpus > default::MAX_FIRECRACKER_VCPUS
{
return Err(eother!(
"Firecracker hypervisor can not support {} vCPUs",
firecracker.cpu_info.default_maxvcpus
));
}
if firecracker.memory_info.default_memory < MIN_FIRECRACKER_MEMORY_SIZE_MB {
return Err(eother!(
"Firecracker hypervisor has minimal memory limitation {}",
MIN_FIRECRACKER_MEMORY_SIZE_MB
));
}
}
Ok(())
}
}

View File

@@ -59,6 +59,9 @@ pub const VIRTIO_SCSI: &str = "virtio-scsi";
/// Virtual PMEM device driver.
pub const VIRTIO_PMEM: &str = "virtio-pmem";
mod firecracker;
pub use self::firecracker::{FirecrackerConfig, HYPERVISOR_NAME_FIRECRACKER};
const VIRTIO_9P: &str = "virtio-9p";
const VIRTIO_FS: &str = "virtio-fs";
const VIRTIO_FS_INLINE: &str = "inline-virtio-fs";
@@ -411,6 +414,20 @@ pub struct DebugInfo {
/// much disk space.
#[serde(default)]
pub guest_memory_dump_path: String,
/// This option allows to add a debug monitor socket when `enable_debug = true`
/// WARNING: Anyone with access to the monitor socket can take full control of
/// Qemu. This is for debugging purpose only and must *NEVER* be used in
/// production.
/// Valid values are :
/// - "hmp"
/// - "qmp"
/// - "qmp-pretty" (same as "qmp" with pretty json formatting)
/// If set to the empty string "", no debug monitor socket is added. This is
/// the default.
/// dbg_monitor_socket = "hmp"
#[serde(default)]
pub dbg_monitor_socket: String,
}
impl DebugInfo {
@@ -516,6 +533,7 @@ impl TopologyConfigInfo {
HYPERVISOR_NAME_QEMU,
HYPERVISOR_NAME_CH,
HYPERVISOR_NAME_DRAGONBALL,
HYPERVISOR_NAME_FIRECRACKER,
];
let hypervisor_name = toml_config.runtime.hypervisor_name.as_str();
if !hypervisor_names.contains(&hypervisor_name) {

View File

@@ -25,8 +25,8 @@ pub mod hypervisor;
pub use self::agent::Agent;
use self::default::DEFAULT_AGENT_DBG_CONSOLE_PORT;
pub use self::hypervisor::{
BootInfo, CloudHypervisorConfig, DragonballConfig, Hypervisor, QemuConfig,
HYPERVISOR_NAME_DRAGONBALL, HYPERVISOR_NAME_QEMU,
BootInfo, CloudHypervisorConfig, DragonballConfig, FirecrackerConfig, Hypervisor, QemuConfig,
HYPERVISOR_NAME_DRAGONBALL, HYPERVISOR_NAME_FIRECRACKER, HYPERVISOR_NAME_QEMU,
};
mod runtime;

View File

@@ -130,7 +130,11 @@ fn count_files<P: AsRef<Path>>(path: P, limit: i32) -> std::io::Result<i32> {
let file = entry?;
let p = file.path();
if p.is_dir() {
num_files += count_files(&p, limit)?;
let inc = count_files(&p, limit - num_files)?;
if inc == -1 {
return Ok(-1);
}
num_files += inc;
} else {
num_files += 1;
}
@@ -165,6 +169,40 @@ mod tests {
use std::fs;
use test_utils::skip_if_not_root;
#[test]
fn test_count_files() {
let limit = 8;
let test_tmp_dir = tempfile::tempdir().expect("failed to create tempdir");
let work_path = test_tmp_dir.path().join("work");
let result = fs::create_dir_all(&work_path);
assert!(result.is_ok());
let origin_dir = work_path.join("origin_dir");
let result = fs::create_dir_all(&origin_dir);
assert!(result.is_ok());
for n in 0..limit {
let tmp_file = origin_dir.join(format!("file{}", n));
let res = fs::File::create(tmp_file);
assert!(res.is_ok());
}
let symlink_origin_dir = work_path.join("symlink_origin_dir");
let result = std::os::unix::fs::symlink(&origin_dir, &symlink_origin_dir);
assert!(result.is_ok());
for n in 0..2 {
let tmp_file = work_path.join(format!("file{}", n));
let res = fs::File::create(tmp_file);
assert!(res.is_ok());
}
let count = count_files(&work_path, limit).unwrap_or(0);
assert_eq!(count, -1);
let count = count_files(&origin_dir, limit).unwrap_or(0);
assert_eq!(count, limit);
}
#[test]
fn test_is_watchable_mount() {
skip_if_not_root!();

View File

@@ -84,6 +84,7 @@ sandbox_bind_mounts=["/proc/self"]
vfio_mode="vfio"
experimental=["a", "b"]
enable_pprof = true
name="virt-container"
hypervisor_name = "qemu"
agent_name = "agent0"

View File

@@ -83,6 +83,7 @@ sandbox_bind_mounts=["/proc/self"]
vfio_mode="vfio"
experimental=["a", "b"]
enable_pprof = true
name="virt-container"
hypervisor_name = "qemu"
agent_name = "agent0"

View File

@@ -198,13 +198,34 @@ fn real_main() -> Result<(), std::io::Error> {
// generate async
#[cfg(feature = "async")]
{
codegen("src", &["protos/agent.proto", "protos/health.proto"], true)?;
codegen(
"src",
&[
"protos/agent.proto",
"protos/health.proto",
"protos/sealed_secret.proto",
],
true,
)?;
fs::rename("src/agent_ttrpc.rs", "src/agent_ttrpc_async.rs")?;
fs::rename("src/health_ttrpc.rs", "src/health_ttrpc_async.rs")?;
fs::rename(
"src/sealed_secret_ttrpc.rs",
"src/sealed_secret_ttrpc_async.rs",
)?;
}
codegen("src", &["protos/agent.proto", "protos/health.proto"], false)?;
codegen(
"src",
&[
"protos/agent.proto",
"protos/health.proto",
"protos/sealed_secret.proto",
],
false,
)?;
// There is a message named 'Box' in oci.proto
// so there is a struct named 'Box', we should replace Box<Self> to ::std::boxed::Box<Self>

View File

@@ -0,0 +1,21 @@
//
// Copyright (c) 2024 IBM
//
// SPDX-License-Identifier: Apache-2.0
//
syntax = "proto3";
package api;
message UnsealSecretInput {
bytes secret = 1;
}
message UnsealSecretOutput {
bytes plaintext = 1;
}
service SealedSecretService {
rpc UnsealSecret(UnsealSecretInput) returns (UnsealSecretOutput) {};
}

View File

@@ -27,3 +27,9 @@ pub use serde_config::{
deserialize_enum_or_unknown, deserialize_message_field, serialize_enum_or_unknown,
serialize_message_field,
};
pub mod sealed_secret;
pub mod sealed_secret_ttrpc;
#[cfg(feature = "async")]
pub mod sealed_secret_ttrpc_async;

View File

@@ -14,12 +14,12 @@ edition = "2018"
[dependencies]
anyhow = "^1.0"
nix = "0.24.0"
tokio = { version = "1.8.0", features = ["rt-multi-thread"] }
tokio = { version = "1.38.0", features = ["rt-multi-thread"] }
hyper = { version = "0.14.20", features = ["stream", "server", "http1"] }
hyperlocal = "0.8"
kata-types = { path = "../kata-types" }
kata-sys-util = {path = "../kata-sys-util" }
kata-sys-util = { path = "../kata-sys-util" }
[dev-dependencies]
tempfile = "3.2.0"
test-utils = {path = "../test-utils"}
test-utils = { path = "../test-utils" }

View File

@@ -37,6 +37,9 @@ fn get_uds_with_sid(short_id: &str, path: &str) -> Result<String> {
return Ok(format!("unix://{}", p.display()));
}
let _ = fs::create_dir_all(kata_run_path.join(short_id))
.context(format!("failed to create directory {:?}", kata_run_path.join(short_id)));
let target_ids: Vec<String> = fs::read_dir(&kata_run_path)?
.filter_map(|e| {
let x = e.ok()?.file_name().to_string_lossy().into_owned();

View File

@@ -185,7 +185,7 @@ dependencies = [
"polling",
"rustix 0.37.23",
"slab",
"socket2",
"socket2 0.4.9",
"waker-fn",
]
@@ -1589,7 +1589,7 @@ dependencies = [
"httpdate",
"itoa",
"pin-project-lite",
"socket2",
"socket2 0.4.9",
"tokio",
"tower-service",
"tracing",
@@ -1635,6 +1635,8 @@ dependencies = [
"dragonball",
"futures 0.3.28",
"go-flag",
"hyper",
"hyperlocal",
"hypervisor",
"kata-sys-util",
"kata-types",
@@ -1644,6 +1646,9 @@ dependencies = [
"nix 0.24.3",
"path-clean",
"persist",
"qapi",
"qapi-qmp",
"qapi-spec",
"rand 0.8.5",
"rust-ini",
"safe-path 0.1.0 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -2034,9 +2039,9 @@ dependencies = [
[[package]]
name = "mio"
version = "0.8.8"
version = "0.8.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "927a765cd3fc26206e66b296465fa9d3e5ab003e651c1b3c060e7956d96b19d2"
checksum = "a4a650543ca06a924e8b371db273b2756685faae30f8487da1b56505a8f78b0c"
dependencies = [
"libc",
"log",
@@ -2699,9 +2704,9 @@ dependencies = [
[[package]]
name = "pin-project-lite"
version = "0.2.10"
version = "0.2.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4c40d25201921e5ff0c862a505c6557ea88568a4e3ace775ab55e93f2f4f9d57"
checksum = "bda66fc9667c18cb2758a2ac84d1167245054bcf85d5d1aaa6923f45801bdd02"
[[package]]
name = "pin-utils"
@@ -2959,6 +2964,65 @@ dependencies = [
"ttrpc-codegen",
]
[[package]]
name = "qapi"
version = "0.14.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c6412bdd014ebee03ddbbe79ac03a0b622cce4d80ba45254f6357c847f06fa38"
dependencies = [
"bytes",
"futures 0.3.28",
"log",
"memchr",
"qapi-qmp",
"qapi-spec",
"serde",
"serde_json",
"tokio",
"tokio-util",
]
[[package]]
name = "qapi-codegen"
version = "0.11.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9ba4de731473de4c8bd508ddb38a9049e999b8a7429f3c052ba8735a178ff68c"
dependencies = [
"qapi-parser",
]
[[package]]
name = "qapi-parser"
version = "0.10.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "80044db145aa2953ef5803d0376dcbca50f2763242547e856b7f37507adca677"
dependencies = [
"serde",
"serde_json",
]
[[package]]
name = "qapi-qmp"
version = "0.14.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e8b944db7e544d2fa97595e9a000a6ba5c62c426fa185e7e00aabe4b5640b538"
dependencies = [
"qapi-codegen",
"qapi-spec",
"serde",
]
[[package]]
name = "qapi-spec"
version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b360919a24ea5fc02fa762cb01bd8f43b643fee51c585f763257773b4dc5a9e8"
dependencies = [
"base64 0.13.1",
"serde",
"serde_json",
]
[[package]]
name = "quote"
version = "1.0.35"
@@ -3825,6 +3889,16 @@ dependencies = [
"winapi",
]
[[package]]
name = "socket2"
version = "0.5.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ce305eb0b4296696835b71df73eb912e0f1ffd2556a501fcede6e0c50349191c"
dependencies = [
"libc",
"windows-sys 0.52.0",
]
[[package]]
name = "static_assertions"
version = "1.1.0"
@@ -4099,11 +4173,10 @@ checksum = "1f3ccbac311fea05f86f61904b462b55fb3df8837a366dfc601a0161d0532f20"
[[package]]
name = "tokio"
version = "1.29.1"
version = "1.38.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "532826ff75199d5833b9d2c5fe410f29235e25704ee5f0ef599fb51c21f4a4da"
checksum = "ba4f4a02a7a80d6f274636f0aa95c7e383b912d41fe721a31f29e29698585a4a"
dependencies = [
"autocfg",
"backtrace",
"bytes",
"libc",
@@ -4112,16 +4185,16 @@ dependencies = [
"parking_lot 0.12.1",
"pin-project-lite",
"signal-hook-registry",
"socket2",
"socket2 0.5.7",
"tokio-macros",
"windows-sys 0.48.0",
]
[[package]]
name = "tokio-macros"
version = "2.1.0"
version = "2.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "630bdcf245f78637c13ec01ffae6187cca34625e8c63150d424b59e55af2675e"
checksum = "5f5ae998a069d4b5aba8ee9dad856af7d520c3699e6159b185c2acd48155d39a"
dependencies = [
"proc-macro2",
"quote",

View File

@@ -109,6 +109,12 @@ ROOTFSTYPE_XFS := \"xfs\"
ROOTFSTYPE_EROFS := \"erofs\"
DEFROOTFSTYPE := $(ROOTFSTYPE_EXT4)
FCBINDIR := $(PREFIXDEPS)/bin
FCPATH = $(FCBINDIR)/$(FCCMD)
FCVALIDHYPERVISORPATHS := [\"$(FCPATH)\"]
FCJAILERPATH = $(FCBINDIR)/$(FCJAILERCMD)
FCVALIDJAILERPATHS = [\"$(FCJAILERPATH)\"]
PKGLIBEXECDIR := $(LIBEXECDIR)/$(PROJECT_DIR)
FIRMWAREPATH :=
FIRMWAREVOLUMEPATH :=
@@ -164,8 +170,11 @@ DEFMSIZE9P := 8192
DEFVFIOMODE := guest-kernel
##VAR DEFSANDBOXCGROUPONLY=<bool> Default cgroup model
DEFSANDBOXCGROUPONLY ?= false
DEFSANDBOXCGROUPONLY_DB ?= true
DEFSANDBOXCGROUPONLY_FC ?= true
DEFSTATICRESOURCEMGMT ?= false
DEFSTATICRESOURCEMGMT_DB ?= false
DEFSTATICRESOURCEMGMT_FC ?= true
DEFBINDMOUNTS := []
DEFDANCONF := /run/kata-containers/dans
SED = sed
@@ -208,18 +217,18 @@ ifneq (,$(DBCMD))
SYSCONFIG_PATHS += $(SYSCONFIG_DB)
CONFIGS += $(CONFIG_DB)
# dragonball-specific options (all should be suffixed by "_DB")
VMROOTFSDRIVER_DB := virtio-blk-pci
VMROOTFSDRIVER_DB := virtio-blk-pci
DEFMAXVCPUS_DB := 1
DEFBLOCKSTORAGEDRIVER_DB := virtio-blk-mmio
DEFNETWORKMODEL_DB := tcfilter
KERNELPARAMS_DB = console=ttyS1 agent.log_vport=1025
KERNELTYPE_DB = uncompressed
KERNELTYPE_DB = uncompressed
KERNEL_NAME_DB = $(call MAKE_KERNEL_NAME_DB,$(KERNELTYPE_DB))
KERNELPATH_DB = $(KERNELDIR)/$(KERNEL_NAME_DB)
DEFSANDBOXCGROUPONLY = true
DEFSANDBOXCGROUPONLY_DB = true
RUNTIMENAME := virt_container
PIPESIZE := 1
DBSHAREDFS := inline-virtio-fs
DBSHAREDFS := inline-virtio-fs
endif
ifneq (,$(CLHCMD))
@@ -244,6 +253,9 @@ ifneq (,$(CLHCMD))
KERNEL_NAME_CLH = $(call MAKE_KERNEL_NAME,$(KERNELTYPE_CLH))
KERNELPATH_CLH = $(KERNELDIR)/$(KERNEL_NAME_CLH)
VMROOTFSDRIVER_CLH := virtio-pmem
DEFSTATICRESOURCEMGMT = true
DEFSANDBOXCGROUPONLY = true
endif
ifneq (,$(QEMUCMD))
@@ -288,6 +300,28 @@ endif
DEFSECCOMPSANDBOXPARAM := on,obsolete=deny,spawn=deny,resourcecontrol=deny
DEFGUESTSELINUXLABEL := system_u:system_r:container_t
endif
ifneq (,$(FCCMD))
KNOWN_HYPERVISORS += $(HYPERVISOR_FC)
CONFIG_FILE_FC = configuration-rs-fc.toml
CONFIG_FC = config/$(CONFIG_FILE_FC)
CONFIG_FC_IN = $(CONFIG_FC).in
CONFIG_PATH_FC = $(abspath $(CONFDIR)/$(CONFIG_FILE_FC))
CONFIG_PATHS += $(CONFIG_PATH_FC)
SYSCONFIG_FC = $(abspath $(SYSCONFDIR)/$(CONFIG_FILE_FC))
SYSCONFIG_PATHS += $(SYSCONFIG_FC)
CONFIGS += $(CONFIG_FC)
# firecracker-specific options (all should be suffixed by "_FC")
DEFBLOCKSTORAGEDRIVER_FC := virtio-blk-mmio
DEFMAXMEMSZ_FC := 2048
DEFNETWORKMODEL_FC := tcfilter
KERNELPARAMS = console=ttyS0 agent.log_vport=1025
KERNELTYPE_FC = uncompressed
KERNEL_NAME_FC = $(call MAKE_KERNEL_NAME_FC,$(KERNELTYPE_FC))
KERNELPATH_FC = $(KERNELDIR)/$(KERNEL_NAME_FC)
DEFSANDBOXCGROUPONLY_FC = true
RUNTIMENAME := virt_container
DEFSTATICRESOURCEMGMT_FC ?= true
endif
ifeq ($(DEFAULT_HYPERVISOR),$(HYPERVISOR_DB))
DEFAULT_HYPERVISOR_CONFIG = $(CONFIG_FILE_DB)
@@ -296,16 +330,21 @@ endif
ifeq ($(DEFAULT_HYPERVISOR),$(HYPERVISOR_QEMU))
DEFAULT_HYPERVISOR_CONFIG = $(CONFIG_FILE_QEMU)
endif
ifeq ($(DEFAULT_HYPERVISOR),$(HYPERVISOR_FC))
DEFAULT_HYPERVISOR_CONFIG = $(CONFIG_FILE_FC)
endif
# list of variables the user may wish to override
USER_VARS += ARCH
USER_VARS += BINDIR
USER_VARS += CONFIG_DB_IN
USER_VARS += CONFIG_FC_IN
USER_VARS += CONFIG_PATH
USER_VARS += CONFIG_QEMU_IN
USER_VARS += DESTDIR
USER_VARS += DEFAULT_HYPERVISOR
USER_VARS += DBCMD
USER_VARS += DBCTLCMD
USER_VARS += FCCTLCMD
USER_VARS += DBPATH
USER_VARS += DBVALIDHYPERVISORPATHS
USER_VARS += DBCTLPATH
@@ -316,6 +355,13 @@ USER_VARS += QEMUPATH
USER_VARS += QEMUVALIDHYPERVISORPATHS
USER_VARS += FIRMWAREPATH_CLH
USER_VARS += KERNELPATH_CLH
USER_VARS += FCCMD
USER_VARS += FCPATH
USER_VARS += FCVALIDHYPERVISORPATHS
USER_VARS += FCJAILERPATH
USER_VARS += FCVALIDJAILERPATHS
USER_VARS += FCVALIDJAILERPATHS
USER_VARS += DEFMAXMEMSZ_FC
USER_VARS += SYSCONFIG
USER_VARS += IMAGENAME
USER_VARS += IMAGEPATH
@@ -329,6 +375,8 @@ USER_VARS += KERNELDIR
USER_VARS += KERNELTYPE
USER_VARS += KERNELPATH_DB
USER_VARS += KERNELPATH_QEMU
USER_VARS += KERNELPATH_FC
USER_VARS += KERNELPATH
USER_VARS += KERNELVIRTIOFSPATH
USER_VARS += FIRMWAREPATH
USER_VARS += FIRMWAREVOLUMEPATH
@@ -365,6 +413,7 @@ USER_VARS += DEFBRIDGES
USER_VARS += DEFNETWORKMODEL_DB
USER_VARS += DEFNETWORKMODEL_CLH
USER_VARS += DEFNETWORKMODEL_QEMU
USER_VARS += DEFNETWORKMODEL_FC
USER_VARS += DEFDISABLEGUESTEMPTYDIR
USER_VARS += DEFDISABLEGUESTSECCOMP
USER_VARS += DEFDISABLESELINUX
@@ -374,6 +423,7 @@ USER_VARS += DEFDISABLEBLOCK
USER_VARS += DEFBLOCKSTORAGEDRIVER_DB
USER_VARS += DEFBLOCKSTORAGEDRIVER_QEMU
USER_VARS += DEFBLOCKDEVICEAIO_QEMU
USER_VARS += DEFBLOCKSTORAGEDRIVER_FC
USER_VARS += DEFSHAREDFS_CLH_VIRTIOFS
USER_VARS += DEFSHAREDFS_QEMU_VIRTIOFS
USER_VARS += DEFVIRTIOFSDAEMON
@@ -396,8 +446,11 @@ USER_VARS += DEFENTROPYSOURCE
USER_VARS += DEFVALIDENTROPYSOURCES
USER_VARS += DEFSANDBOXCGROUPONLY
USER_VARS += DEFSANDBOXCGROUPONLY_QEMU
USER_VARS += DEFSANDBOXCGROUPONLY_DB
USER_VARS += DEFSANDBOXCGROUPONLY_FC
USER_VARS += DEFSTATICRESOURCEMGMT
USER_VARS += DEFSTATICRESOURCEMGMT_DB
USER_VARS += DEFSTATICRESOURCEMGMT_FC
USER_VARS += DEFBINDMOUNTS
USER_VARS += DEFVFIOMODE
USER_VARS += BUILDFLAGS
@@ -405,6 +458,7 @@ USER_VARS += RUNTIMENAME
USER_VARS += HYPERVISOR_DB
USER_VARS += HYPERVISOR_CLH
USER_VARS += HYPERVISOR_QEMU
USER_VARS += HYPERVISOR_FC
USER_VARS += PIPESIZE
USER_VARS += DBSHAREDFS
USER_VARS += KATA_INSTALL_GROUP
@@ -417,7 +471,7 @@ SOURCES := \
Cargo.toml
VERSION_FILE := ./VERSION
VERSION := $(shell grep -v ^\# $(VERSION_FILE))
VERSION := $(shell grep -v ^\# $(VERSION_FILE) 2>/dev/null || echo "unknown")
COMMIT_NO := $(shell git rev-parse HEAD 2>/dev/null || true)
COMMIT := $(if $(shell git status --porcelain --untracked-files=no 2>/dev/null || true),${COMMIT_NO}-dirty,${COMMIT_NO})
COMMIT_MSG = $(if $(COMMIT),$(COMMIT),unknown)
@@ -442,6 +496,7 @@ RUNTIME_VERSION=$(VERSION)
GENERATED_VARS = \
VERSION \
CONFIG_DB_IN \
CONFIG_FC_IN \
$(USER_VARS)
@@ -483,6 +538,9 @@ endef
define MAKE_KERNEL_NAME_DB
$(if $(findstring uncompressed,$1),vmlinux-dragonball-experimental.container,vmlinuz-dragonball-experimental.container)
endef
define MAKE_KERNEL_NAME_FC
$(if $(findstring uncompressed,$1),vmlinux.container,vmlinuz.container)
endef
# Returns the name of the kernel file to use based on the provided KERNELTYPE.
# # $1 : KERNELTYPE (compressed or uncompressed)

View File

@@ -13,3 +13,5 @@ QEMUCMD := qemu-system-aarch64
# dragonball binary name
DBCMD := dragonball
FCCMD := firecracker
FCJAILERCMD := jailer

View File

@@ -16,3 +16,7 @@ DBCMD := dragonball
# cloud-hypervisor binary name
CLHCMD := cloud-hypervisor
# firecracker binary (vmm and jailer)
FCCMD := firecracker
FCJAILERCMD := jailer

View File

@@ -226,9 +226,18 @@ container_pipe_size=@PIPESIZE@
#debug_console_enabled = true
# Agent connection dialing timeout value in seconds
# (default: 45)
dial_timeout = 45
# Agent dial timeout in millisecond.
# (default: 10)
#dial_timeout_ms = 10
# Agent reconnect timeout in millisecond.
# Retry times = reconnect_timeout_ms / dial_timeout_ms (default: 300)
# If you find pod cannot connect to the agent when starting, please
# consider increasing this value to increase the retry times.
# You'd better not change the value of dial_timeout_ms, unless you have an
# idea of what you are doing.
# (default: 3000)
#reconnect_timeout_ms = 3000
[runtime]
# If enabled, the runtime will log additional debug messages to the

View File

@@ -239,9 +239,18 @@ container_pipe_size=@PIPESIZE@
#debug_console_enabled = true
# Agent connection dialing timeout value in seconds
# (default: 45)
dial_timeout = 45
# Agent dial timeout in millisecond.
# (default: 10)
#dial_timeout_ms = 10
# Agent reconnect timeout in millisecond.
# Retry times = reconnect_timeout_ms / dial_timeout_ms (default: 300)
# If you find pod cannot connect to the agent when starting, please
# consider increasing this value to increase the retry times.
# You'd better not change the value of dial_timeout_ms, unless you have an
# idea of what you are doing.
# (default: 3000)
#reconnect_timeout_ms = 3000
[runtime]
# If enabled, the runtime will log additional debug messages to the
@@ -332,7 +341,7 @@ disable_guest_seccomp=@DEFDISABLEGUESTSECCOMP@
# The sandbox cgroup path is the parent cgroup of a container with the PodSandbox annotation.
# The sandbox cgroup is constrained if there is no container type annotation.
# See: https://pkg.go.dev/github.com/kata-containers/kata-containers/src/runtime/virtcontainers#ContainerType
sandbox_cgroup_only=@DEFSANDBOXCGROUPONLY@
sandbox_cgroup_only=@DEFSANDBOXCGROUPONLY_DB@
# Enabled experimental feature list, format: ["a", "b"].
# Experimental features are features not stable enough for production,

View File

@@ -346,7 +346,7 @@ pflashes = []
#
# If set to the empty string "", no extra monitor socket is added. This is
# the default.
#extra_monitor_socket = hmp
#extra_monitor_socket = "hmp"
# Disable the customizations done in the runtime when it detects
# that it is running on top a VMM. This will result in the runtime
@@ -567,9 +567,18 @@ kernel_modules=[]
#debug_console_enabled = true
# Agent connection dialing timeout value in seconds
# (default: 45)
dial_timeout = 45
# Agent dial timeout in millisecond.
# (default: 10)
#dial_timeout_ms = 10
# Agent reconnect timeout in millisecond.
# Retry times = reconnect_timeout_ms / dial_timeout_ms (default: 300)
# If you find pod cannot connect to the agent when starting, please
# consider increasing this value to increase the retry times.
# You'd better not change the value of dial_timeout_ms, unless you have an
# idea of what you are doing.
# (default: 3000)
#reconnect_timeout_ms = 3000
[runtime]
# If enabled, the runtime will log additional debug messages to the

View File

@@ -0,0 +1,373 @@
# Copyright (c) 2017-2023 Intel Corporation
# Copyright (c) Adobe Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# XXX: WARNING: this file is auto-generated.
# XXX:
# XXX: Source file: "@CONFIG_FC_IN@"
# XXX: Project:
# XXX: Name: @PROJECT_NAME@
# XXX: Type: @PROJECT_TYPE@
[hypervisor.firecracker]
path = "@FCPATH@"
kernel = "@KERNELPATH_FC@"
image = "@IMAGEPATH@"
rootfs_type=@DEFROOTFSTYPE@
# List of valid annotation names for the hypervisor
# Each member of the list is a regular expression, which is the base name
# of the annotation, e.g. "path" for io.katacontainers.config.hypervisor.path"
enable_annotations = @DEFENABLEANNOTATIONS@
# List of valid annotations values for the hypervisor
# Each member of the list is a path pattern as described by glob(3).
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @FCVALIDHYPERVISORPATHS@
valid_hypervisor_paths = @FCVALIDHYPERVISORPATHS@
# Path for the jailer specific to firecracker
# If the jailer path is not set kata will launch firecracker
# without a jail. If the jailer is set firecracker will be
# launched in a jailed enviornment created by the jailer
#jailer_path = "@FCJAILERPATH@"
# List of valid jailer path values for the hypervisor
# Each member of the list can be a regular expression
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @FCVALIDJAILERPATHS@
valid_jailer_paths = @FCVALIDJAILERPATHS@
# Optional space-separated list of options to pass to the guest kernel.
# For example, use `kernel_params = "vsyscall=emulate"` if you are having
# trouble running pre-2.15 glibc.
#
# WARNING: - any parameter specified here will take priority over the default
# parameter value of the same name used to start the virtual machine.
# Do not set values here unless you understand the impact of doing so as you
# may stop the virtual machine from booting.
# To see the list of default parameters, enable hypervisor debug, create a
# container and look for 'default-kernel-parameters' log entries.
kernel_params = "@KERNELPARAMS@"
# Default number of vCPUs per SB/VM:
# unspecified or 0 --> will be set to @DEFVCPUS@
# < 0 --> will be set to the actual number of physical cores
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores --> will be set to the actual number of physical cores
default_vcpus = 1
# Default maximum number of vCPUs per SB/VM:
# unspecified or == 0 --> will be set to the actual number of physical cores or to the maximum number
# of vCPUs supported by KVM if that number is exceeded
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores --> will be set to the actual number of physical cores or to the maximum number
# of vCPUs supported by KVM if that number is exceeded
# WARNING: Depending of the architecture, the maximum number of vCPUs supported by KVM is used when
# the actual number of physical cores is greater than it.
# WARNING: Be aware that this value impacts the virtual machine's memory footprint and CPU
# the hotplug functionality. For example, `default_maxvcpus = 240` specifies that until 240 vCPUs
# can be added to a SB/VM, but the memory footprint will be big. Another example, with
# `default_maxvcpus = 8` the memory footprint will be small, but 8 will be the maximum number of
# vCPUs supported by the SB/VM. In general, we recommend that you do not edit this variable,
# unless you know what are you doing.
# NOTICE: on arm platform with gicv2 interrupt controller, set it to 8.
default_maxvcpus = @DEFMAXVCPUS@
# Bridges can be used to hot plug devices.
# Limitations:
# * Currently only pci bridges are supported
# * Until 30 devices per bridge can be hot plugged.
# * Until 5 PCI bridges can be cold plugged per VM.
# This limitation could be a bug in the kernel
# Default number of bridges per SB/VM:
# unspecified or 0 --> will be set to @DEFBRIDGES@
# > 1 <= 5 --> will be set to the specified number
# > 5 --> will be set to 5
default_bridges = @DEFBRIDGES@
# Default memory size in MiB for SB/VM.
# If unspecified then it will be set @DEFMEMSZ@ MiB.
default_memory = @DEFMEMSZ@
#
# Default memory slots per SB/VM.
# If unspecified then it will be set @DEFMEMSLOTS@.
# This is will determine the times that memory will be hotadded to sandbox/VM.
memory_slots = @DEFMEMSLOTS@
# The size in MiB will be plused to max memory of hypervisor.
# It is the memory address space for the NVDIMM devie.
# If set block storage driver (block_device_driver) to "nvdimm",
# should set memory_offset to the size of block device.
# Default 0
#memory_offset = 0
# Default maximum memory in MiB per SB / VM
# unspecified or == 0 --> will be set to the actual amount of physical RAM
# > 0 <= amount of physical RAM --> will be set to the specified number
# > amount of physical RAM --> will be set to the actual amount of physical RAM
default_maxmemory = @DEFMAXMEMSZ_FC@
# Block storage driver to be used for the hypervisor in case the container
# rootfs is backed by a block device. This is virtio-scsi, virtio-blk
# or nvdimm.
block_device_driver = "@DEFBLOCKSTORAGEDRIVER_FC@"
# Specifies cache-related options will be set to block devices or not.
# Default false
#block_device_cache_set = true
# Specifies cache-related options for block devices.
# Denotes whether use of O_DIRECT (bypass the host page cache) is enabled.
# Default false
#block_device_cache_direct = true
# Specifies cache-related options for block devices.
# Denotes whether flush requests for the device are ignored.
# Default false
#block_device_cache_noflush = true
# Enable pre allocation of VM RAM, default false
# Enabling this will result in lower container density
# as all of the memory will be allocated and locked
# This is useful when you want to reserve all the memory
# upfront or in the cases where you want memory latencies
# to be very predictable
# Default false
#enable_mem_prealloc = true
# Enable huge pages for VM RAM, default false
# Enabling this will result in the VM memory
# being allocated using huge pages.
# This is useful when you want to use vhost-user network
# stacks within the container. This will automatically
# result in memory pre allocation
#enable_hugepages = true
# Enable vIOMMU, default false
# Enabling this will result in the VM having a vIOMMU device
# This will also add the following options to the kernel's
# command line: intel_iommu=on,iommu=pt
#enable_iommu = true
# This option changes the default hypervisor and kernel parameters
# to enable debug output where available.
#
# Default false
#enable_debug = true
# Disable the customizations done in the runtime when it detects
# that it is running on top a VMM. This will result in the runtime
# behaving as it would when running on bare metal.
#
#disable_nesting_checks = true
# This is the msize used for 9p shares. It is the number of bytes
# used for 9p packet payload.
#msize_9p = @DEFMSIZE9P@
# VFIO devices are hotplugged on a bridge by default.
# Enable hotplugging on root bus. This may be required for devices with
# a large PCI bar, as this is a current limitation with hotplugging on
# a bridge.
# Default false
#hotplug_vfio_on_root_bus = true
#
# Default entropy source.
# The path to a host source of entropy (including a real hardware RNG)
# /dev/urandom and /dev/random are two main options.
# Be aware that /dev/random is a blocking source of entropy. If the host
# runs out of entropy, the VMs boot time will increase leading to get startup
# timeouts.
# The source of entropy /dev/urandom is non-blocking and provides a
# generally acceptable source of entropy. It should work well for pretty much
# all practical purposes.
#entropy_source= "@DEFENTROPYSOURCE@"
# List of valid annotations values for entropy_source
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @DEFVALIDENTROPYSOURCES@
valid_entropy_sources = @DEFVALIDENTROPYSOURCES@
# Path to OCI hook binaries in the *guest rootfs*.
# This does not affect host-side hooks which must instead be added to
# the OCI spec passed to the runtime.
#
# You can create a rootfs with hooks by customizing the osbuilder scripts:
# https://github.com/kata-containers/kata-containers/tree/main/tools/osbuilder
#
# Hooks must be stored in a subdirectory of guest_hook_path according to their
# hook type, i.e. "guest_hook_path/{prestart,poststart,poststop}".
# The agent will scan these directories for executable files and add them, in
# lexicographical order, to the lifecycle of the guest container.
# Hooks are executed in the runtime namespace of the guest. See the official documentation:
# https://github.com/opencontainers/runtime-spec/blob/v1.0.1/config.md#posix-platform-hooks
# Warnings will be logged if any error is encountered will scanning for hooks,
# but it will not abort container execution.
#guest_hook_path = "/usr/share/oci/hooks"
#
# Use rx Rate Limiter to control network I/O inbound bandwidth(size in bits/sec for SB/VM).
# In Firecracker, it provides a built-in rate limiter, which is based on TBF(Token Bucket Filter)
# queueing discipline.
# Default 0-sized value means unlimited rate.
#rx_rate_limiter_max_rate = 0
# Use tx Rate Limiter to control network I/O outbound bandwidth(size in bits/sec for SB/VM).
# In Firecracker, it provides a built-in rate limiter, which is based on TBF(Token Bucket Filter)
# queueing discipline.
# Default 0-sized value means unlimited rate.
#tx_rate_limiter_max_rate = 0
# disable applying SELinux on the VMM process (default false)
disable_selinux=@DEFDISABLESELINUX@
[factory]
# VM templating support. Once enabled, new VMs are created from template
# using vm cloning. They will share the same initial kernel, initramfs and
# agent memory by mapping it readonly. It helps speeding up new container
# creation and saves a lot of memory if there are many kata containers running
# on the same host.
#
# When disabled, new VMs are created from scratch.
#
# Note: Requires "initrd=" to be set ("image=" is not supported).
#
# Default false
#enable_template = true
[agent.@PROJECT_TYPE@]
# If enabled, make the agent display debug-level messages.
# (default: disabled)
#enable_debug = true
# Enable agent tracing.
#
# If enabled, the agent will generate OpenTelemetry trace spans.
#
# Notes:
#
# - If the runtime also has tracing enabled, the agent spans will be
# associated with the appropriate runtime parent span.
# - If enabled, the runtime will wait for the container to shutdown,
# increasing the container shutdown time slightly.
#
# (default: disabled)
#enable_tracing = true
# Comma separated list of kernel modules and their parameters.
# These modules will be loaded in the guest kernel using modprobe(8).
# The following example can be used to load two kernel modules with parameters
# - kernel_modules=["e1000e InterruptThrottleRate=3000,3000,3000 EEE=1", "i915 enable_ppgtt=0"]
# The first word is considered as the module name and the rest as its parameters.
# Container will not be started when:
# * A kernel module is specified and the modprobe command is not installed in the guest
# or it fails loading the module.
# * The module is not available in the guest or it doesn't met the guest kernel
# requirements, like architecture and version.
#
kernel_modules=[]
# Enable debug console.
# If enabled, user can connect guest OS running inside hypervisor
# through "kata-runtime exec <sandbox-id>" command
#debug_console_enabled = true
# Agent connection dialing timeout value in seconds
# (default: 45)
dial_timeout = 45
[runtime]
# If enabled, the runtime will log additional debug messages to the
# system log
# (default: disabled)
#enable_debug = true
#
# Internetworking model
# Determines how the VM should be connected to the
# the container network interface
# Options:
#
# - macvtap
# Used when the Container network interface can be bridged using
# macvtap.
#
# - none
# Used when customize network. Only creates a tap device. No veth pair.
#
# - tcfilter
# Uses tc filter rules to redirect traffic from the network interface
# provided by plugin to a tap interface connected to the VM.
#
internetworking_model="@DEFNETWORKMODEL_FC@"
name="@RUNTIMENAME@"
hypervisor_name="@HYPERVISOR_FC@"
agent_name="@PROJECT_TYPE@"
# disable guest seccomp
# Determines whether container seccomp profiles are passed to the virtual
# machine and applied by the kata agent. If set to true, seccomp is not applied
# within the guest
# (default: true)
disable_guest_seccomp=@DEFDISABLEGUESTSECCOMP@
# If enabled, the runtime will create opentracing.io traces and spans.
# (See https://www.jaegertracing.io/docs/getting-started).
# (default: disabled)
#enable_tracing = true
# Set the full url to the Jaeger HTTP Thrift collector.
# The default if not set will be "http://localhost:14268/api/traces"
#jaeger_endpoint = ""
# Sets the username to be used if basic auth is required for Jaeger.
#jaeger_user = ""
# Sets the password to be used if basic auth is required for Jaeger.
#jaeger_password = ""
# If enabled, the runtime will not create a network namespace for shim and hypervisor processes.
# This option may have some potential impacts to your host. It should only be used when you know what you're doing.
# `disable_new_netns` conflicts with `internetworking_model=tcfilter` and `internetworking_model=macvtap`. It works only
# with `internetworking_model=none`. The tap device will be in the host network namespace and can connect to a bridge
# (like OVS) directly.
# (default: false)
#disable_new_netns = true
# if enabled, the runtime will add all the kata processes inside one dedicated cgroup.
# The container cgroups in the host are not created, just one single cgroup per sandbox.
# The runtime caller is free to restrict or collect cgroup stats of the overall Kata sandbox.
# The sandbox cgroup path is the parent cgroup of a container with the PodSandbox annotation.
# The sandbox cgroup is constrained if there is no container type annotation.
# See: https://pkg.go.dev/github.com/kata-containers/kata-containers/src/runtime/virtcontainers#ContainerType
sandbox_cgroup_only=@DEFSANDBOXCGROUPONLY_FC@
# If enabled, the runtime will attempt to determine appropriate sandbox size (memory, CPU) before booting the virtual machine. In
# this case, the runtime will not dynamically update the amount of memory and CPU in the virtual machine. This is generally helpful
# when a hardware architecture or hypervisor solutions is utilized which does not support CPU and/or memory hotplug.
# Compatibility for determining appropriate sandbox (VM) size:
# - When running with pods, sandbox sizing information will only be available if using Kubernetes >= 1.23 and containerd >= 1.6. CRI-O
# does not yet support sandbox sizing annotations.
# - When running single containers using a tool like ctr, container sizing information will be available.
static_sandbox_resource_mgmt=@DEFSTATICRESOURCEMGMT_FC@
# If enabled, the runtime will not create Kubernetes emptyDir mounts on the guest filesystem. Instead, emptyDir mounts will
# be created on the host and shared via virtio-fs. This is potentially slower, but allows sharing of files from host to guest.
disable_guest_empty_dir=@DEFDISABLEGUESTEMPTYDIR@
# Enabled experimental feature list, format: ["a", "b"].
# Experimental features are features not stable enough for production,
# they may break compatibility, and are prepared for a big version bump.
# Supported experimental features:
# (default: [])
experimental=@DEFAULTEXPFEATURES@
# If enabled, user can run pprof tools with shim v2 process through kata-monitor.
# (default: false)
# enable_pprof = true

View File

@@ -18,15 +18,15 @@ serde_json = ">=1.0.9"
slog = "2.5.2"
slog-scope = "4.4.0"
ttrpc = "0.8"
tokio = { version = "1.28.1", features = ["fs", "rt"] }
tokio = { version = "1.38.0", features = ["fs", "rt"] }
tracing = "0.1.36"
url = "2.2.2"
nix = "0.24.2"
kata-types = { path = "../../../libs/kata-types"}
logging = { path = "../../../libs/logging"}
kata-types = { path = "../../../libs/kata-types" }
logging = { path = "../../../libs/logging" }
oci = { path = "../../../libs/oci" }
protocols = { path = "../../../libs/protocols", features=["async"] }
protocols = { path = "../../../libs/protocols", features = ["async"] }
[features]
default = []

View File

@@ -22,7 +22,7 @@ serde_json = ">=1.0.9"
slog = "2.5.2"
slog-scope = "4.4.0"
thiserror = "1.0"
tokio = { version = "1.28.1", features = ["sync", "fs", "process", "io-util"] }
tokio = { version = "1.38.0", features = ["sync", "fs", "process", "io-util"] }
vmm-sys-util = "0.11.0"
rand = "0.8.4"
path-clean = "1.0.1"
@@ -43,8 +43,15 @@ safe-path = "0.1.0"
crossbeam-channel = "0.5.6"
tempdir = "0.3.7"
qapi = { version = "0.14", features = [ "qmp", "async-tokio-all" ] }
qapi-spec = "0.3.1"
qapi-qmp = "0.14.0"
[target.'cfg(not(target_arch = "s390x"))'.dependencies]
dragonball = { path = "../../../dragonball", features = ["atomic-guest-memory", "virtio-vsock", "hotplug", "virtio-blk", "virtio-net", "virtio-fs", "vhost-net", "dbs-upcall", "virtio-mem", "virtio-balloon", "vhost-user-net", "host-device"] }
dbs-utils = { path = "../../../dragonball/src/dbs_utils" }
hyperlocal = "0.8.0"
hyper = {version = "0.14.18", features = ["client"]}
[features]
default = []

View File

@@ -13,7 +13,7 @@ edition = "2021"
anyhow = "1.0.68"
serde = { version = "1.0.145", features = ["rc", "derive"] }
serde_json = "1.0.91"
tokio = { version = "1.28.1", features = ["sync", "rt"] }
tokio = { version = "1.38.0", features = ["sync", "rt"] }
# Cloud Hypervisor public HTTP API functions
# Note that the version specified is not necessarily the version of CH

View File

@@ -4,6 +4,7 @@
// SPDX-License-Identifier: Apache-2.0
//
use std::convert::TryFrom;
use std::path::PathBuf;
use anyhow::{anyhow, Context, Result};
@@ -19,6 +20,7 @@ use dragonball::device_manager::{
};
use super::{build_dragonball_network_config, DragonballInner};
use crate::device::pci_path::PciPath;
use crate::VhostUserConfig;
use crate::{
device::DeviceType, HybridVsockConfig, NetworkConfig, ShareFsConfig, ShareFsMountConfig,
@@ -37,46 +39,64 @@ pub(crate) fn drive_index_to_id(index: u64) -> String {
}
impl DragonballInner {
pub(crate) async fn add_device(&mut self, device: DeviceType) -> Result<()> {
pub(crate) async fn add_device(&mut self, device: DeviceType) -> Result<DeviceType> {
if self.state == VmmState::NotReady {
info!(sl!(), "VMM not ready, queueing device {}", device);
// add the pending device by reverse order, thus the
// start_vm would pop the devices in an right order
// to add the devices.
self.pending_devices.insert(0, device);
return Ok(());
self.pending_devices.insert(0, device.clone());
return Ok(device);
}
info!(sl!(), "dragonball add device {:?}", &device);
match device {
DeviceType::Network(network) => self
.add_net_device(&network.config)
.context("add net device"),
DeviceType::Vfio(hostdev) => self.add_vfio_device(&hostdev).context("add vfio device"),
DeviceType::Block(block) => self
.add_block_device(
DeviceType::Network(network) => {
self.add_net_device(&network.config)
.context("add net device")?;
Ok(DeviceType::Network(network))
}
DeviceType::Vfio(mut hostdev) => {
self.add_vfio_device(&mut hostdev)
.context("add vfio device")?;
Ok(DeviceType::Vfio(hostdev))
}
DeviceType::Block(block) => {
self.add_block_device(
block.config.path_on_host.as_str(),
block.device_id.as_str(),
block.config.is_readonly,
block.config.no_drop,
)
.context("add block device"),
DeviceType::VhostUserBlk(block) => self
.add_block_device(
.context("add block device")?;
Ok(DeviceType::Block(block))
}
DeviceType::VhostUserBlk(block) => {
self.add_block_device(
block.config.socket_path.as_str(),
block.device_id.as_str(),
block.is_readonly,
block.no_drop,
)
.context("add vhost user based block device"),
DeviceType::HybridVsock(hvsock) => self.add_hvsock(&hvsock.config).context("add vsock"),
DeviceType::ShareFs(sharefs) => self
.add_share_fs_device(&sharefs.config)
.context("add share fs device"),
DeviceType::VhostUserNetwork(dev) => self
.add_vhost_user_net_device(&dev.config)
.context("add vhost-user-net device"),
.context("add vhost user based block device")?;
Ok(DeviceType::VhostUserBlk(block))
}
DeviceType::HybridVsock(hvsock) => {
self.add_hvsock(&hvsock.config).context("add vsock")?;
Ok(DeviceType::HybridVsock(hvsock))
}
DeviceType::ShareFs(sharefs) => {
self.add_share_fs_device(&sharefs.config)
.context("add share fs device")?;
Ok(DeviceType::ShareFs(sharefs))
}
DeviceType::VhostUserNetwork(dev) => {
self.add_vhost_user_net_device(&dev.config)
.context("add vhost-user-net device")?;
Ok(DeviceType::VhostUserNetwork(dev))
}
DeviceType::Vsock(_) => todo!(),
}
}
@@ -121,56 +141,49 @@ impl DragonballInner {
}
}
fn add_vfio_device(&mut self, device: &VfioDevice) -> Result<()> {
let vfio_device = device.clone();
fn add_vfio_device(&mut self, device: &mut VfioDevice) -> Result<()> {
// FIXME:
// A device with multi-funtions, or a IOMMU group with one more
// devices, the Primary device is selected to be passed to VM.
// And the the first one is Primary device.
// safe here, devices is not empty.
let primary_device = vfio_device.devices.first().unwrap().clone();
let vendor_device_id = if let Some(vd) = primary_device.device_vendor {
let primary_device = device.devices.first_mut().unwrap();
let vendor_device_id = if let Some(vd) = primary_device.device_vendor.as_ref() {
vd.get_device_vendor_id()?
} else {
0
};
// It's safe to unwrap the guest_pci_path and get device slot,
// As it has been assigned in vfio device manager.
let pci_path = primary_device.guest_pci_path.unwrap();
let guest_dev_id = pci_path.get_device_slot().unwrap().0;
info!(
sl!(),
"insert host device.
host device id: {:?},
bus_slot_func: {:?},
guest device id: {:?},
vendor/device id: {:?}",
primary_device.hostdev_id,
primary_device.bus_slot_func,
guest_dev_id,
vendor_device_id,
);
let vfio_dev_config = VfioPciDeviceConfig {
bus_slot_func: primary_device.bus_slot_func,
bus_slot_func: primary_device.bus_slot_func.clone(),
vendor_device_id,
guest_dev_id: Some(guest_dev_id),
..Default::default()
};
let host_dev_config = HostDeviceConfig {
hostdev_id: primary_device.hostdev_id,
hostdev_id: primary_device.hostdev_id.clone(),
sysfs_path: primary_device.sysfs_path.clone(),
dev_config: vfio_dev_config,
};
self.vmm_instance
let guest_device_id = self
.vmm_instance
.insert_host_device(host_dev_config)
.context("insert host device failed")?;
// It's safe to unwrap guest_device_id as we can get a guest device id here.
primary_device.guest_pci_path = Some(PciPath::try_from(guest_device_id.unwrap() as u32)?);
Ok(())
}

View File

@@ -104,10 +104,7 @@ impl Hypervisor for Dragonball {
async fn add_device(&self, device: DeviceType) -> Result<DeviceType> {
let mut inner = self.inner.write().await;
match inner.add_device(device.clone()).await {
Ok(_) => Ok(device),
Err(err) => Err(err),
}
inner.add_device(device.clone()).await
}
async fn remove_device(&self, device: DeviceType) -> Result<()> {

Some files were not shown because too many files have changed in this diff Show More