Compare commits

666 Commits

Author SHA1 Message Date
Fabiano Fidêncio
a001021721 Merge pull request #8292 from fidencio/topic/release-ensure-gh-is-used-from-a-git-repo
release: Always use actions/checkout to ensure we're in a git repo
2023-10-23 15:16:12 +02:00
Fabiano Fidêncio
c5cfad7023 actions: Move all the checkout actions to v4
It's been released for a while now, and we need to stay consistent in
what we use.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-23 14:01:53 +02:00
Fabiano Fidêncio
b32c6bf805 release: Always use actions/checkout to ensure we're in a git repo
Otherwise we'll face issues like:
```
Run tag=$(echo $GITHUB_REF | cut -d/ -f3-)
  tag=$(echo $GITHUB_REF | cut -d/ -f3-)
  tarball="kata-static-$tag-amd64.tar.xz"
  mv kata-static.tar.xz "$GITHUB_WORKSPACE/${tarball}"
  pushd $GITHUB_WORKSPACE
  echo "uploading asset '${tarball}' for tag: ${tag}"
  GITHUB_TOKEN=*** gh release upload "${tag}" "${tarball}"
  popd
  shell: /usr/bin/bash -e {0}
~/work/kata-containers/kata-containers ~/work/kata-containers/kata-containers
uploading asset 'kata-static-3.3.0-alpha0-amd64.tar.xz' for tag: 3.3.0-alpha0
failed to run git: fatal: not a git repository (or any of the parent directories): .git
```

Fixes: #8286 (or better, just a follow up of that)

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-23 14:00:39 +02:00
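The failing step in the log above derives the release tag from `GITHUB_REF` and then calls `gh release upload`, which in turn runs git. A minimal sketch of that derivation with a pre-flight check follows. It assumes the ref has the form `refs/tags/<tag>`; the helper names `derive_tag` and `in_git_repo` are hypothetical and do not come from the workflow.

```shell
# Hypothetical helper: drop the first two path components ("refs/tags/")
# from a ref, mirroring `cut -d/ -f3-` in the workflow step.
derive_tag() {
    echo "$1" | cut -d/ -f3-
}

# Hypothetical helper: succeeds only inside a git work tree, which is the
# condition that was missing when the upload step failed.
in_git_repo() {
    git rev-parse --is-inside-work-tree >/dev/null 2>&1
}

# Falls back to an example ref when GITHUB_REF is not set (assumption).
tag="$(derive_tag "${GITHUB_REF:-refs/tags/3.3.0-alpha0}")"
tarball="kata-static-${tag}-amd64.tar.xz"
echo "would upload '${tarball}' for tag: ${tag}"

if ! in_git_repo; then
    echo "not in a git repository; run actions/checkout first" >&2
fi
```

Because `-f3-` keeps everything from the third field onward, a tag that itself contains a slash would be preserved whole.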
Fabiano Fidêncio
8fe88696c0 Merge pull request #8287 from fidencio/topic/release-use-gh-cli-instead-of-hub
actions: release: Use GH cli instead of hub
2023-10-23 12:40:22 +02:00
Fabiano Fidêncio
710eb8ab9d actions: release: Use GH cli instead of hub
hub is now deprecated, which has been causing issues with our release
process.

Let's move to the GH cli (https://cli.github.com/manual), and unblock
this release.

**NOTE**: This commit purposefully does not touch the other places where
hub is used, as switching those would require more time and
investigation, and right now we just want to unblock the release.

Fixes: #8286

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-23 08:49:55 +02:00
Fabiano Fidêncio
74d4865189 Merge pull request #8275 from fidencio/topic/ci-adapt-kata-deploy-regex-on-repo-version-update
release: Adapt the CIs using the kata-deploy image
2023-10-23 00:37:19 +02:00
Dan Mihai
732fe163f3 Merge pull request #8229 from microsoft/danmihai1/no-config-toml-endpoints
agent: no endpoint blocking from agent-config.toml
2023-10-20 11:30:43 -07:00
Fabiano Fidêncio
026f6a1a4c release: Adapt the CIs using the kata-deploy image
This is needed in order to properly run the CIs in branches that are not
the main one, as the kata-deploy.yaml file on those branches does not have
the `latest` tag, but rather the latest stable release.

Fixes: #8274

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-20 18:59:14 +02:00
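The commit message does not show the change itself. As a rough sketch of the idea, a CI script can match any tag on the kata-deploy image rather than only `latest`. The image reference `quay.io/kata-containers/kata-deploy`, the replacement image, and the helper name `set_kata_deploy_image` are assumptions made for illustration.

```shell
# Hypothetical helper: read YAML on stdin and replace the kata-deploy image
# reference, whatever its tag, with the image passed as $1.
# Assumes $1 contains no "|" or "&", which sed would treat specially.
set_kata_deploy_image() {
    sed -E "s|quay.io/kata-containers/kata-deploy:[^[:space:]]+|$1|g"
}

# A stable branch pins a release tag instead of "latest".
printf '        image: quay.io/kata-containers/kata-deploy:3.2.0\n' |
    set_kata_deploy_image "registry.example/kata-deploy-ci:pr-test"
```

Matching on `:[^[:space:]]+` instead of the literal `:latest` is what lets the same substitution work on both the main branch and stable branches.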
Fabiano Fidêncio
124f498830 Merge pull request #8266 from fidencio/3.3.0-alpha0-branch-bump
# Kata Containers 3.3.0-alpha0
2023-10-20 17:40:44 +02:00
GabyCT
8486283012 Merge pull request #8247 from GabyCT/topic/iperfudp
metrics: Add iperf udp benchmark
2023-10-20 09:21:37 -06:00
Fabiano Fidêncio
0fb69ddf6a release: Kata Containers 3.3.0-alpha0
- kata-deploy-stable: Switch to using the ubuntu based payload
- libs: protection: Fix typo in TDX output
- ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat
- tests: Enable agent stability test
- docs: Fix paths to build kernel in SNP VMs documentation
- runtime-rs: ch: Add TDX CH features check
- runtime: Validate hypervisor section name in config file
- tests: query data from the OPA service
- release: tag_repos: Stop tagging the `tests` repo
- metrics: fixes common.sh function to always return true
- Memory footprint test removing trailing commas to make json results file valid
- policy: allow access to ReseedRandomDev
- runtime/kata-ctl: update dependencies
- runtime-rs : fix Nydus support for runtime-rs + Dragonball
- metrics: removal of reference in the documentation to the fio dax subtest.
- runtime-rs: ch: Detect Intel TDX version
- runtime-rs: use the same base64 as kata-runtime/direct-volume does
- tests: Enable scability test for stability CI
- runtime-rs: Add support for adding vfio device for cloud-hypervisor
- tests: Enable soak parallel stability test
- dragonball: vcpu metrics change to be recorded per vcpu
- ci: k8s: adapt gha-run.sh to run locally
- metrics: removes kata components and k8s deployment when test finishes
- GHA: fix up referenced yaml exceeding 20 limit problem
- gha: ci: Revert tracing test PR to unbreak CI
- runtime-rs: ch: Enable feature
- gha: ci: Port runk tests over
- ci: gha: Port tracing tests over
- Enable fio test using containerd client
- gha: Add stability tests workflow for gha
- gha: arm64: Ensure the builder is arm64-builder
- kata-deploy: Build kata-agent as we build all the other components
- versions: migrate out of k8s.gcr.io
- doc: Update crictl pod-config
- gha: Fix k0s deployment
- tests: Add stability test for kata CI
- docs: Update url in kata vra document
- gpu: Adding CDI support for cold and hot-plug of VFIO devices
- kata-deploy: build & ship the rust components from src/tools/
- metrics: Add latency value limits for kata CI
- runtime: fix reading cgroup stats of sandboxes
- Upgrade to Cloud Hypervisor v35.0
- ci: Port kata-monitor tests from Jenkins to GHA
- metrics: Fix latency yamls path
- metrics: Fix metrics README
- metrics: Fix C-Ray documentation
- runtime-rs: ch: Enable Intel TDX
- ci: k8s: crio: Follow up patches to have CRI-O also working as part of our CI
- metrics: Enable latency test in gha run script
- local-build: Fix .docker ownership before build-payload
- runtime-rs: Add network support for cloud-hypervisor
- osbuild: Reduce guest components binary size with strip
- gha: Add pandoc as a dependency for static checks
- ci: rootfs-image build-asset is failing
- feat(runtime-rs): introduce huge page mode to select VM RAM's backend
- clh: Direct IO support for block devices
- gha: Install hunspell for static checks
- ci: Trigger payload-after-push on workflow_dispatch
- ci: Actually enable the CRI-O tests
- protocol: remove gogoprotobuff tests
- ci: k8s: Also run tests with CRI-O
- runtime: support kernel params including spaces
- ci: kata-deploy: Fix runner name
- metrics: Enable parallel bandwidth iperf limit
- ci: kata-deploy: Enable all k8s flavours that we support
- ci: Create clusters in individual resource groups
- versions: Bump virtiofsd to v1.8.0
- clh: arm: Use static_sandbox_resource_mgmt=true
- Bump nydus versions and update nydus tests
- runtime/qemu: Rework QMP/HMP support
- clh:arm64: use arm AMBA UART for hypervisor debug
- ci: Use variable size of VMs depending on the tests running
- ci: Rework static checks
- runtime: incorrect handling of non-empty []Endpoint parameter in Remo…
- ci: cache: Check the sha256sum of the components & fix ovmf-sev cache usage
- ci: cache: Use the artefacts stored in ghcr.io/kata-containers/cached-artefacts/${component}
- ci: Run some of the GARM tests in smaller instances
- ci: Reduce the size of the AKS VMs
- ci: cache: Allow pushing our artefacts to an OCI registry
- metrics: Add iperf value for cpu utilization
- ci: cache: Export env vars needed to use ORAS
- gha: vfio: Import test script
- tests: fix kernel and initrd annotations
- metrics: Add iperf bandwidth value for kata metrics
- metrics: Add Cassandra Metrics documentation
- metrics: Remove warning from metrics documentation
- ci: docker: nerdctl: Switch to tcp port 80 ping
- runtime: Naming conflict of network devices
- Remove gogoproto.nullable extension
- metrics: Ensure docker is running in init_env
- metrics: this PR skips the FIO test temporarily to fix issues
- ci: Add a very basic nerdctl sanity test
- runtime-rs: hypervisor: Remove debug kernel options
- versions: Bump rust version
- ci: Add a very basic docker sanity test
- dragonball: fix for non-deterministic builds
- runtime-rs: bring hybrid vsock devices in manager.
- ci: use github.ref_name instead of $GITHUB_REF_NAME
- ci: Add more target-branch related fixes
- ci: Fix target-branch usage
- agent: optimize the code of systemd cgroup manager
- gha: Manually rebase PR atop of the target branch before testing
- Update kernel to the latest LTS release (v6.1.52) and bring in erofs patches needed for the CC work
- kata-deploy: Fix aarch64 image build
- runtime: Fix more virtiofs args
- kata-deploy: Switch to an alpine image
- metrics: Use TensorFlow optimized image
- metrics: fix FIO test initialization
- ci: k8s: Add clean-up-garm argument for gha-run.sh
- ci: k8s: Second round of fix-ups with the devmapper CI
- metrics: re-enable memory-usage initialization step
- Dragonball: optimize the placement of dbs-upcall features
- ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml
- ci: k8s: Add k8s devmapper tests (part 0)
- kata-deploy: Create kata-static.tar with correct ownership
- runtime: run prestart hooks before starting VM for FC
- metrics: Add write 95 percentile FIO value
- runtime: Allow virtio_fs_extra_args annotation
- packaging: do not install docker-compose-plugin for s390x|ppc64le
- runtime-rs: Fix volumes and rootfs cleanup issues
- metrics: Enable iperf benchmark on gha for kata metrics
- CI: switch static-checks-dragonball CI machines to Azure
- metrics: Add README for kata metrics report
- osbuilder: Remove chcon operation for guest SELinux
- kata-sys-util: protection: Update TDX checks
- Improve the way to clean up storage devices for sandbox
- agent: avoid possible leakage of storage device
- tests: add policy to existing tests
- gha: Rebase PR atop of the target branch before testing
- versions: Update alpine to its 3.18 version
- runtime: Fix data race in ioCopy
- metrics: Add grabdata script for metrics report
- Fixes tests on AMD machines
- metrics: Enable FIO limits for kata metrics
- metrics: Add metrics report script
- metrics: Fix memory inside limits for kata metrics
- metrics: fix parsing issue on memory-usage test
- dragonball: vsock add fifo/pipe stream support for passed fd hybridSt…
- tests: Add confidential test
- tdx: Update the components needed for using the 6.2 kernel stack
- tests: delete k8s deployment at the test's end
- tests: use unique test name
- runtime-rs: check peer close in log_forwarder
- gha: Avoid "fail-fast" in tests that are known to be flaky
- Refine storage device management for kata-agent
- metrics: Remove unused variable in tensorflow nhwc script
- kata-deploy: Don't try to remove /opt/kata
- metrics: Add TensorFlow ResNet50 FP32 benchmark
- gha: vfio: Run on Ubuntu 23.04 runner
- kata-agent: use default filemode for block device when it is set to 0
- kata-types: introduce KataVirtualVolume to support nydus, direct volume and image pull
- libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml
- local-build: Remove GID before creating group
- kata-deploy: Avoid failing on content removal
- runtime: fix image and initrd assets handling
- metrics: Add disk link to README
- metrics: Fix FIO path
- gha: capture additional kata-deploy output
- metrics: Use function from metrics common in pytorch script
- metrics: Enable kata runtime in K8s for FIO test.
- metrics: Fix README for pytorch
- metrics: Remove unused variable in tensorflow mobilenet script
- rootfs: agent: Policy support with AGENT_INIT=yes
- gha: k8s: kata-deploy: Move kata-deploy specific tests from integration/kubernetes to functional/kata-deploy
- metrics: Fix check results for tensorflow benchmark
- metrics: Add Tensorflow ResNet50 int8 benchmark
- kata-deploy: Properly create default runtime class
- agent: simplify error handling
- metrics: Fix MobileNet help me description
- gha: ci: Start running kata-deploy tests
- runk: Modify kill command's error message for containerd tests
- runtime-rs: add driver option
- gha: cri-containerd: Enable tests
- metrics: Rename tensorflow scripts
- gha: tests: Add kata-deploy functional tests -- Part 1
- agent: runtime: add Agent Policy feature
- runk: Support without pid ns
- metrics: Add Cassandra Kubernetes benchmark for kata metrics
- metrics: Add common functions to the common script
- metrics: fix the loop used to stop kata components
- docs: Remove installation step in virtcontainers doc
- Propagate secrets, config maps etc into guest if sharedFS not available
- kata-deploy: Preliminary k0s support
- gha: static-checks: Move to the Azure instances
- versions: Update firecracker version to 1.4.0
- agent: Allow clippy::redundant_clone in the unit tests
- agent: avoid creating new `Vec` instances when easily avoidable
- metrics: compute tensorflow statistics
- metrics: Add network nginx benchmark
- metrics: install kata once and run multiple checks
- ci: unencrypted-image: Fix build context
- ci: create-confidential-image: Add dependent actions
- Follow up fixes for https://github.com/kata-containers/kata-containers/pull/7596
- tests: Create image that will be used in the unencrypted confidential tests
- kata-deploy: Ensure we cover SHIMS / DEFAULT_SHIM as part of our tests
- tests: upgrade bats version
- Fix minor bugs and improve coding style of agent rpc/sandbox/mount
- deps: Bump dependent crate versions
- fix number of queues handling in dragonball share fs device
- runtime-rs: Introduce directly attachable network
- metrics: General improvements to mobilenet tensorflow test
- gha: Add iperf network metrics
- docs: Use control-plane term instead of master
- agent: avoid unnecessary calls to `Arc::clone`
- metrics: Add network latency test
- Image pulling on the host
- Use version 0.10.4 of `fuse-backend-rs`
- kata-deploy: Use host's systemctl
- release: Revert kata-deploy changes after 3.2.0-rc0 release
- metrics: stop kata components before start a metric test.
- runtime-rs: Add block device handling for cloud hypervisor

a93fdb014 kata-deploy-stable: Adapt to what we're using in the stable branch
36109da93 ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat
d01daf749 tests: Adjust timeout for agent stability test
9b14dda14 libs: protection: Fix typo in TDX output
0e0867f15 runtime-rs: ch: Add TDX CH features check
409eadddb runtime-rs: ch: Improve readability of guest protection checks
82a0814fc tests: Enable agent stability test
32be8e3a8 tests: query data from the OPA service
b81c0a669 tests: encode policy file during test
4f9681b41 metrics: fixes common.sh function to always return true
2ef2b2a6d docs: Fix paths to build kernel in SNP VMs documentation
408b59c02 runtime-rs: fix bugs to support Nydus v5
157caea9f Revert "nydus: Temporarily skip tests on dragonball"
678fe3cd3 Dragonball: fix Nydus config serde problem
b6ec62138 policy: allow access to ReseedRandomDev
908519db9 metrics: skips docker restart when it is not installed or is masked.
c2763120a metrics: removing trailing comma characters from json file.
3e8cf6959 runtime: Validate hypervisor section name in config file
ef6388e81 tests: Remove unused function from scability test
fbc8f8f46 scripts: Use install_yq from the `kata-containers`  repo
65b1a2d27 release: tag_repos: Stop tagging / updating the `tests` repo
87b760f56 runtime-rs: ch: Detect Intel TDX version
73e81f5e3 runtime-rs: unify base64 encoding for direct-volume
c6463cb5a tests: Fix path for versions yaml for soak parallel test
89c9454fc metrics: removal of reference in the documentation to the dax test.
30ff58904 tests: Enable scability test for stability CI
8d6f7b909 runtime-rs: Add support for handling vfio device for cloud-hypervisor
e786b2b01 gha: Add install dependencies for stability tests
dbfe6512f dragonball: vcpu metrics change to be recorded per vcpu
fa60fbe02 dragonball: METRICS is refactored to RwLock<DragonballMetrics>
500d1c5ce kata-ctl: update rustls-webpki/webpki dependency
d7660d82a runtime: unify gopkg.in/yaml.v3 to v3.0.1
fc9a107e8 runtime: unify swag and testify dependency
79ebb959c runtime: update runc dependency to v1.1.9
7f3e8bd65 runtime: unify golang.org/x/text to v0.7.0
df325ae37 runtime: update golang.org/x/net to v0.7.0
bba34910d metrics: stops kata components and k8s deployment when test finishes
84e3d884e gha: Add general dependencies to stability tests
dec3951ca tests: Add soak parallel stability test
0f04d527d tests: Enable soak parallel test
e669282c2 ci: k8s: set KUBERNETES default value
c30c3ff18 tests: run k8s-volume on a given node
666993da8 tests: run k8s-file-volume on a given node
3a00fc910 tests: exec_host() now gets the node name
61c9c17bf tests: add get_one_kata_node() to tests_common.sh
68f083c4d ci: k8s: set KATA_HYPERVISOR default value
6677a61fe ci: k8s: configurable deploy kata timeout
200e54292 ci: k8s: shellcheck fixes to gha-run.sh
4af78be13 kata-deploy: re-format kata-[deploy|cleanup].yaml
d54e6d9cd ci: k8s: run_tests() for kcli
c2ef1f0fb ci: k8s: add deploy-kata-kcli() to gh-run.sh
d2be8eef1 ci: k8s: add cleanup-kcli() to gha-run.sh
cbb9aa15b ci: k8s: set default image for deploy_kata()
89bef7d03 ci: k8s: create k8s clusters with kcli
954d40cce gha: combine coco jobs into a single yaml
b60e0a9b5 gha: combine basic amd64 jobs into a single yaml
e9bd85211 gha: ci: Revert tracing test PR to unbreak CI
b8a46a4b8 runtime-rs: ch: Enable feature
0f2dc8c67 gha: Add containerd stability tests to ci yaml
da91c9df8 ci: Port runk tests to this repo
7f2377276 ci: Add placeholder for runk tests
9205acc3d ci: Move tracing tests here
85d290a04 gha: Add stability gha run script
54f0c8f88 gha: Add stability tests workflow for gha
3bb2923e5 ci: Add placeholder for tracing tests
2c3bf406d ci: Create a function to install docker
119f03de2 gha: arm64: Ensure the builder is arm64-builder
8c498ef5e metrics: Use jq tool to pretty-print json metrics output
a2159a636 metrics: Enables FIO test for kata containers
70e7ec3e2 gha: Fix k0s deployment
560bbffb5 packaging: tools: Remove `set -x` leftover
18fa483d9 packaging: release: Mention newly added images
ca3b88837 packaging: tools: Fix container image env var name
5ca66795c packaging: Allow passing the TOOLS_CONTAINER_BUILDER
02acef957 gha: Build the kata-agent as part of our workflows
5208386ab packaging: Build the kata-agent
1727487ee agent: Allow specifying DESTDIR and AGENT_POLICY via env vars
45c118883 packaging: Add get_agent_image_name()
0db8fb8f9 versions: migrate out of k8s.gcr.io
a1a054367 doc: Fix spelling
6339605a1 tests: Add general stability fixes
59ae24444 doc: Update crictl pod-config
fd19f4082 tests: Add agent stability test
215577032 tests: Add cassandra stress in stability tests
f2d3ea988 tests: Add stressng dockerfile for stability tests
6493aa309 tests: Add stressor CPU test for stability tests
ef68a3a36 metrics: Add stability test for kata CI
7c934dc7d gpu: Fix cold-plug of VFIO devices
8d66ef518 metrics: Increase qemu jitter value
5600e28b5 metrics: Increase jitter value for clh
a6b1f5e21 ci: Build src/tools components as part of our tests / releases
501a168a8 kata-deploy: Build components from src/tools
6ef42db5e static-build: Add scripts to build content from src/tools
4d08ec29b packaging: Add get_tools_image_name()
98097c96d packaging: Use git abbreviated hash
489caf1ad ci: kata-monitor: Move tests over
a3fb067f1 ci: Add placeholder for kata-monitor tests
57cb4ce20 ci: Make install_kata aware of container engines
de1eeee33 ci: Create a generic install_crio function
64a200085 ci: Add install_cni_plugins helper
8132fe15c ci: Modify containerd default config
8cb7df1be metrics: Add checkmetrics for latency test
e90440ae2 metrics: Add qemu latency value limit
a74a8f8a9 metrics: Add latency value limits for kata CI
d7def8317 metrics: Fix general check static warnings
928553d1b docs: Update url in kata vra document
b0a3293d5 runtime-rs: ch: Enable Intel TDX
523399c32 runtime-rs: ch: Add more consts
dea806581 runtime-rs: ch: Remove unused function
995f2c015 runtime-rs: ch: Only handle particular pending device types
b1b96a5c4 runtime-rs: ch: Remove erroneous "virtio-blk-mmio" check
9ac29b8d3 metrics: Add init_env function to latency test
dfd0c9fa9 runtime: clh: Re-generate the client code
8f9f087e3 versions: Upgrade to Cloud Hypervisor v35.0
81c8babca metrics: Fix latency yamls path
481573682 metrics: Fix C-Ray documentation
ef63d67c4 ci: crio: Trail '\r' from exec_host() output
74c12b292 ci: crio: Enable default capabilities
358dc2f56 kata-deploy: Fix CRI-O detection
ebaa4fa4c ci: crio: Pass `-y` to apt
97e73b223 metrics: Fix spelling warnings
36c8cd6f1 metrics: Fix metrics README
15425a2b8 local-build: Fix .docker ownership before build-payload
13ca7d9f9 gha: Add pandoc as a dependency for static checks
08bc8e4db metrics: Add latency benchmark for gha
6776b55d7 metrics: Enable latency test in gha run script
94e2ccc2d runtime: fix reading cgroup stats of sandboxes
d507d189b fc: Add support for noflush cache option
2ca781518 clh: Direct IO support for block devices
0c95697cc ci: Trigger payload-after-push on workflow_dispatch
28cbc3b51 ci: rootfs-image build-asset is failing Fixes: #8027
87a861648 gha: Install hunspell for static checks
8c3c50ca8 ci: Actually enable the CRI-O tests
3a6510ad6 osbuild: Reduce guest components binary size with strip
07a6e63a6 ci: k8s: rke2: Use sudo to call systemd
03b82e848 ci: k8s: Add a CRI-O test
d7105cf7a ci: k8s: Add a method to install CRI-O
54c0a471b ci: k8s: k0s: Allow passing parameters to the k0s installer
730ef5169 deps: updating dependencies
3a2c83d69 ci: kata-deploy: Fix runner name
82ff2db46 runtime: support kernel params including spaces
604a9dd67 protocol: remove gogoprotobuff tests
f7fa7f602 ci: Enable kata-deploy tests for all the supported k8s flavours
2c908b598 ci: kata-deploy: Add the ability to deploy rke2
eaf616491 ci: kata-deploy: Add the ability to deploy k0s
001525763 ci: kata-deploy: Add deploy-k8s argument to gha-run.sh
bf2cb0228 ci: kata-deploy: Expand tests to run on k0s / rke2
b12b9e188 ci: kata-deploy: Add placeholder for tests on GARM
9e1fb8a96 ci: kata-deploy: Export KUBERNETES env var
09cc0ed43 ci: Move deploy_k8s() to gha-run-k8s-common.sh
486fe14c9 ci: Properly set K8S_TEST_UNION
d9ef1352a ci: Add first letter of the K8S_TEST_HOST_TYPE to resource group name
68267a399 ci: Create clusters in individual resource groups
9aa8d1c91 metrics: Add parallel bandwidth limit for qemu
44c7c082d versions: Bump virtiofsd to v1.8.0
af59d4bf4 metrics: Enable parallel bandwidth iperf limit
aba36ab18 nydus: Temporarily skip tests on dragonball
b8a8dfcd1 nydus: Use `kata-${KATA_HYPERVISOR}` instead of `kata`
f6df3d6ef static-build: Fix arch error on nydus build
2f9c9e2e6 tests: nydus: Update nydus tests
c9a4e7e46 versions: Bump nydus and nydus-snapshotter to its latest release
b73bde320 gha: nydus: Populate run()
b3904a1a3 gha: nydus: Populate install_dependencies()
d2b3b67f5 gha: nydus: Actually install kata when `install-kata` is called
0ec00ad42 gha: nydus: Get rid of nydus{,-snapshotter} install from nydus_test.sh
568439c77 tests: nydus: Add timeout to the crictl calls
5ac3b76eb tests: nydus: Add uid / namespace to the nydus container / sandbox
376574a16 tests: nydus: Decorate some calls with `sudo`
4290fd4b6 tests: nydus: Adapt "source ..." to GHA
a84efa3e8 tests: nydus: Adapt check to "clh" instead "cloud-hypervisor"
56a14b395 tests: common: Add install_nydus_snapshotter()
b6563783e tests: common: Add install_nydus()
72599f191 clh: arm: Use static_sandbox_resource_mgmt=true
1f16b6627 runtime/qemu: Rework QMP/HMP support
8b1e9b0c7 ci: static-checks: Clean up static-checks job
2c5ca2eaf ci: static-checks: Run tests depending on KVM
509c309ab ci: static-checks: Move "sudo make test" to the new test matrix
4e963cedf ci: static-checks: Move "make test" to the new test matrix
08f2e5ae0 runtime-rs: Ensure static-checks-build is a dep of `make test`
2bc3a616a kata-ctl: Use `loop` instead of `kvm` module in tests
46daddc50 kata-ctl: Ensure GENERATED_CODE is a dep of `make test`
ec826f328 agent: Ensure GENERATED_CODE is a dep of `make test`
1d32410a8 ci: install_libseccomp: Do not depend on the tests repo
bf888b9a5 ci: static-checks: Move "make check" to the new test matrix
473ec8780 kata-ctl: Add `kata-types` to the Cargo.lock file
ea19549a9 kata-ctl: Ensure GENERATED_CODE is a dep of `make check`
e12577586 tests: install_rust: Also install clippy
e2c61a152 ci: static-checks: Move vendor check to its own job
6794d4c84 tests: Move install_rust.sh from the tests repo
e64508c30 tests: install_go: Remove tests repo dependency
11dff731b tests: Move functions from kata_arch script here
75c974c80 ci: static-checks: Move kernel config check to its own job
9c233bb9e test: Add test to verify try_from for clh Netconfig
c69a1e33b ci: Use variable size of VMs depending on the tests running
9049d311d runtime-rs: Add network support for cloud-hypervisor
eecd5bf2a ci: cache: Fix ovmf-sev cache
86c41074b ci: cache: Check the sha256sum of the component
460988c5f ci: cache: Remove the script used to cache artefacts on Jenkins
4533a7a41 ci: cache: Also store the ${component} sha256sum
eccc76df6 ci: cache: Use the cached artefacts from ORAS
7f5e77bcb kernel: enable Arm pl011 support
241c355e0 clh:arm64: use arm AMBA uart for hypervisor debug
094b6b2cf ci: k8s: Temporarily disable tests that require a bigger VM instance
d0c257b3a ci: cache: Push cached artefacts to ghcr.io
108f1b60d kata-deploy: Generate latest_{artefact,image_builder} files
be2eb7b37 ci: cache: Install ORAS in the kata-deploy binaries builder container
fb24fb0dc ci: k8s: devmapper: Use a smaller / cheaper VM instance
1daf02f5d ci: nydus: Use a smaller / cheaper VM instance
e60d81f55 ci: nerdctl: Use a smaller / cheaper VM instance
4db416997 ci: docker: Use a smaller / cheaper VM instance
32841827b ci: cri-containerd: Use a smaller / cheaper VM instance
92fff129f ci: k8s: Don't set cpu limit request for k8s-inotify test
faf98c062 ci: Reduce the size of the AKS VMs
adc18ecdb ci: cache: For consistency, read all used env vars
c7a851efd ci: cache: Pass the exposed env vars to the kata-deploy binaries in docker
6bd15a85d ci: cache: Export env vars needed to use ORAS
cd4fd1292 metrics: Add iperf cpu utilization limit for qemu
df5cd10ea metrics: Add iperf value for cpu utilization
a96050a7a tests: Apply timeout to 'ctr t kill'
9d9303678 tests/vfio: Bump VM image to Fedora 38
faee59b52 tests/vfio: Accept single device in vfio group for CLH
df3dc1105 tests/vfio: Get rid of sync's
7211c3dcc gha: vfio: Set test timeout to 15m
1b02f89e4 packaging: kernel: Enable VIRTIO_IOMMU on x86_64
3a1db7a86 runtime: clh: Support enabling iommu
9f1a42c6c tests/vfio: Give commands 30s to execute
b46b0ecf8 tests/vfio: Configure a value for 'hot_plug_vfio' for both vmms
bfc93927f runtime: Remove redundant check in checkPCIeConfig
7c4e73b60 runtime: Add test cases for checkPCIeConfig
fc51e4b9e runtime: Check config for supported CLH (cold|hot)_plug_vfio values
509771e6f runtime: clh: Add hot_plug_vfio entry to config
5f6475a28 tests/vfio: Gather debug info and disable tdp_mmu
8fffdc81c tests/vfio: Capture journal from vm
df815087e tests/vfio: Change to get the test working in GHA
a92ddeea1 tests/vfio: Move dependency installation to gha-run.sh
5a551a85b gha: vfio: Import jobs scripts from tests repo
49e2fa189 metrics: Increase jitter value for qemu
49234433a metrics: Increase value limit for jitter in clh
813bfdec0 ci: docker: nerdctl: Use io.containerd.kata-${KATA_HYPERVISOR}.io
46bc0b1c0 ci: nerdctl: Create the containerd config
13968aa7f ci: nerdctl: Switch to tcp port 80 ping
e0c811678 ci: docker: Switch to tcp port 80 ping
1636abbe1 runtime: issue with non-empty []Endpoint in RemoveEndpoints
0aa073967 metrics: Add iperf bandwidth value for qemu
c0ad91476 tests: fix kernel and initrd annotations
615c1cbf1 metrics: Add iperf bandwidth value for kata metrics
d53eb73ee metrics: Ensure docker is running in init_env
ad08321b8 metrics: Add Cassandra Metrics documentation
a58ea6659 metrics: this PR skips the FIO test temporarily to fix issues
f536ef5ce ci: docker: Also run the smoke test with runc
c83f167c5 ci: docker: Run the tests after the kata-static is created
12d833d07 ci: Add a very basic nerdctl sanity test
348b8644d ci: Add a very basic docker sanity test
a75fd5eb8 runk: Fix rust unnecessary mut error
a31c14517 kata-ctl: useless-vec warning
c8419fc3b kata-ctl: Resolve non-minimal-cfg warning
3eaf68d95 agent-ctl: Allow clippy lint
1d8b78959 runtime-rs: Fix useless-vec warning
99f3d69e9 runtime-rs: Remove mut
16fbc27b0 dragonball: Allow ambiguous-glob-reexports
bbf191951 dragonball: Resolve non-minimal-cfg warning
75cfdd5d5 agent: config: Allow clippy lint
f3a0fd590 agent: config: Fix useless-vec warning
9e423bd3d libs: Fix clippy unnecessary hashes error
444395050 versions: Bump rust version
a16b0962b chore(cargo): update cargo lock
ca4b6b051 runtime: Naming conflict of network devices
202049f35 feat(runtime-rs): introduce huge page type to select VM RAM's backend
f811b064c ci: use github.ref_name instead of $GITHUB_REF_NAME
6d795c089 ci: Add more target-branch related fixes
8509c3187 ci: Fix target-branch usage
060499dca metrics: Remove warning from metrics documentation
c0f697fcc runtime: Allow kernel_params annotation
b03e49794 dragonball: fix for non-deterministic builds
976d10150 runtime-rs: hypervisor: Remove debug kernel options
fde34610c kernel: Add erofs patches needed for CC related work
dc6a4588a versions: Bump kernel to the latest LTS release (6.1.52)
52f6449b7 kata-manager: Remove initcall_debug kernel option
8b4a0b368 kata-deploy: Remove curl after it's used
139c7f03a kata-deploy: Fix aarch64 image build
470d06541 agent: optimize the code of systemd cgroup manager
bd24afcf7 gha: Manually rebase PR atop of the target branch before testing
72c510d05 runtime/virtiofsd: Drop all references to "--cache=none"
ead724bec protocol: removing gogo.nullable feature
d8e4bb985 protocol: remove unused PROTO_FILE env
5e1106a77 protocol: remove unused import_path
87accaaec protocol: use workdir during build
711a7ed96 protocol: remove mapping definitions
8db84c1bd protocol: force GOPATH to be set
68156d77a protocol: breaking lines to improve readability
670a8e9c7 kata-deploy: Switch to an alpine image
9d74b7ccc k8s: ci: Skip "Pod quota" test with firecracker
f6cd3930c ci: k8s: Remove useless skip statement from tests
3cc20b47a ci: k8s: Also check for "fc" (for firecracker)
b5bad3cb0 ci: k8s: Add clean-up-garm argument for gha-run.sh
aaec5a09f ci: k8s: devmapper tests should be using ubuntu 20.04
27fa7d828 ci: k8s: Add a kata-deploy-garm target
fa62a4c01 ci: k8s: Export KUBERNETES env var
8c9380a79 ci: k8s: Install bats on GARM runners
3de23034f ci: k8s: Wait some time after restarting k3s
adfea55b8 metrics: fix FIO test initialization
2df183fd9 ci: k8s: Append, instead of overwrite, the devmapper config
369a8af8f ci: k8s: Decrease k3s sleep from 4 to 2 minutes
ada65b988 ci: k8s: Use vanilla kubectl with k3s
ad45ab5d3 ci: k8s: Ensure k3s is deployed with --write-kubeconfig-mode=644
028a97e0d ci: k8s: Use the proper command for sleep
3a427795e metrics: Use TensorFlow optimized image
8d99972a8 ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml
deed1b927 Dragonball: optimize the placement of dbs-upcall features
0e8bd50cb ci: k8s: Add k8s devmapper tests (part 0)
b28b54df0 ci: k8s: Add a function to configure devmapper for containerd
54f711721 ci: k8s: Add a function to deploy k3s
81536f21a runtime/qemu: Pass "--xattr" to virtiofsd instead of "-o xattr"
b1dd09a4d runtime: Allow virtio_fs_extra_args annotation
2efda20c7 packaging: do not install docker-compose-plugin for s390x|ppc64le
438fbf966 metrics: Add write 95 percentile for FIO for qemu
024b4d2ff metrics: Add write 95 percentile FIO value
e98e5cdea metrics: Add checkmetrics to gha run script
c1edfe551 metrics: Add checkmetrics value for qemu for iperf
6a79ecedf metrics: Add jitter value for clh
f609a9a75 metrics: Add test selector to iperf metrics
5b8db3042 metrics: Enable iperf benchmark on gha for kata metrics
60f733d30 CI: switch static-checks-dragonball CI machines to Azure
7870b33a2 runtime-rs: bring hybridVsock devices in manager.
18c94ebbe kata-deploy: Create kata-static.tar with correct ownership
57e7bf14a agent: refine StorageDeviceGeneric::cleanup()
53edb1937 agent: implement StorageDeviceGeneric::cleanup()
0c63453e2 types: make StorageDevice::cleanup() return possible error code
3a3d77b3b agent: move StorageDeviceGeneric from kata-types into agent
b151cfd14 metrics: re-enable memory-usage initialization step
f3e1a6a94 osbuilder: alpine: Change mirror
ac612aef5 osbuilder: alpine: Match the version on versions.yaml
9cd706d1c agent: avoid possible leakage of storage device
bf21411e9 tests: add policy to k8s tests
d0e061067 runtime: config: use the SEV initrd for SNP
67fed26f1 runtime: Use TDX image in the qemu-tdx config
ac939c458 gha: Rebase atop of the target branch
82cd14ba3 versions: Update alpine to its 3.18 version
666882575 metrics: Add grabdata script for metrics report
c290eaed8 kata-sys-util: protection: Update TDX checks
d7a996c68 gha: Update to checkout@v3 action
c2ba29c15 runtime: Fix data race in ioCopy
211de08d9 osbuilder: Remove chcon operation for guest SELinux
9f21fa9b3 metrics: Add report generator link to general documentation
c0ed5ea0a metrics: Add README for kata metrics report
a7b59a5bf metrics: Add limit for 90 percentile for qemu value
99db6568e metrics: Add limit for write 90 percentile value for clh
6e06392c5 metrics: Enable FIO limits for kata metrics
2e4c87472 runtime/vc: runPrestartHooks should ignore GetHypervisorPid failure
21204caf2 runtime: fail early when starting docker container with FC
32fd01371 runtime: run prestart hooks before starting VM for FC
00e7ffd98 tests: check vmx only on Intel machines
c8dd3c073 metrics: Fix memory footprint qemu limit
8877ec62f metrics: Fix memory inside limits for kata metrics
80146f207 tests: Fixes cpuType check on AMD machines
7e364716d metrics: Add test setup details to metrics report
17dc1b976 metrics: Add boot lifecycle times to metrics report
3b0d6538f metrics: Add memory inside container to metrics report
79fbb9d24 metrics: Add scaling system footprint in metrics report
8e6d4e6f3 metrics: Add metrics reportgen
139ffd4f7 metrics: Add report file titles
878d1a2e7 metrics: Generate PNGs alongside the PDF report
fce248797 metrics: Add metrics report R files
08812074d metrics: Add report dockerfile
69781fc02 metrics: Add metrics report script
e286e842c tests: Expand confidential test to support TDX
e31f099be tests: Expand confidential test to support SNP
c3b9d4945 tests: Add confidential test for SEV
538c965c2 metrics: fix parsing issue on memory-usage test
3818bf331 local-build: Remove $HOME/.docker/buildx/activity/default
d1b54ede2 qemu: tdx: Workaround SMP issue with TDX 1.5
1e34220c4 qemu: tdx: Adapt to the TDX 1.5 stack
8115a0522 versions: tdx: Update Kernel to 6.2 + TDX
ec18180f3 versions: tdx: Update TDVF to the "edk2-stable202302"
9803b2428 versions: tdx: Update QEMU to v7.2 + TDX v1.10
dffc16e5b runtime-rs: check peer close in log_forwarder
aaa5ab126 agent: simplify storage device by removing StorageDeviceObject
fb49d5d7c gha: Avoid "fail-fast" in tests that are known to be flaky
183f51d6f tests: use unique test name
6a974679f tests: delete k8s deployment at the test's end
32a778b6d metrics: Remove unused variable in tensorflow nhwc script
d8f3ce649 kata-deploy: Don't try to remove /opt/kata
936e8091a gha: vfio: Run on Ubuntu 23.04 runner
0e7248264 agent: move storage device related code into dedicated files
268e84655 runtime-rs: Fix volumes and rootfs cleanup issues
8f49ee33b agent: refine storage related code a bit
60ca12ccb agent: switch to new storage subsystem
fcbda0b41 kata-types: introduce StorageDevice and StorageHandlerManager
b03b1f613 agent: simplify the way to manage storage object
8392c71bf sys-util: support more mount flags in parse_mount_options()
c00d8f3d4 agent: use create_mount_destination() from kata-sys-util
5e867f053 types: add more mount related constants
880e6c9a7 agent: use function from kata-sys-utils to reduce code
3b881fbc0 local-build: Remove GID before creating group
959ca4944 metrics: Add TensorFlow ResNet50 fp32 Dockerfile
4b7d72c4a metrics: Add TensorFlow ResNet50 FP32 benchmark
5cba38c17 kata-deploy: Avoid failing on content removal
18d42da21 runtime/fc: fix image/initrd annotation handling
9fda7059a runtime/clh: fix image/initrd annotation handling
1a0092d63 runtime/qemu: fix image/initrd annotation handling
22d8f335d libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml
8afd158ce metrics: Add disk link to README
40914b25d kata-agent: use default filemode for block device when it is set to 0
eee2ee6ee metrics: Fix FIO path
39bc3488f metrics: Use function from metrics common in pytorch script
400eb8874 gha: capture additional kata-deploy output
4aee3eade kata-types: implement serde methods for KataVirtualVolume
b875e3932 kata-types: validate KataVirtualVolume object
fa2fdc105 kata-types: implement two conversion helpers for KataVirtualVolume
6326af20e kata-types: introduce KataVirtualVolume
c8b43f8b3 metrics: Fix README for pytorch
fb571f8be metrics: Enable kata runtime in K8s for FIO test.
cb056f8cb rootfs: agent: Policy support with AGENT_INIT=yes
85c02828e metrics: Update tensorflow name in gha run script
e8a511934 metrics: Fix check results for tensorflow benchmark
2d896ad12 gha: kata-deploy: Do the runtime class cleanup as part of the cleanup
4ffc2c86f gha: kata-deploy: Add the first kata-deploy test
8616c050a metrics: Remove unused variable in tensorflow mobilenet script
285e616b5 tests: common: Ensure test_type is used as part of the cluster's name
790bd3548 tests: common: Don't fail if yq is not part of the cache
ce6adecd0 gha: kata-deploy: Add run-kata-deploy-tests.sh
cfc29c11a gha: k8s: Stop running kata-deploy tests as part of the k8s suite
f4dd15286 tests: k8s: Call ensure_yq() in setup.sh
339569b69 kata-deploy: Properly create default runtime class
2a491e9b1 metrics: Fix MobileNet help me description
d19a75e80 gha: ci: Start running kata-deploy tests
d90f7ac68 runtime-rs: add unit test for block driver
e44919f0d runtime-rs: add load_test_config for unit test
7f48a6937 runtime-rs: add driver option
bade6a5c3 docs: Fix TensorFlow word across the document
1a1b20776 docs: Add Tensorflow Resnet50 documentation
24baededc metrics: Add Dockerfile for ResNet50 int8
6d971ba8d metrics: Add Tensorflow ResNet50 int8 benchmark
25d151bd1 runk: Modify kill command's error message for containerd tests
b3592ab25 gha: cri-containerd: Enable tests
84dd02e0f gha: cri-containerd: Add timeout to the crictl calls on testContainerStop
b29782984 gha: cri-containerd: Show pod before deleting it
ae0930824 gha: cri-containerd: Print kata logs in case of error
6c8b2ffa6 gha: cri-containerd: Group containerd logs
9e898701f gha: cri-containerd: Ensure RUNTIME takes KATA_HYPERVISOR into account
76dac8f22 agent: simplify error handling
18a7fd8e4 metrics: Rename tensorflow scripts
e55fa93db tests: kata-deploy: Add placeholder for kata-deploy-tests-on-tdx
d9ee17aae tests: kata-deploy: Add placeholder for kata-deploy-tests-on-aks
ab829d103 agent: runtime: add the Agent Policy feature
831e73ff9 tests: kata-deploy: Add functional/kata-deploy/gha-run.sh placeholder
af1b46bbf tests: Add gha-run-k8s-common.sh
416445e7e docs: Remove installation step in virtcontainers doc
72cbcf040 kata-deploy: Add k0s support
767434d50 metrics: fix the loop used to stop kata components #7629
5d0f0d43c metrics: Add cassandra statefulset yaml
c1dcc1396 metrics: Add cassandra service yaml
2297a0d1c metrics: Add block loop pvc yaml for cassandra
e3d511946 metrics: Add block loop pv yaml for cassandra test
989027159 metrics: Add block loop pvc for cassandra test
349b89969 metrics: Add Cassandra Kubernetes benchmark for kata metrics
c52d09052 gha: static-checks: Move to the Azure instances
8815ed066 runtime: Remove config warnings
afe1a6ac5 agent: support copying of directories and symlinks
ab13ef87e runtime: propagate configmap/secrets etc changes for remote-hyp
c074ec4df runtime: Copy shared files recursively
fdcd52ff7 metrics: Add check containers are running in tensorflow mobilenet
36337ee14 metrics: Add check containers are up in tensorflow script
f700f9b0b metrics: Remove unused variable in tensorflow script
833cf7a68 metrics: Add check containers are running function
918c78308 metrics: Add check containers are up in tensorflow mobilenet script
9d57a1fab metrics: Use check containers are up in tensorflow script
1c84680d8 metrics: Add check containers are up in common script
d3e57cf45 metrics: Use collect_results function in tensorflow mobilenet test
286de046a metrics: Remove collect results function definition
9879709aa metrics: Add common functions to the common script
4746fa3da docs: Specify supported Firecracker version using `versions.yaml`
cc922be5e versions: Update firecracker version to 1.4.0
39e67b06e dragonball: vsock add fifo/pipe stream support for passed fd hybridStream
473b0d3a3 metrics: compute tensorflow statistics
03d1fa67b ci: unencrypted-image: Fix build context
eb463b38e ci: unencrypted-image: Don't fail to build on s390x
a2d731ad2 ci: create-confidential-image: Add dependent actions
d1a629622 metrics: Add nginx documentation to network README
498f7c054 metrics: Add nginx kubernetes yaml
f8a5255cf metrics: Add network nginx benchmark
43fe5d1b9 ci: k8s: tees: Ensure PR_NUMBER is exported
54f6a7850 ci: {{ pr-number }} should be {{ inputs.pr-number }}
034d7aab8 tests: k8s: Ensure the runtime classes are properly created
fac8ccf5c ci: Add build-and-publish-tee-confidential-unencrypted-image
ab5f603ff ci: k8s: Add the image used for unencrypted confidential tests
1e8fe131b k8s: tests: Take advantage of `SHIMS` and `DEFAULT_SHIM` env vars
729b2dd61 agent: avoid creating new `Vec` instances when easily avoidable
aeaec9dae tests: upgrade bats version
e66496986 metrics: install kata once and run multiple checks
baabfa9f1 agent: refine implementation of mount related code
98ba211a3 agent: fix a bug in update_ephemeral_mounts()
5333618d7 agent: make add_storage() take &[Storage] instead of Vec<Storage>
37f34781d agent: simplify function online_cpu_memory()
d3c542237 agent: refine style of code related to sandbox
71a9f6778 agent: avoid unwrap() in function do_remove_container()
84badd89d agent: avoid clone objects when possible
b23c5ed15 deps: Bump dependent crate versions
863283716 metrics: General improvements to mobilenet tensorflow test
3c319d8d4 metrics: Add iperf to gha run script
5b5caf890 gha: Add iperf network metrics
66db5b535 metrics: Add latency test to network README
c36572418 agent: avoid unnecessary calls to `Arc::clone`
4fbe0a3a5 runtime: bind-mount mounted block device into container
7e1b1949d runtime: add support for kata overlays
6c867d9e8 agent: add io.katacontainers.fs-opt.overlay-rw option
6163c3565 agent: skip mount options that start with "io.katacontainers."
b2ff97aa0 dragonball: use version 0.10.4 of `fuse-backend-rs`
845eeb4d7 agent: Allow clippy::redundant_clone in the unit tests
1163fc9de release: Revert kata-deploy changes after 3.2.0-rc0 release
3958a39d0 runtime-rs: Introduce directly attachable network
1e15369e5 metrics: Improve naming testing containers in launch times test
5dbe88330 metrics: Clean kata components before start a metric test.
3b45060b6 metrics: Add latency server yaml
9bb8451df metrics: Add latency client yaml
64fdb9870 metrics: Add network latency test
a81ad3b58 runtime-rs: Add block device handling in cloud hypervisor
3230dec95 kata-deploy: Use host's systemctl
1b21a4624 docs: Use control-plane term instead of master
28e5e9c86 runtime-rs: fix number of queues handling in dragonball share fs device
f1d8de9be runk: Allow runk to launch a container without pid namespace

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-20 14:44:50 +02:00
Fabiano Fidêncio
f6e20ac230 Merge pull request #7195 from fidencio/topic/adapt-kata-deploy-stable-to-using-ubuntu
kata-deploy-stable: Switch to using the ubuntu based payload
2023-10-20 14:42:04 +02:00
Fabiano Fidêncio
a93fdb014b kata-deploy-stable: Adapt to what we're using in the stable branch
This is basically to make sure that folks trying to use the kata-deploy
script from the main branch, to deploy **stable** kata-deploy images, do
not have a hard time.

Fixes: #7194

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-20 12:58:42 +02:00
James O. D. Hunt
79ed501a20 Merge pull request #8258 from jodh-intel/protection-fix-tdx-typo
libs: protection: Fix typo in TDX output
2023-10-20 08:36:22 +01:00
Dan Mihai
52aaf10759 agent: no endpoint blocking from agent-config.toml
Remove the ability to block access to kata agent endpoints by using
agent-config.toml. That functionality is now implemented using the
Agent Policy feature (#7573).

The CCv0 branch relied on blocking endpoints using agent-config.toml
but will set-up an equivalent default policy file instead (#8219).

Fixes: #8228

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-10-20 02:26:54 +00:00
Fabiano Fidêncio
468a3e4b53 Merge pull request #8260 from gkurz/fix-8259
ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat
2023-10-19 23:58:22 +02:00
GabyCT
5d6bdbd0a1 Merge pull request #8241 from GabyCT/topic/enableagenttest
tests: Enable agent stability test
2023-10-19 14:12:49 -06:00
Greg Kurz
36109da93f ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat
Fixes #8259

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-19 21:53:23 +02:00
GabyCT
dc295600b8 Merge pull request #8157 from GabyCT/topic/fixsevdoc
docs: Fix paths to build kernel in SNP VMs documentation
2023-10-19 11:42:03 -06:00
Gabriela Cervantes
d01daf749b tests: Adjust timeout for agent stability test
This PR adjusts the timeout for the agent stability test
to run on the gha.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-19 16:55:23 +00:00
James O. D. Hunt
9b14dda147 libs: protection: Fix typo in TDX output
Add the missing closing bracket to the output of the TDX details,
so rather than:

```bash
$ sudo kata-ctl env 2>/dev/null | grep available_guest_protection
available_guest_protection = "tdx (major_version: 1, minor_version: 0"
:                                                                    ^
:                                                           Missing ')' !
```

... we now have:

```bash
$ sudo kata-ctl env 2>/dev/null | grep available_guest_protection
available_guest_protection = "tdx (major_version: 1, minor_version: 0)"
:                                                                    ^
:                                                                   Aha!
```

Added a unit test for this scenario.

Fixes: #8257.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-19 16:06:08 +01:00
James O. D. Hunt
9336e2e492 Merge pull request #8155 from jodh-intel/runtime-rs-check-ch-tdx-build-feature
runtime-rs: ch: Add TDX CH features check
2023-10-19 14:13:08 +01:00
James O. D. Hunt
048cc70654 Merge pull request #8213 from jodh-intel/validate-hypervisor-cfg-name
runtime: Validate hypervisor section name in config file
2023-10-19 07:40:58 +01:00
Dan Mihai
99db6dff24 Merge pull request #8230 from microsoft/danmihai1/opa-data
tests: query data from the OPA service
2023-10-18 15:32:23 -07:00
James O. D. Hunt
0e0867f15d runtime-rs: ch: Add TDX CH features check
If you attempt to create a container (a TD) on a TDX system using a
custom build of Cloud Hypervisor (CH) that was not built with the `tdx`
CH feature, Kata will report the following, somewhat cryptic, CH error:

```
ApiError(VmBoot(InvalidPayload))
```

Newer versions of CH now report their build-time features in the ping
API response message so we now use that, if available, to detect this
scenario and generate a user-friendly error message instead.

This change improves the readability of `handle_guest_protection()` and
adds a couple of additional tests for that method.

Fixes: #8152.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-18 18:07:39 +01:00
James O. D. Hunt
409eadddb2 runtime-rs: ch: Improve readability of guest protection checks
Improve the way `handle_guest_protection()` is structured by inverting
the logic and checking the value of the `confidential_guest` setting
before checking the guest protection. This makes the code easier to
understand.

> **Notes:**
>
> - This change also unconditionally saves the available guest protection
>   (where previously it was only saved when `confidential_guest=true`).
>   This explains the minor unit test fix.
>
> - This change also errors if the CH driver finds an unexpected
>   protection (since only Intel TDX is currently tested).

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-18 18:06:02 +01:00
Greg Kurz
9863805752 Merge pull request #8201 from fidencio/topic/release-tag-repo-stop-tagging-the-tests-repo
release: tag_repos: Stop tagging the `tests` repo
2023-10-18 18:10:39 +02:00
Gabriela Cervantes
a58afe70b8 metrics: Add iperf udp benchmark
This PR adds the iperf udp benchmark for bandwidth measurement
for network metrics.

Fixes #8246

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-18 15:52:03 +00:00
Gabriela Cervantes
82a0814fc2 tests: Enable agent stability test
This PR enables the agent stability test for stability gha CI.

Fixes #8240

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-17 15:16:06 +00:00
Dan Mihai
32be8e3a87 tests: query data from the OPA service
Add example for querying json data from the OPA service.

Fixes: #8231

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-10-17 13:31:43 +00:00
David Esparza
d90d1c5c10 Merge pull request #8243 from dborquez/fix_systemctl_masked_query
metrics: fixes common.sh function to always return true
2023-10-16 20:17:24 -06:00
Dan Mihai
b81c0a6693 tests: encode policy file during test
Encode policy file during test - easier to understand than hard-coding
the encoded file contents.

Fixes: #8214

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-10-16 15:58:12 -07:00
David Esparza
4f9681b411 metrics: fixes common.sh function to always return true
This PR corrects the init_env() helper function so that the systemctl
query always returns true when enumerating masked services, preventing
the test from failing.
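
A minimal sketch of the idea, not the actual helper: `list_masked_services` is a hypothetical name, and the exact systemctl query used by the real function is an assumption. The point is only that `|| true` stops a non-zero exit status from aborting a caller that runs under `set -e`.

```bash
# Hypothetical helper: enumerate masked units without ever failing.
list_masked_services() {
    # `|| true` forces a zero exit status even if systemctl reports an error,
    # so a script running with `set -e` keeps going.
    systemctl list-unit-files --state=masked --no-legend 2>/dev/null || true
}
```

The caller can then filter the output (for example with grep) and decide what to skip.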

Fixes: #8242

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-16 15:57:57 -06:00
David Esparza
59e8b1d5a7 Merge pull request #8206 from dborquez/memory_footprint_test_removing_trailing_commas_to_make_json_results_file_valid
Memory footprint test removing trailing commas to make json results file valid
2023-10-16 14:31:28 -06:00
Gabriela Cervantes
2ef2b2a6dc docs: Fix paths to build kernel in SNP VMs documentation
This PR corrects the paths used to set up, build, and install the
kernel for SNP.

Fixes #8156

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-16 20:09:02 +00:00
Fabiano Fidêncio
db37692f36 Merge pull request #8226 from microsoft/danmihai1/policy-typo
policy: allow access to ReseedRandomDev
2023-10-16 19:17:31 +02:00
Peng Tao
45e82b6581 Merge pull request #8192 from bergwolf/github/deps
runtime/kata-ctl: update dependencies
2023-10-16 16:39:17 +08:00
Chao Wu
44e602d69a Merge pull request #8014 from openanolis/chao/fix_nydus_break
runtime-rs : fix Nydus support for runtime-rs + Dragonball
2023-10-16 01:30:22 -05:00
Chao Wu
408b59c02c runtime-rs: fix bugs to support Nydus v5
1. Enable virtio-fs-pro in Dragonball so that it can process the Nydus backend registry.
2. Change the passthrough readonly config for the rw layer to false so that it has accurate read-write ability.

Fixes: #8013

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-10-16 10:22:21 +08:00
Chao Wu
157caea9fe Revert "nydus: Temporarily skip tests on dragonball"
This reverts commit aba36ab188.

Fixes: #8013

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-10-16 10:22:21 +08:00
Chao Wu
678fe3cd31 Dragonball: fix Nydus config serde problem
Since the Nydus snapshotter was updated in previous commits, the config
passed through to Dragonball during mount_rafs is a RafsConfig instead of
a ConfigV2. Dragonball could only deserialize ConfigV2, so it panics.

Add support for RafsConfig.

Fixes: #8013

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-10-16 10:22:21 +08:00
Dan Mihai
b6ec621389 policy: allow access to ReseedRandomDev
Allow access to the ReseedRandomDev endpoint by default. Using false
for ReseedRandomDevRequest was unintended.

Fixes: #8225

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-10-13 21:18:27 +00:00
David Esparza
908519db9d metrics: skips docker restart when it is not installed or is masked.
To avoid errors when initializing the test environment, the
kill_processes_before_start() helper function needs to verify that
docker is installed before attempting to stop it.
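
The check can be sketched roughly as follows. `stop_if_installed` is a hypothetical name and the real helper may be structured differently; the sketch only shows probing for the binary with `command -v` before touching the service.

```bash
# Hypothetical helper: stop a service only if its binary is on the PATH.
stop_if_installed() {
    local binary="$1" service="$2"
    if ! command -v "${binary}" >/dev/null 2>&1; then
        # Nothing to stop; report it and succeed.
        echo "${binary} is not installed; skipping"
        return 0
    fi
    sudo systemctl stop "${service}"
}
```

For docker this would be called as `stop_if_installed docker docker`.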

Fixes: #8218

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-13 18:02:00 +00:00
David Esparza
c2763120aa metrics: removing trailing comma characters from json file.
This PR removes trailing commas so that the json results
file is valid.

This PR also changes the way data results are collected by
iterating through the array of memory values to calculate
their average.
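
As an illustration only (the function names and sample values below are made up, and the values are passed as arguments rather than held in an array), joining the values with a separator placed between elements avoids a trailing comma, and a simple loop yields the average:

```bash
# Join arguments with commas *between* elements, so no trailing comma remains.
join_by_comma() {
    local IFS=","
    echo "$*"
}

# Integer average of the arguments; prints 0 when called with no values.
average() {
    local total=0 value
    [ "$#" -gt 0 ] || { echo 0; return 0; }
    for value in "$@"; do
        total=$(( total + value ))
    done
    echo $(( total / $# ))
}

# Emit a small JSON object from made-up sample values.
printf '{"values": [%s], "average": %s}\n' \
    "$(join_by_comma 100 200 300)" "$(average 100 200 300)"
```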

Fixes: #8204

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-13 18:00:57 +00:00
GabyCT
1974d13122 Merge pull request #8188 from dborquez/metrics_add_fio_readme.md
metrics: removal of reference in the documentation to the fio dax subtest.
2023-10-12 10:53:55 -06:00
James O. D. Hunt
3e8cf6959c runtime: Validate hypervisor section name in config file
Previously, if you accidentally modified the name of the hypervisor
section in the config file, the default golang runtime gave a cryptic
error message ("`VM memory cannot be zero`"). This can be demonstrated
using the `kata-runtime` utility program which uses the same golang
config package as the actual runtime (`containerd-shim-kata-v2`):

```bash
$ kata-runtime env >/dev/null; echo $?
0
$ sudo sed -i 's!^\[hypervisor\.qemu\]!\[hypervisor\.foo\]!g' /etc/kata-containers/configuration.toml
$ kata-runtime env >/dev/null; echo $?
VM memory cannot be zero
1
```

The hypervisor name is now validated so that the behaviour becomes:

```bash
$ kata-runtime env >/dev/null; echo $?
0
$ sudo sed -i 's!^\[hypervisor\.qemu\]!\[hypervisor\.foo\]!g' /etc/kata-containers/configuration.toml
$ ./kata-runtime env >/dev/null; echo $?
/etc/kata-containers/configuration.toml: configuration file contains invalid hypervisor section: "foo"
1
```
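
The real check lives in the golang config package. Purely as an illustration of the validation idea, a shell sketch could look like this. The function name and the list of known hypervisor names are assumptions made for the example, not taken from the runtime.

```bash
# Assumed, illustrative list of accepted hypervisor section names.
KNOWN_HYPERVISORS="qemu clh firecracker acrn dragonball"

# Hypothetical helper: reject any "[hypervisor.<name>]" section whose
# name is not in the known list.
validate_hypervisor_sections() {
    local config="$1" name
    # sed prints just <name> for every "[hypervisor.<name>]" line.
    for name in $(sed -n 's/^\[hypervisor\.\([^]]*\)]$/\1/p' "${config}"); do
        case " ${KNOWN_HYPERVISORS} " in
            *" ${name} "*) ;;
            *)
                echo "${config}: configuration file contains invalid hypervisor section: \"${name}\"" >&2
                return 1
                ;;
        esac
    done
    return 0
}
```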

Fixes: #8212.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-12 13:53:37 +01:00
James O. D. Hunt
45d28998d9 Merge pull request #8149 from jodh-intel/runtime-rs-ch-detect-tdx-version
runtime-rs: ch: Detect Intel TDX version
2023-10-12 10:09:42 +01:00
QuanweiZhou
f904e64155 Merge pull request #8179 from Apokleos/directvol-urlEncode
runtime-rs: use the same base64 as kata-runtime/direct-volume does
2023-10-12 09:04:11 +08:00
GabyCT
bc6eadf4f6 Merge pull request #8197 from GabyCT/topic/enablescability
tests: Enable scalability test for stability CI
2023-10-11 16:41:46 -06:00
Archana Shinde
f814b1a0a2 Merge pull request #8073 from amshinde/runtime-rs-vfio-clh
runtime-rs: Add support for adding vfio device for cloud-hypervisor
2023-10-11 15:01:55 -07:00
Gabriela Cervantes
ef6388e815 tests: Remove unused function from scalability test
This PR removes an unused function from the scalability test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-11 19:44:21 +00:00
Fabiano Fidêncio
fbc8f8f466 scripts: Use install_yq from the kata-containers repo
As the file is already part of the kata-containers repo, and the tests
repo is about to become read-only, we're good to drop the tests
references from here and use everything coming from the
`kata-containers` repo instead.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-11 12:52:55 +02:00
Fabiano Fidêncio
65b1a2d277 release: tag_repos: Stop tagging / updating the tests repo
As we've moved all the tests to the `kata-containers` repo, the `tests`
repo will become a read-only repo.

Fixes: #8200

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-11 11:45:27 +02:00
James O. D. Hunt
87b760f569 runtime-rs: ch: Detect Intel TDX version
Improve the `GuestProtection` handling to detect the version of
Intel TDX available.

The TDX version is now logged by the Cloud Hypervisor driver.

Fixes: #8147.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-11 09:38:00 +01:00
alex.lyn
73e81f5e39 runtime-rs: unify base64 encoding for direct-volume
Direct-volume needs to use the same base64 character set as
kata-runtime/direct-volume does.
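
For illustration, and assuming the shared character set is the URL-safe base64 alphabet (RFC 4648, section 5), the difference from standard base64 is only in two characters. `urlsafe_base64` is a hypothetical helper name.

```bash
# Hypothetical helper: encode a string with the URL-safe base64 alphabet,
# i.e. standard base64 with '/' replaced by '_' and '+' replaced by '-'.
urlsafe_base64() {
    printf '%s' "$1" | base64 | tr -d '\n' | tr '/+' '_-'
}
```

Both sides must use the same alphabet, otherwise a path encoded by one tool cannot be decoded by the other.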

Fixes: #8175

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-10-11 14:00:13 +08:00
Gabriela Cervantes
c6463cb5ae tests: Fix path for versions yaml for soak parallel test
This PR fixes the path for versions yaml for soak parallel test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-10 22:29:20 +00:00
David Esparza
89c9454fca metrics: removal of reference in the documentation to the dax test.
This PR removes the reference in the documentation to the DAX
subtest of the FIO benchmark, because this metric is currently
WIP.

Fixes: #8159

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-10 15:55:59 -06:00
Gabriela Cervantes
30ff58904e tests: Enable scalability test for stability CI
This PR enables the scalability test for stability CI gha.

Fixes #8196

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-10 19:59:57 +00:00
GabyCT
538131ab44 Merge pull request #8154 from GabyCT/topic/addstability
tests: Enable soak parallel stability test
2023-10-10 13:53:14 -06:00
Archana Shinde
8d6f7b9096 runtime-rs: Add support for handling vfio device for cloud-hypervisor
This change adds support for adding and removing vfio devices for
cloud-hypervisor.

Fixes: #6691

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-10-10 12:25:44 -07:00
Gabriela Cervantes
e786b2b019 gha: Add install dependencies for stability tests
This PR adds the install dependencies for stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-10 16:05:48 +00:00
Chao Wu
936553ae79 Merge pull request #7505 from lisongqian/feat/dragonball_metrics
dragonball: vcpu metrics change to be recorded per vcpu
2023-10-10 10:52:40 -05:00
Wainer Moschetta
d311c3dd04 Merge pull request #7621 from wainersm/gha-run-local
ci: k8s: adapt gha-run.sh to run locally
2023-10-10 11:19:19 -03:00
David Esparza
93fef543e0 Merge pull request #8127 from dborquez/fix_iperf_check_kata_processes_issue
metrics: removes kata components and k8s deployment when test finishes
2023-10-10 07:05:24 -06:00
lisongqian
dbfe6512fc dragonball: vcpu metrics change to be recorded per vcpu
In this commit, the vcpu metrics in Dragonball are changed to be recorded per vcpu.

Fixes: #7248

Signed-off-by: lisongqian <mail@lisongqian.cn>
2023-10-10 16:22:40 +08:00
lisongqian
fa60fbe023 dragonball: METRICS is refactored to RwLock<DragonballMetrics>
In this commit, the METRICS is refactored to RwLock<DragonballMetrics>.

Fixes: #7248

Signed-off-by: lisongqian <mail@lisongqian.cn>
2023-10-10 16:22:40 +08:00
Peng Tao
500d1c5cee kata-ctl: update rustls-webpki/webpki dependency
The old ones have security issues.
ref: https://github.com/briansmith/webpki/issues/69

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-10 03:56:45 +00:00
Peng Tao
d7660d82a0 runtime: unify gopkg.in/yaml.v3 to v3.0.1
The older versions have Denial of Service issues.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-10 03:56:45 +00:00
Peng Tao
fc9a107e8e runtime: unify swag and testify dependency
So that we don't need to depend on that many versions of them.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-10 03:56:45 +00:00
Peng Tao
79ebb959c5 runtime: update runc dependency to v1.1.9
To pick up security fixes.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-10 03:56:45 +00:00
Peng Tao
7f3e8bd65e runtime: unify golang.org/x/text to v0.7.0
The older versions contain security issues.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-10 03:56:45 +00:00
Peng Tao
df325ae371 runtime: update golang.org/x/net to v0.7.0
To pick up fix for the following issue:

A maliciously crafted HTTP/2 stream could cause excessive CPU
consumption in the HPACK decoder, sufficient to cause a denial of
service from a small number of small requests.

Fixes: #8190
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-10 03:56:39 +00:00
David Esparza
bba34910df metrics: stops kata components and k8s deployment when test finishes
This PR adds a trap that runs whenever the script exits. It deletes the
iperf k8s deployment and k8s services, and deletes the kata components.

This way, when the script finishes, it verifies that there are
indeed no kata components still running.
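
The mechanism can be sketched as below. The function names are hypothetical and the cleanup body is a placeholder for the real deletion steps. The test body runs in a subshell so that the EXIT trap fires when it ends, whether it succeeds or fails.

```bash
# Placeholder for deleting the iperf deployment/services and kata components.
cleanup() {
    echo "cleanup: removing test resources"
}

# Hypothetical test body, written as a subshell function: the EXIT trap
# is scoped to the subshell and runs on any exit path.
run_test() (
    trap cleanup EXIT
    echo "running test"
    exit "${1:-0}"
)
```

Calling `run_test 1` still prints the cleanup line before returning status 1.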

Fixes: #8126

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-09 13:41:43 -06:00
Gabriela Cervantes
84e3d884e4 gha: Add general dependencies to stability tests
This PR adds the general dependencies to stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-09 17:02:49 +00:00
Gabriela Cervantes
dec3951ca5 tests: Add soak parallel stability test
This PR adds the soak parallel stability test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-09 17:02:49 +00:00
Gabriela Cervantes
0f04d527d9 tests: Enable soak parallel test
This PR enables the soak parallel test for stability test.

Fixes #8153

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-09 17:02:49 +00:00
Wainer dos Santos Moschetta
e669282c25 ci: k8s: set KUBERNETES default value
The KUBERNETES variable is mostly used by kata-deploy to decide whether to
apply k3s-specific deployments. It is also used to select the type of
kubernetes to be installed (k3s, k0s, rancher, etc.) and it is always
set on CI. When running the script locally we want a default value
to avoid `KUBERNETES: unbound variable` errors.
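
A minimal sketch of the technique, with a hypothetical function name and "k3s" chosen as the default purely for illustration: a `${VAR:-default}` expansion is safe even when the script runs with `set -o nounset`.

```bash
# Under `set -o nounset`, expanding an unset variable aborts the script
# with "KUBERNETES: unbound variable". A default expansion avoids that.
# NOTE: "k3s" is an assumed default, used here only as an example.
kubernetes_flavour() {
    local flavour="${KUBERNETES:-k3s}"
    echo "${flavour}"
}
```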

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta
c30c3ff185 tests: run k8s-volume on a given node
This test can give false positives on a multi-node cluster. Changed it to
use the new get_one_kata_node() and the modified exec_host() to run the
setup commands on a given node (that has kata installed) and ensure the
test pod is scheduled at that same node.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta
666993da8d tests: run k8s-file-volume on a given node
This test can give false positives on a multi-node cluster. Changed it to
use the new get_one_kata_node() and the modified exec_host() to run the
setup commands on a given node (that has kata installed) and ensure the
test pod is scheduled at that same node.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta
3a00fc9101 tests: exec_host() now gets the node name
exec_host() simply fails on multi-node clusters because
`kubectl get node -o name` returns a list of names. Moreover, it
returns control node names, and those nodes usually don't have kata installed.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
61c9c17bff tests: add get_one_kata_node() to tests_common.sh
The introduced get_one_kata_node() returns the first node that
has the kata-runtime=true label, i.e., supposedly a node with
kata installed.

This is useful for tests that should run on a specific worker
node in a multi-node cluster.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
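The node lookup described in 61c9c17bff could look roughly like the sketch below. It is not the actual helper from tests_common.sh: the full label key `katacontainers.io/kata-runtime=true` is an assumption (the commit message only mentions `kata-runtime=true`), and the jsonpath query is one of several ways to pick the first match.

```shell
#!/usr/bin/env bash

# Print the name of the first node carrying the given label.
# The default label key is an assumption; pass another selector as $1.
get_one_kata_node() {
    local label="${1:-katacontainers.io/kata-runtime=true}"
    kubectl get nodes --selector="${label}" \
        -o jsonpath='{.items[0].metadata.name}'
}
```

A test would call `node="$(get_one_kata_node)"` once and reuse `${node}` both for the host-side setup commands and for pinning the test pod, so that both end up on the same machine. If no node matches, the jsonpath index lookup fails, which is arguably the desired behaviour for a test helper.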
Wainer dos Santos Moschetta
68f083c4d0 ci: k8s: set KATA_HYPERVISOR default value
Let KATA_HYPERVISOR be qemu by default in gha-run.sh as this variable
is required to tweak some configurations of kata-deploy.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
6677a61fe4 ci: k8s: configurable deploy kata timeout
The deploy-kata() of gha-run.sh will wait for 10 minutes for the kata
deploy installation to finish. This allows users of the script to override
that value by exporting the KATA_DEPLOY_WAIT_TIMEOUT environment
variable.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
200e542921 ci: k8s: shellcheck fixes to gha-run.sh
Fixed a couple of warnings that shellcheck emitted and disabled others:
 * SC2154 (var is referenced but not assigned)
 * SC2086 (Double quote to prevent globbing and word splitting)

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
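For readers unfamiliar with SC2086 from 200e542921, the following self-contained sketch shows the word splitting it warns about. The helper `count_args` and the sample value are made up for illustration.

```shell
#!/usr/bin/env bash

# Print how many arguments were received.
count_args() { echo "$#"; }

name="my file.txt"

# Unquoted: the value is split on whitespace into two words (SC2086).
# shellcheck disable=SC2086
unquoted=$(count_args $name)

# Quoted: the value is passed through as a single argument.
quoted=$(count_args "$name")

echo "unquoted=${unquoted} quoted=${quoted}"
```

Quoting is the usual fix; the `# shellcheck disable=` directive is the mechanism for silencing a warning where splitting is intended.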
Wainer dos Santos Moschetta
4af78be13a kata-deploy: re-format kata-[deploy|cleanup].yaml
The tests/integration/kubernetes/gha-run.sh script runs `yq write` a
couple of times to edit the kata-[deploy|cleanup].yaml, resulting
in the file being formatted again. This is annoying because it leaves
the git tree dirty.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
d54e6d9cda ci: k8s: run_tests() for kcli
The only difference to the other platforms is that it needs to
export KUBECONFIG.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
c2ef1f0fb0 ci: k8s: add deploy-kata-kcli() to gh-run.sh
The deploy-kata-kcli() behaves like the other kata deployments for
bare-metal (e.g. sev, tdx, etc.) except that KUBECONFIG
should be exported.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
d2be8eef1a ci: k8s: add cleanup-kcli() to gha-run.sh
The cleanup-kcli() behaves like other clean up for bare-metal (e.g. sev,
tdx...etc) except that KUBECONFIG should be exported.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
cbb9aa15b6 ci: k8s: set default image for deploy_kata()
On CI workflows the variables DOCKER_REGISTRY, DOCKER_REPO and
DOCKER_TAG are exported to match the built image. However, when running
the script outside of CI context, a developer might just use the latest
image which in this case will be
`quay.io/kata-containers/kata-deploy-ci:kata-containers-latest`.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
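A sketch of how the default image from cbb9aa15b6 can be composed, assuming each of the three variables gets its own fallback. The function name `kata_deploy_image` and the way the default is split across registry, repo and tag are assumptions; only the combined default reference comes from the commit message.

```shell
#!/usr/bin/env bash

# Compose the kata-deploy image reference, preferring values exported by CI
# and falling back to the "latest" CI image otherwise.
kata_deploy_image() {
    local registry="${DOCKER_REGISTRY:-quay.io}"
    local repo="${DOCKER_REPO:-kata-containers/kata-deploy-ci}"
    local tag="${DOCKER_TAG:-kata-containers-latest}"
    echo "${registry}/${repo}:${tag}"
}
```

Exporting only `DOCKER_TAG` before calling the function swaps the tag while keeping the default registry and repository.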
Wainer dos Santos Moschetta
89bef7d036 ci: k8s: create k8s clusters with kcli
Adapted the gha-run.sh script to create a Kubernetes cluster locally
using the kcli tool.

Use `./gha-run.sh create-cluster-kcli` to create it, and
`./gha-run.sh delete-cluster-kcli` to delete.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Fabiano Fidêncio
1280f85343 Merge pull request #8171 from bergwolf/github/fix-up-gha
GHA: fix up referenced yaml exceeding 20 limit problem
2023-10-09 09:37:03 +02:00
Peng Tao
954d40cce5 gha: combine coco jobs into a single yaml
So that we don't risk exceeding the GHA limit of 20 referenced yaml files
that easily.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-08 14:22:01 +00:00
Peng Tao
b60e0a9b57 gha: combine basic amd64 jobs into a single yaml
GHA has an undocumented limitation that there can be at most 20
referenced yamls in a single yaml file. We work around it by combining
multiple jobs into a single yaml file.

Fixes: #8161
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-08 13:55:01 +00:00
Fabiano Fidêncio
108db0a721 Merge pull request #8162 from sprt/sprt/unbreak-ci
gha: ci: Revert tracing test PR to unbreak CI
2023-10-08 10:13:46 +02:00
Aurélien Bombo
e9bd852113 gha: ci: Revert tracing test PR to unbreak CI
Revert "Merge pull request #8115 from fidencio/topic/ci-add-tracing-tests"

This unbreaks CI as seen in https://github.com/kata-containers/kata-containers/actions/runs/6434757133

Fixes: #8161

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-10-06 14:13:17 -07:00
James O. D. Hunt
16fe81f27c Merge pull request #8124 from jodh-intel/ch-enable-feature
runtime-rs: ch: Enable feature
2023-10-06 13:02:08 +01:00
Fabiano Fidêncio
fa6786d1d7 Merge pull request #8117 from fidencio/topic/ci-add-runk-tests
gha: ci: Port runk tests over
2023-10-06 11:19:55 +02:00
Fabiano Fidêncio
8fec654716 Merge pull request #8115 from fidencio/topic/ci-add-tracing-tests
ci: gha: Port tracing tests over
2023-10-06 10:06:57 +02:00
GabyCT
265f53e594 Merge pull request #8082 from dborquez/enable_fio_on_ctr
Enable fio test using containerd client
2023-10-05 17:26:22 -06:00
GabyCT
c8b9ec1cb5 Merge pull request #8108 from GabyCT/topic/ghastability
gha: Add stability tests workflow for gha
2023-10-05 17:10:10 -06:00
James O. D. Hunt
b8a46a4b85 runtime-rs: ch: Enable feature
Enable the Cloud Hypervisor driver (the `cloud-hypervisor` build feature) for the rust runtime.

Fixes: #6264.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-05 17:58:39 +01:00
Gabriela Cervantes
0f2dc8c675 gha: Add containerd stability tests to ci yaml
This PR adds containerd stability tests to ci yaml.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-05 15:21:24 +00:00
Fabiano Fidêncio
89f73e658d Merge pull request #8110 from fidencio/topic/gha-be-more-specific-about-the-arm-runners
gha: arm64: Ensure the builder is arm64-builder
2023-10-04 21:20:08 +02:00
Fabiano Fidêncio
da91c9df88 ci: Port runk tests to this repo
I'm basically moving the runk tests from the tests repo to this one, and
I'm adding the "Signed-off-by:" of every single contributor to the tests.

Fixes: #8116

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Chen Yiyang <cyyzero@qq.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-04 20:41:29 +02:00
Fabiano Fidêncio
7f23772763 ci: Add placeholder for runk tests
The runk test has been executed as part of the former "ubuntu" jenkins
CI.

We're porting it to GHA and running it against LTS containerd.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-04 20:40:32 +02:00
Fabiano Fidêncio
9205acc3d2 ci: Move tracing tests here
I'm basically moving the tracing tests from the tests repo to this one,
and I'm adding the "Signed-off-by:" of every single contributor to the
tests.

Fixes: #8114

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
2023-10-04 20:02:27 +02:00
Gabriela Cervantes
85d290a048 gha: Add stability gha run script
This PR adds the stability gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-04 17:45:45 +00:00
Gabriela Cervantes
54f0c8f88e gha: Add stability tests workflow for gha
This PR adds the stability test workflow for gha for the kata CI.

Fixes #8107

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-04 16:32:13 +00:00
Fabiano Fidêncio
3bb2923e5d ci: Add placeholder for tracing tests
The tracing tests are currently running as part of the Jenkins CI with
the following setups:
* Container Engines: containerd
* VMMs: QEMU | Cloud Hypervisor
* Snapshotters: overlayfs | devmapper

We'll be restricting those tests to be running on LTS version of
containerd, without devmapper.

As is known, due to our GHA limitation, this is just a placeholder and
the tests will actually be added in the next iterations.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-04 18:02:02 +02:00
Fabiano Fidêncio
2c3bf406dc ci: Create a function to install docker
This will be re-used in other tests as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-04 15:01:51 +02:00
Fabiano Fidêncio
c2cce12de5 Merge pull request #8100 from fidencio/topic/kata-deploy-build-agent
kata-deploy: Build kata-agent as we build all the other components
2023-10-04 11:56:03 +02:00
Steve Horsman
c430cc3707 Merge pull request #8098 from stevenhorsman/k8s-registry-suite
versions: migrate out of k8s.gcr.io
2023-10-04 10:51:39 +01:00
Fabiano Fidêncio
119f03de26 gha: arm64: Ensure the builder is arm64-builder
Otherwise we'll use any arm64 machine that's added as a runner, and
whenever new machines are added those may end up being only used for
running some specific set of the tests.

Fixes: #8109

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-04 11:08:11 +02:00
Fabiano Fidêncio
59b9380d1c Merge pull request #8093 from stevenhorsman/crictl-pod-config-update
doc: Update crictl pod-config
2023-10-04 10:49:04 +02:00
David Esparza
8c498ef5ee metrics: Use jq tool to pretty-print json metrics output
This PR enables the use of jq pretty-print feature to
improve the formatting of metric results json files.

Fixes: #8081

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-03 23:33:19 -06:00
David Esparza
a2159a6361 metrics: Enables FIO test for kata containers
FIO benchmark is enabled to measure IO in Kata
at different latencies using containerd client,
in order to complement the CI metrics testing set.

This PR also deprecates the previous FIO benchmark
based on k8s.

Fixes: #8080

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-03 23:32:38 -06:00
Fabiano Fidêncio
f337315952 Merge pull request #8106 from fidencio/topic/gha-fix-k0s-related-cis
gha: Fix k0s deployment
2023-10-03 21:47:40 +02:00
GabyCT
d1d9af5de2 Merge pull request #8085 from GabyCT/topic/stabilitytests
tests: Add stability test for kata CI
2023-10-03 11:28:49 -06:00
Fabiano Fidêncio
70e7ec3e23 gha: Fix k0s deployment
The tests are failing when setting up k0s, and that happens because we
download a kubectl binary matching the kubernetes version k0s is using,
and we do that by:
```
sudo k0s kubectl version --short 2>/dev/null | ...
```

With kubectl 1.28, which is now the default on k0s, `kubectl version
--short` has been removed, leading us to an empty string, which then causes
the error in the CI.

Fixes: #8105

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 17:21:40 +02:00
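One way to obtain the version without `--short`, as discussed in 70e7ec3e23, is to parse the plain `kubectl version` output. The sketch below is not the fix that was merged; it assumes the output contains a line of the form `Client Version: vX.Y.Z`, and the function name `parse_client_version` is hypothetical.

```shell
#!/usr/bin/env bash

# Read `kubectl version` output on stdin and print the client version.
# Assumes a "Client Version: vX.Y.Z" line; prints nothing if it is absent.
parse_client_version() {
    sed -n 's/^Client Version: \(v[0-9][^[:space:]]*\).*$/\1/p' | head -n 1
}
```

It would be used as `sudo k0s kubectl version 2>/dev/null | parse_client_version`. Checking that the result is non-empty before downloading a matching kubectl binary would turn the silent empty-string case into an explicit error.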
Fabiano Fidêncio
560bbffb57 packaging: tools: Remove set -x leftover
This was used for debugging, and ended up being merged with that.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 15:33:55 +02:00
Fabiano Fidêncio
18fa483d90 packaging: release: Mention newly added images
We've added two new container builder images recently, one for the
components under `src/tools` and another one for the Kata Containers
agent.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 15:33:55 +02:00
Fabiano Fidêncio
ca3b888371 packaging: tools: Fix container image env var name
This should be TOOLS_CONTAINER_BUILDER instead of
VIRTIOFSD_CONTAINER_BUILDER.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 15:33:55 +02:00
Fabiano Fidêncio
5ca66795c7 packaging: Allow passing the TOOLS_CONTAINER_BUILDER
This follows what we've been doing for all the components we're
building, but was missed as part of #8077.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 15:33:55 +02:00
Fabiano Fidêncio
02acef9575 gha: Build the kata-agent as part of our workflows
The kata-agent binary won't be released, just built so it can be used,
later on, as part of our tests and as part of the rootfs build.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 15:33:55 +02:00
Fabiano Fidêncio
5208386ab1 packaging: Build the kata-agent
Let's add the needed functions to start building the kata-agent, with or
without the OPA support.

For now this build is not used as part of the rootfs build, but later on
it will be (not as part of this series, though).

Fixes: #8099

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 15:33:55 +02:00
Fabiano Fidêncio
1727487eef agent: Allow specifying DESTDIR and AGENT_POLICY via env vars
This will help to build the agent binary as part of the kata-deploy
localbuild, as we need to pass the DESTDIR to where the agent will be
installed, and also whether we're building the agent with policy support
enabled or not.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 14:18:45 +02:00
Fabiano Fidêncio
45c1188839 packaging: Add get_agent_image_name()
This will be used for building the kata-agent.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 14:17:38 +02:00
Wainer dos Santos Moschetta
0db8fb8f98 versions: migrate out of k8s.gcr.io
k8s.gcr.io has been deprecated for a while now and has been redirected to
registry.k8s.io. However on some bare-metal machines in our testing
pools that redirection is not working, so let's just replace the
registries.

Fixes #8098
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit b2c3bca558c38deff2117d5909d9071c23c05590)
2023-10-03 11:52:59 +01:00
stevenhorsman
a1a0543671 doc: Fix spelling
Spell check failed with:
```
[kata-spell-check.sh:275] WARNING: Word 'overcommitment':
did you mean one of the following?: over commitment, over-commitment,
commitment
```
So update this to pass the static checks

Fixes: #
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-10-03 10:17:38 +01:00
Gabriela Cervantes
6339605a14 tests: Add general stability fixes
This PR adds general stability fixes.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-02 19:42:46 +00:00
stevenhorsman
59ae244442 doc: Update crictl pod-config
- Ensure that our documented crictl pod config file contents have
uid and namespace fields for compatibility with crictl 1.24+

This avoids a user potentially hitting the error:
```
getting sandbox status of pod "d3af2db414ce8": metadata.Name,
metadata.Namespace or metadata.Uid is not in metadata
"&PodSandboxMetadata{Name:nydus-sandbox,Uid:,Namespace:default,Attempt:1,}"

getting sandbox status of pod "-A": rpc error: code = NotFound desc = an
error occurred when try to find sandbox: not found
```

Fixes: #8092
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
(cherry picked from commit 8f8c2215)
2023-10-02 14:53:46 +01:00
Gabriela Cervantes
fd19f4082f tests: Add agent stability test
This PR adds the agent stability test to stability test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 22:37:02 +00:00
Gabriela Cervantes
215577032f tests: Add cassandra stress in stability tests
This PR adds the cassandra stress at the stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 22:34:45 +00:00
GabyCT
a890ad3a16 Merge pull request #8066 from GabyCT/topic/urlvra
docs: Update url in kata vra document
2023-09-28 14:59:34 -06:00
Zvonko Kaiser
79e33c211c Merge pull request #7325 from zvonkok/vfio-sandbox-id-debug
gpu: Adding CDI support for cold and hot-plug of VFIO devices
2023-09-28 21:31:12 +02:00
Gabriela Cervantes
f2d3ea988d tests: Add stressng dockerfile for stability tests
This PR adds the stressng dockerfile for stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 16:35:22 +00:00
Gabriela Cervantes
6493aa309e tests: Add stressor CPU test for stability tests
This PR adds the stressor CPU test for stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 16:33:08 +00:00
Gabriela Cervantes
ef68a3a36b metrics: Add stability test for kata CI
This PR adds the stability test for kata containers repository.

Fixes #8084

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 16:23:36 +00:00
David Esparza
f7ef45b167 Merge pull request #8077 from fidencio/topic/kata-deploy-ship-the-tools
kata-deploy: build & ship the rust components from src/tools/
2023-09-28 09:59:19 -06:00
Zvonko Kaiser
7c934dc7da gpu: Fix cold-plug of VFIO devices
We need to do proper sandbox sizing when we're doing cold-plug. Introduce CDI,
the de-facto standard for enabling devices in containers. containerd
will pass through annotations for accumulated CPU, memory and now CDI
devices. With that information sandbox sizing can be derived correctly.

Fixes: #7331

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-09-28 09:49:13 +00:00
GabyCT
fcc755fc3b Merge pull request #8068 from GabyCT/topic/limitlatency
metrics: Add latency value limits for kata CI
2023-09-27 13:28:41 -06:00
Greg Kurz
defbb64ac8 Merge pull request #8036 from rye-stripe/bugfix/overhead-metrics
runtime: fix reading cgroup stats of sandboxes
2023-09-27 19:39:55 +02:00
Archana Shinde
95455e6fe8 Merge pull request #8058 from likebreath/0925/clh_v35.0
Upgrade to Cloud Hypervisor v35.0
2023-09-27 10:39:32 -07:00
Gabriela Cervantes
8d66ef5185 metrics: Increase qemu jitter value
This PR increases qemu jitter value.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-27 17:31:07 +00:00
Gabriela Cervantes
5600e28b54 metrics: Increase jitter value for clh
This PR increases jitter value for clh.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-27 17:30:19 +00:00
Fabiano Fidêncio
a6b1f5e21b ci: Build src/tools components as part of our tests / releases
Build those as part of our CI and release workflows.

Fixes #5520 #5348

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:50:25 +02:00
Fabiano Fidêncio
501a168a81 kata-deploy: Build components from src/tools
Let's add targets and actually enable users and ourselves to build those
components in the same way we build the rest of the project.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:49:02 +02:00
Fabiano Fidêncio
6ef42db5ec static-build: Add scripts to build content from src/tools
As we'd like to ship the content from src/tools, we need to build them
in the very same way we build the other components, and the first step
is providing scripts that can build those inside a container.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:48:56 +02:00
Fabiano Fidêncio
4d08ec29bc packaging: Add get_tools_image_name()
This will be used for building all the (rust) components from src/tools.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:48:35 +02:00
Fabiano Fidêncio
98097c96de packaging: Use git abbreviated hash
This will make it easier to build images that rely on the hashes of
several directories.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:48:30 +02:00
Fabiano Fidêncio
8b25e90027 Merge pull request #8075 from fidencio/topic/ci-add-kata-monitor-tests
ci: Port kata-monitor tests from Jenkins to GHA
2023-09-27 15:48:46 +02:00
Fabiano Fidêncio
489caf1ad0 ci: kata-monitor: Move tests over
Let's move, adapt, and use the kata-monitor tests from the tests repo.
In this PR I'm keeping the SoB of every single contributor who
touched those tests in the past.

Fixes: #8074

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-27 11:40:31 +02:00
Fabiano Fidêncio
a3fb067f1b ci: Add placeholder for kata-monitor tests
The kata-monitor tests are currently running as part of the Jenkins CI
with the following setups:
* Container Engines: CRI-O | containerd
* VMMs: QEMU

When using containerd, we're testing it with:
* Snapshotter: overlayfs | devmapper

We will stop running those tests on devmapper / overlayfs as that would
hardly uncover a functionality issue.

Also, we're restricting this to run with the LTS version of containerd,
when containerd is used.

As it's known due to our GHA limitation, this is just a placeholder and
the tests will actually be added in the next iterations.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:31:17 +02:00
Fabiano Fidêncio
57cb4ce204 ci: Make install_kata aware of container engines
This will help us when running tests using CRI-O.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:31:17 +02:00
Fabiano Fidêncio
de1eeee334 ci: Create a generic install_crio function
This will serve us quite well in the upcoming tests addition, which will
also have to be executed using CRI-O.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:26:13 +02:00
Fabiano Fidêncio
64a2000859 ci: Add install_cni_plugins helper
This will become handy when doing tests with CRI-O, as CRI-O doesn't
install the CNI plugins for us.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:26:13 +02:00
Fabiano Fidêncio
8132fe15c9 ci: Modify containerd default config
Let's ensure we have runc running with `SystemdCgroup = false`,
otherwise we'll face failures when running tests depending on runc on
Ubuntu 22.04, with LTS containerd.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:16:12 +02:00
Gabriela Cervantes
8cb7df1bed metrics: Add checkmetrics for latency test
This PR adds the checkmetrics for latency test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 19:11:08 +00:00
Gabriela Cervantes
e90440ae24 metrics: Add qemu latency value limit
This PR adds the qemu latency value limit for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 17:30:09 +00:00
Gabriela Cervantes
a74a8f8a9d metrics: Add latency value limits for kata CI
This PR adds latency value limits for kata CI.

Fixes #8067

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 17:29:07 +00:00
Gabriela Cervantes
d7def8317a metrics: Fix general check static warnings
This PR fixes general check static warnings.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 16:30:59 +00:00
GabyCT
309103169d Merge pull request #8056 from GabyCT/topic/fixlatencypath
metrics: Fix latency yamls path
2023-09-26 10:16:55 -06:00
Gabriela Cervantes
928553d1ba docs: Update url in kata vra document
This PR updates the url in kata vra document.

Fixes #8065

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 16:13:12 +00:00
GabyCT
5c0afaacf4 Merge pull request #8018 from GabyCT/topic/fixreadme
metrics: Fix metrics README
2023-09-26 09:51:47 -06:00
David Esparza
83326f89b3 Merge pull request #8054 from GabyCT/topic/fixcrdoc
metrics: Fix C-Ray documentation
2023-09-26 09:50:19 -06:00
James O. D. Hunt
31478b9c33 Merge pull request #7944 from jodh-intel/runtime-rs-ch-enable-tdx
runtime-rs: ch: Enable Intel TDX
2023-09-26 14:11:12 +01:00
James O. D. Hunt
b0a3293d53 runtime-rs: ch: Enable Intel TDX
Allow Cloud Hypervisor to create a confidential guest (a TD or
"Trust Domain") rather than a VM (Virtual Machine) on Intel systems
that provide TDX functionality.

> **Notes:**
>
> - At least currently, when built with the `tdx` feature, Cloud Hypervisor
>   cannot create a standard VM on a TDX capable system: it can only create
>   a TD. This implies that on TDX capable systems, the Kata Configuration
>   option `confidential_guest=` must be set to `true`. If it is not, Kata
>   will detect this and display the following error:
>
>   ```
>   TDX guest protection available and must be used with Cloud Hypervisor (set 'confidential_guest=true')
>   ```
>
> - This change expands the scope of the protection code, changing
>   Intel TDX specific booleans to more generic "available guest protection"
>   code that could be "none" or "TDX", or some other form of guest
>   protection.

Fixes: #6448.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 10:55:25 +01:00
James O. D. Hunt
523399c329 runtime-rs: ch: Add more consts
Introduce a few new constants (for PCI segment count and FS queues) and
move the disk queue constants to `convert.rs` to allow them to be used
there too.

> **Note:**
>
> This change gives the `ShareFs` code its own set of values rather
> than relying on the disk queue constants.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
James O. D. Hunt
dea8065811 runtime-rs: ch: Remove unused function
Delete the `handle_pending_devices_after_boot()` function which is no
longer required.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
James O. D. Hunt
995f2c015f runtime-rs: ch: Only handle particular pending device types
Modify the Cloud Hypervisor `add_device()` method to add `ShareFs` and
`Network` devices to the list of pending devices since only these two
device types need to be cached before VM startup. Full details in the
comments.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
James O. D. Hunt
b1b96a5c49 runtime-rs: ch: Remove erroneous "virtio-blk-mmio" check
Remove the `VIRTIO_BLK_MMIO` check which appears to have been added
erroneously in the first place.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
Gabriela Cervantes
9ac29b8d38 metrics: Add init_env function to latency test
This PR adds the init_env function to latency test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-25 22:06:00 +00:00
Bo Chen
dfd0c9fa9a runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v35.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #8057

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-09-25 12:22:37 -07:00
Bo Chen
8f9f087e35 versions: Upgrade to Cloud Hypervisor v35.0
Details of this release can be found in our roadmap project as iteration
v35.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Fixes: #8057

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-09-25 12:22:01 -07:00
Fabiano Fidêncio
a4daa86535 Merge pull request #8028 from fidencio/topic/ci-test-with-crio-part-2
ci: k8s: crio: Follow up patches to have CRI-O also working as part of our CI
2023-09-25 18:40:42 +02:00
Gabriela Cervantes
81c8babca9 metrics: Fix latency yamls path
This PR fixes the latency yamls path for the latency test for
kata metrics.

Fixes #8055

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-25 15:52:24 +00:00
Gabriela Cervantes
4815736820 metrics: Fix C-Ray documentation
This PR fixes the C-Ray documentation for kata metrics.

Fixes #8052

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-25 15:27:58 +00:00
Fabiano Fidêncio
ef63d67c41 ci: crio: Trail '\r' from exec_host() output
We've faced this as part of the CI, only happening with the CRI-O tests:
```
 not ok 1 Test readonly volume for pods
 # (from function `exec_host' in file tests_common.sh, line 51,
 #  in test file k8s-file-volume.bats, line 25)
 #   `exec_host "echo "$file_body" > $tmp_file"' failed with status 127
 # [bats-exec-test:38] INFO: k8s configured to use runtimeclass
 # bash: line 1: $'\r': command not found
 #
 # Error from server (NotFound): pods "test-file-volume" not found
```

I must say I didn't dig into figuring out why this is happening, but we
may be safe enough to just trail the '\r', as long as all the tests keep
passing on containerd.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-25 16:42:18 +02:00
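The `$'\r': command not found` failure quoted in ef63d67c41 is what happens when a trailing carriage return survives in captured output. A minimal sketch of stripping it, using a hypothetical `strip_cr` filter rather than the actual change made to exec_host():

```shell
#!/usr/bin/env bash

# Remove every carriage return from stdin, so output that arrives with
# CRLF line endings can be used safely in later commands.
strip_cr() {
    tr -d '\r'
}

printf 'hello\r\n' | strip_cr
```

Piping a command's output through such a filter leaves LF-only output untouched, which is consistent with the commit's expectation that the containerd tests keep passing.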
Fabiano Fidêncio
74c12b2927 ci: crio: Enable default capabilities
We need the default capabilities to be enabled, especially `SYS_CHROOT`,
in order to have tests accessing the host to pass.

A huge thanks to Greg Kurz for spotting this and suggesting the fix.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2023-09-25 14:56:15 +02:00
Fabiano Fidêncio
358dc2f569 kata-deploy: Fix CRI-O detection
Some of the "k8s distros" allow using CRI-O in a non-official way, and
if that's done we cannot simply assume they're on containerd, otherwise
kata-deploy will simply not work.

In order to avoid such an issue, let's check for `cri-o` as the container
engine first, and only proceed with the checks for the "k8s
distros" after we rule out that CRI-O is being used.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-25 14:56:15 +02:00
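The ordering described in 358dc2f569 can be sketched as a pure function over the node's `.status.nodeInfo.containerRuntimeVersion` string. The function name `detect_runtime`, the set of distro patterns and the sample strings are assumptions for illustration, not kata-deploy's actual code.

```shell
#!/usr/bin/env bash

# Map a containerRuntimeVersion string to the engine or distro to configure.
# CRI-O is tested first so that distro-specific checks can never mask it.
detect_runtime() {
    local version="$1"
    case "${version}" in
        cri-o*) echo "cri-o" ;;
        *k3s*)  echo "k3s" ;;
        *k0s*)  echo "k0s" ;;
        *)      echo "containerd" ;;
    esac
}

detect_runtime "cri-o://1.28.1"
detect_runtime "containerd://1.7.6-k3s1"
```

Because `case` takes the first matching branch, putting the `cri-o*` pattern at the top encodes the "rule out CRI-O first" decision directly in the control flow.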
Fabiano Fidêncio
ebaa4fa4c1 ci: crio: Pass -y to apt
That was something overlooked during my tests. :-/

Fixes: #8005

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-25 14:56:15 +02:00
GabyCT
11cf0e2d28 Merge pull request #8038 from GabyCT/topic/latency
metrics: Enable latency test in gha run script
2023-09-22 16:57:53 -06:00
GabyCT
3ef57b335e Merge pull request #8045 from jepio/fix-docker-ownership
local-build: Fix .docker ownership before build-payload
2023-09-22 14:43:38 -06:00
Archana Shinde
9bb9a3e7a4 Merge pull request #7966 from amshinde/runtime-rs-network-clh
runtime-rs: Add network support for cloud-hypervisor
2023-09-22 13:08:09 -07:00
Gabriela Cervantes
97e73b2234 metrics: Fix spelling warnings
This PR fixes general spelling warnings detected by the spelling check.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-22 15:50:51 +00:00
Gabriela Cervantes
36c8cd6f1f metrics: Fix metrics README
This PR fixes the network metrics section at the README by leaving
the current tests that we have in our kata metrics.

Fixes #8017

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-22 15:28:58 +00:00
Fabiano Fidêncio
c5a5a0c95e Merge pull request #8012 from arronwy/strip
osbuild: Reduce guest components binary size with strip
2023-09-22 15:45:38 +02:00
Fabiano Fidêncio
9d190f2390 Merge pull request #8042 from GabyCT/topic/pandoc
gha: Add pandoc as a dependency for static checks
2023-09-22 15:31:18 +02:00
Jeremi Piotrowski
15425a2b80 local-build: Fix .docker ownership before build-payload
The permissions on .docker/buildx/activity/default are regularly broken by us
passing docker.sock + $HOME/.docker to a container running as root and then
using buildx inside. Fixup ownership before executing docker commands.

Fixes: #8027
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-22 13:44:53 +02:00
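The ownership fix-up from 15425a2b80 amounts to a recursive `chown` back to the invoking user before any docker command runs. The sketch below uses a hypothetical `fix_ownership` helper and an optional `SUDO` variable; neither name comes from the commit.

```shell
#!/usr/bin/env bash

# Hand a directory tree back to the invoking user.
# Set SUDO=sudo when files inside were left behind owned by root.
fix_ownership() {
    local dir="$1"
    # Nothing to do when the directory does not exist.
    [ -d "${dir}" ] || return 0
    # SUDO is intentionally unquoted so an empty value expands to nothing.
    ${SUDO:-} chown -R "$(id -u):$(id -g)" "${dir}"
}
```

In the scenario from the commit message it would be called as `SUDO=sudo fix_ownership "${HOME}/.docker"` before the build-payload step.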
Jeremi Piotrowski
a5338e885e Merge pull request #8030 from portersrc/8027-ci-rootfs-image-build-asset-is-failing-oras
ci: rootfs-image build-asset is failing
2023-09-22 11:07:50 +02:00
Chao Wu
6f98fbafde Merge pull request #6706 from guixiongwei/feat/thp
feat(runtime-rs): introduce huge page mode to select VM RAM's backend
2023-09-22 15:27:06 +08:00
Gabriela Cervantes
13ca7d9f97 gha: Add pandoc as a dependency for static checks
To avoid the failure of not finding pandoc command this PR adds that
package as a dependency for static checks.

Fixes #8041

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-21 20:14:41 +00:00
Jeremi Piotrowski
28dd5ae91e Merge pull request #7799 from UiPath/clh-directio-support
clh: Direct IO support for block devices
2023-09-21 19:16:08 +02:00
David Esparza
6de9f39895 Merge pull request #8020 from GabyCT/topic/fixhunspell
gha: Install hunspell for static checks
2023-09-21 10:58:40 -06:00
Gabriela Cervantes
08bc8e4db4 metrics: Add latency benchmark for gha
This PR adds the latency benchmark for gha for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-21 16:14:39 +00:00
Gabriela Cervantes
6776b55d7e metrics: Enable latency test in gha run script
This PR enables the latency test for gha run script for kata metrics.

Fixes #8037

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-21 16:11:58 +00:00
Peteris Rudzusiks
94e2ccc2d5 runtime: fix reading cgroup stats of sandboxes
The cgroup stats come from resourcecontrol package in the form of pointers
to structs. The sandbox Stat() method incorrectly was expecting structs.
This caused the cpu and memory stats to always be 0, which in turn caused
incorrect pod overhead metrics.

Fixes #8035

Signed-off-by: Peteris Rudzusiks <rye@stripe.com>
2023-09-21 17:00:53 +02:00
Alexandru Matei
d507d189bb fc: Add support for noflush cache option
Firecracker supports noflush semantics via the Unsafe cache type.
There is no support for direct i/o, so remove it from the config file.

Fixes: #7823

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2023-09-21 14:48:24 +03:00
Alexandru Matei
2ca781518a clh: Direct IO support for block devices
Clh supports direct i/o for disks. It doesn't
offer any support for noflush, so stop passing
that option to the cloud-hypervisor internal config.

Fixes: #7798

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2023-09-21 14:48:24 +03:00
Fabiano Fidêncio
dd27912f31 Merge pull request #8032 from fidencio/topic/ci-make-push-after-build-be-trigger-by-workflow-dispatch
ci: Trigger payload-after-push on workflow_dispatch
2023-09-21 10:25:24 +02:00
Fabiano Fidêncio
0c95697cc4 ci: Trigger payload-after-push on workflow_dispatch
This will allow us to easily test failures and fixes on that workflow.

Fixes: #8031

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-21 09:24:13 +02:00
Chris Porter
28cbc3b51c ci: rootfs-image build-asset is failing
Fixes: #8027

Signed-off-by: Chris Porter <porter@ibm.com>
2023-09-21 00:58:42 -05:00
Fabiano Fidêncio
21f6f9a173 Merge pull request #8016 from fidencio/topic/ci-test-with-crio-part-1
ci: Actually enable the CRI-O tests
2023-09-21 07:42:27 +02:00
Wainer Moschetta
87e64a07ed Merge pull request #7979 from beraldoleal/gogo-removal
protocol: remove gogoprotobuff tests
2023-09-20 22:38:10 -03:00
Gabriela Cervantes
87a8616488 gha: Install hunspell for static checks
The static checks seem to be failing because the hunspell package is
missing. This PR installs it.

Fixes #8019

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-20 16:58:10 +00:00
Fabiano Fidêncio
8c3c50ca8a ci: Actually enable the CRI-O tests
The test has been added to the repo, but we have to also add it to the
list of jobs to be executed.

Fixes: #8005

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 18:01:25 +02:00
David Esparza
03554c799a Merge pull request #8006 from fidencio/topic/ci-test-with-crio-part-0
ci: k8s: Also run tests with CRI-O
2023-09-20 07:45:17 -06:00
Fabiano Fidêncio
c6a9e50c37 Merge pull request #8004 from microsoft/danmihai1/quoted-spaces
runtime: support kernel params including spaces
2023-09-20 12:10:51 +02:00
Wang, Arron
3a6510ad61 osbuild: Reduce guest components binary size with strip
opa_linux_amd64_static 38M => 27M
kata-agent 30M => 23M

ls -alh opa_linux_amd64_static
-rw-rw-r-- 1 arron arron 38M Jul 28 01:59 opa_linux_amd64_static
➜ kata-containers git:(main) ✗ strip opa_linux_amd64_static
➜ kata-containers git:(main) ✗ ls -alh opa_linux_amd64_static
-rw-rw-r-- 1 arron arron 27M Sep 20 16:12 opa_linux_amd64_static

ls -alh ./usr/bin/kata-agent
-rwxr-xr-x. 1 root root 30M Jul 30 23:41 ./usr/bin/kata-agent
ls -alh ./usr/bin/kata-agent
-rwxr-xr-x. 1 root root 23M Sep 20 16:13 ./usr/bin/kata-agent

Fixes: #8011

Signed-off-by: Wang, Arron <arron.wang@intel.com>
2023-09-20 16:23:17 +08:00
Fabiano Fidêncio
07a6e63a6b ci: k8s: rke2: Use sudo to call systemd
Otherwise we'll face the following error:
```
Failed to enable unit: Interactive authentication required.
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 08:48:29 +02:00
Fabiano Fidêncio
03b82e8484 ci: k8s: Add a CRI-O test
Let's make sure we'll also be testing k8s using CRI-O.

For now, we'll only be running the CRI-O test with QEMU.  Once it
becomes stable we can expand this to other Hypervisors as well.

Fixes: #8005

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 00:59:09 +02:00
Fabiano Fidêncio
d7105cf7a4 ci: k8s: Add a method to install CRI-O
This is based on the official CRI-O documentation[0] and right now we're
making this specific to Ubuntu as that's what we have as runners.

We may want to expand this in the future, but we're good for now.

[0]:
https://github.com/cri-o/cri-o/blob/main/install.md#apt-based-operating-systems

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 00:59:09 +02:00
Fabiano Fidêncio
54c0a471b1 ci: k8s: k0s: Allow passing parameters to the k0s installer
We'll need this in order to setup k0s with a different container engine.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 00:59:09 +02:00
Fabiano Fidêncio
31ef64606c Merge pull request #8007 from fidencio/topic/ci-kata-deploy-fix-garm-runner-name
ci: kata-deploy: Fix runner name
2023-09-20 00:58:33 +02:00
Beraldo Leal
730ef51693 deps: updating dependencies
Updating dependencies after make check, make test.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-19 16:54:35 -04:00
GabyCT
6111ef6fb6 Merge pull request #7990 from GabyCT/topic/parallelbandwidth
metrics: Enable parallel bandwidth iperf limit
2023-09-19 14:52:21 -06:00
Fabiano Fidêncio
3a2c83d69b ci: kata-deploy: Fix runner name
It should be garm-ubuntu-2004-smaller instead of garm-ubuntu-2004-small.

Fixes: #7890

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 22:34:37 +02:00
Dan Mihai
82ff2db460 runtime: support kernel params including spaces
Support quoted kernel command line parameters that include space
characters. Example:

dm-mod.create="dm-verity,,,ro,0 736328 verity 1
/dev/vda1 /dev/vda2 4096 4096 92041 0 sha256
f211b9f1921ef726d57a72bf82be23a510076639fa8549ade10f85e214e0ddb4
065c13dfb5b4e0af034685aa5442bddda47b17c182ee44ba55a373835d18a038"
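
A rough bash illustration of the idea, splitting on spaces only outside
double quotes. The helper name `split_kernel_params` is hypothetical, and
this is not the runtime's own parser:

```shell
# Hypothetical helper: split a kernel command line on spaces,
# except inside double quotes. Prints one parameter per line.
split_kernel_params() {
    local input="$1" current="" in_quotes=0 ch i
    for (( i = 0; i < ${#input}; i++ )); do
        ch="${input:i:1}"
        if [ "${ch}" = '"' ]; then
            # Toggle the quoted state and keep the quote in the parameter.
            in_quotes=$(( 1 - in_quotes ))
            current+="${ch}"
        elif [ "${ch}" = " " ] && [ "${in_quotes}" -eq 0 ]; then
            # Unquoted space: finish the current parameter, if any.
            [ -n "${current}" ] && printf '%s\n' "${current}"
            current=""
        else
            current+="${ch}"
        fi
    done
    [ -n "${current}" ] && printf '%s\n' "${current}"
    return 0
}

# Usage:
#   split_kernel_params 'quiet dm-mod.create="a b c" console=ttyS0'
```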

Fixes: #8003

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-09-19 20:26:38 +00:00
Beraldo Leal
604a9dd673 protocol: remove gogoprotobuff tests
This is part of a bigger effort to drop gogoprotobuff from our code
base. IIUC, those options are basically used by *pb_test.go, and since
we are dropping gogoprotobuff and those are auto generated tests, let's
just remove it.

Fixes #7978.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-19 12:55:42 -04:00
Fabiano Fidêncio
5560e72024 Merge pull request #7896 from fidencio/topic/ground-work-for-testing-all-k8s-flavours-we-support
ci: kata-deploy: Enable all k8s flavours that we support
2023-09-19 17:44:34 +02:00
Fabiano Fidêncio
f7fa7f602a ci: Enable kata-deploy tests for all the supported k8s flavours
Let's ensure we test kata-deploy on RKE2 and k0s as well.

Fixes: #7890

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
2c908b598c ci: kata-deploy: Add the ability to deploy rke2
This will be very useful in the near future, when we start testing
kata-deploy with rke2 as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
eaf6164916 ci: kata-deploy: Add the ability to deploy k0s
This will be very useful in the near future, when we start testing
kata-deploy with k0s as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
0015257636 ci: kata-deploy: Add deploy-k8s argument to gha-run.sh
We'll be using exactly the same code used for the k8s tests, which are
already deploying k3s on GARM.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
bf2cb02283 ci: kata-deploy: Expand tests to run on k0s / rke2
We just need to make sure the correct overlay is applied, following what
we already have been doing for k3s.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
6d5d844e5c Merge pull request #7983 from sprt/resource-group-naming
ci: Create clusters in individual resource groups
2023-09-19 12:54:21 +02:00
Fabiano Fidêncio
b12b9e1886 ci: kata-deploy: Add placeholder for tests on GARM
We'll be testing kata-deploy with different kubernetes flavours as part
of our GARM tests, and this is a place-holder for this.

Once enabled, we'll do nothing, just `return 0`, so we can then properly
add the tests after this commit gets merged.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 12:42:02 +02:00
Fabiano Fidêncio
9e1fb8a966 ci: kata-deploy: Export KUBERNETES env var
So we have better control over which flavour of kubernetes kata-deploy
is expected to be targeting.

This was also done as part of fa62a4c01b,
for the k8s tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 12:37:56 +02:00
Fabiano Fidêncio
09cc0ed438 ci: Move deploy_k8s() to gha-run-k8s-common.sh
This will allow us to re-use the function in the kata-deploy tests,
which will come soon.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 12:37:56 +02:00
Fabiano Fidêncio
1829f5c049 Merge pull request #7992 from skaegi/virtiofsd-1.8.0
versions: Bump virtiofsd to v1.8.0
2023-09-19 11:52:49 +02:00
Fabiano Fidêncio
486fe14c99 ci: Properly set K8S_TEST_UNION
Otherwise only the first test will be executed
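
One plausible way to hit this in bash, shown as a sketch: assigning an
array with a plain `=` keeps only its first element. The test names below
are made up, and treating this as the cause is an assumption:

```shell
# Hypothetical test list; the real file names live in the k8s test scripts.
K8S_TEST_SMALL_HOST_UNION=("k8s-one.bats" "k8s-two.bats" "k8s-three.bats")

# Pitfall: a plain assignment copies only element 0 of the array.
first_only=${K8S_TEST_SMALL_HOST_UNION}

# Copying every element needs the quoted [@] expansion inside parentheses.
K8S_TEST_UNION=("${K8S_TEST_SMALL_HOST_UNION[@]}")

echo "plain assignment: ${first_only}"
echo "array copy: ${#K8S_TEST_UNION[@]} tests"
```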

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 10:23:58 +02:00
Aurélien Bombo
d9ef1352af ci: Add first letter of the K8S_TEST_HOST_TYPE to resource group name
Ideally we'd add the instance_type or the full K8S_TEST_HOST_TYPE but
that exceeds the maximum number of characters allowed for the cluster
name.  With this in mind, let's use the first letter of
K8S_TEST_HOST_TYPE instead.
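
In bash, taking the first letter is a substring expansion. A small sketch,
assuming a hypothetical helper name and "small" / "normal" as example
values:

```shell
# Hypothetical helper: reduce the host type to a one-character suffix.
host_type_initial() {
    local host_type="${1:-small}"
    # ${var:0:1} keeps only the first character.
    printf '%s\n' "${host_type:0:1}"
}

# Usage:
#   host_type_initial "${K8S_TEST_HOST_TYPE}"
```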

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-09-19 10:23:58 +02:00
Aurélien Bombo
68267a3996 ci: Create clusters in individual resource groups
This makes it so that each AKS cluster is created in its own individual
resource group, rather than using the "kataCI" resource group for all
test clusters.

This is to accommodate a tool that we recently introduced in our Azure
subscription which automatically deletes resource groups after a set
amount of time, in order to keep spending under control.

The tool will automatically delete any resource group, unless it has a
tag SkipAutoDeleteTill = YYYY-MM-DD. When this tag is present, the
resource group will be retained until the specified date.

Note that I tagged all current resource groups in our subscription with
SkipAutoDeleteTill = 2043-01-01 so that we don't lose any existing
resources.
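
The retention rule can be sketched in bash as below. The helper name is
hypothetical, the tool's real logic is not shown here, and treating the
tag date itself as still retained is an assumption:

```shell
# Hypothetical helper: returns 0 (retain) while today's date is not past
# the SkipAutoDeleteTill value, and 1 (eligible for deletion) otherwise.
is_retained() {
    local skip_until="$1" today="${2:-$(date +%F)}"
    # An untagged resource group is never retained.
    [ -n "${skip_until}" ] || return 1
    # YYYY-MM-DD strings sort chronologically, so string comparison is enough.
    [[ ! "${today}" > "${skip_until}" ]]
}

# Usage:
#   is_retained "2043-01-01" && echo "kept"
```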

Fixes: #7982

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-09-19 10:23:55 +02:00
Fabiano Fidêncio
84c0d59d23 Merge pull request #7985 from fidencio/topic/clh-use-static_sandbox_resource_mgmt-as-default-on-arm
clh: arm: Use static_sandbox_resource_mgmt=true
2023-09-19 09:25:34 +02:00
Gabriela Cervantes
9aa8d1c917 metrics: Add parallel bandwidth limit for qemu
This PR adds the parallel bandwidth limit for qemu for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-18 21:08:54 +00:00
Simon Kaegi
44c7c082d9 versions: Bump virtiofsd to v1.8.0
https://gitlab.com/virtio-fs/virtiofsd/-/releases/v1.8.0 was released two weeks ago. We have fully tested and are using this version.

Also bumps toolchain version to match what virtiofsd used.

Fixes: #7960

Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>
2023-09-18 15:21:15 -04:00
Fabiano Fidêncio
5f8e210d3b Merge pull request #7961 from ChengyuZhu6/update_nydus
Bump nydus versions and update nydus tests
2023-09-18 21:02:20 +02:00
Fabiano Fidêncio
c3ee913bf6 Merge pull request #7953 from gkurz/extra-monitor-socket
runtime/qemu: Rework QMP/HMP support
2023-09-18 19:04:14 +02:00
Gabriela Cervantes
af59d4bf4a metrics: Enable parallel bandwidth iperf limit
This PR enables the parallel bandwidth iperf limit for kata metrics.

Fixes #7989

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-18 16:32:11 +00:00
Fabiano Fidêncio
aba36ab188 nydus: Temporarily skip tests on dragonball
We're hitting a specific issue after updating, which will require some
work on dragonball before it can be re-added here.

The issue:
```
...
3: failed to do rafs mount\\n
4: fail to attach rafs \\\"/var/lib/containerd-nydus/snapshots/2/fs/image/image.boot\\\"\\n
5: add share fs mount\\n
6: Mount rafs at
   /rafs/197ef3db03c86b91bf3045ff59183ce8b5750941ad1d3484f4a8301a70f5109f/rootfs_lower
   error: Failed to Mount backend
...

Caused by:
vmm action error: FsDevice(AttachBackendFailed(\\\"attach/detach a
backend filesystem failed:: missing field `version` at line 1 column
489\\\"))\"): unknown"
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b8a8dfcd15 nydus: Use kata-${KATA_HYPERVISOR} instead of kata
This will ensure we're testing with the correct runtime, instead of
using the `default` one.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
ChengyuZhu6
f6df3d6efb static-build: Fix arch error on nydus build
Fix the arch error when downloading the nydus tarball.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Steven Horsman <steven@uk.ibm.com>
2023-09-18 17:40:06 +02:00
ChengyuZhu6
2f9c9e2e63 tests: nydus: Update nydus tests
To support the v0.12.0 nydus-snapshotter, we need to update the config
files and the commandline to start nydus-snapshotter.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
c9a4e7e46d versions: Bump nydus and nydus-snapshotter to its latest release
As we need https://github.com/containerd/nydus-snapshotter/pull/530 in.

Fixes #7984

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b73bde320d gha: nydus: Populate run()
And with this we finally enable the nydus tests to run as part of our
GHA CI.

Fixes: #6543

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b3904a1a30 gha: nydus: Populate install_dependencies()
Let's have all the dependencies needed for running the nydus tests
installed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
d2b3b67f5d gha: nydus: Actually install kata when install-kata is called
We've been simply doing nothing whenever `install-kata` was called, and
that was the intent when we added the placeholder calls.

Now, let's install kata, as expected. :-)

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
0ec00ad42e gha: nydus: Get rid of nydus{,-snapshotter} install from nydus_test.sh
As we've added install_nydus() and install_nydus_snapshotter(), which do
conform with the pattern we're following on GHA, let's rely on them
rather than relying on the bits coming from nydus_test.sh.

Later on we'll have install_nydus() and install_nydus_snapshotter() as
part of the dependencies install in our `gha-run.sh`.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
568439c77b tests: nydus: Add timeout to the crictl calls
Similarly to what's been done for the cri-containerd tests, as part of
84dd02e0f9, we need to add the timeout
here for the crictl calls.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
5ac3b76eb1 tests: nydus: Add uid / namespace to the nydus container / sandbox
Otherwise we may face errors like:
```
getting sandbox status of pod "d3af2db414ce8": metadata.Name,
metadata.Namespace or metadata.Uid is not in metadata
"&PodSandboxMetadata{Name:nydus-sandbox,Uid:,Namespace:default,Attempt:1,}"

getting sandbox status of pod "-A": rpc error: code = NotFound desc = an
error occurred when try to find sandbox: not found
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
376574a16c tests: nydus: Decorate some calls with sudo
Otherwise we cannot properly start the nydus snapshotter, nor properly
kill it after it's been started.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
4290fd4b67 tests: nydus: Adapt "source ..." to GHA
The "source ..." we've been doing was not changed since those tests were
part of the Jenkins tests, and we need to adapt them, either setting the
correct path or entirely removing the ones that are not relevant to us
anymore.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
a84efa3e87 tests: nydus: Adapt check to "clh" instead "cloud-hypervisor"
As that's what we've been using as part of the GHA.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
56a14b3950 tests: common: Add install_nydus_snapshotter()
This function will be used to download and install the
nydus-snapshotter, and it follows the same pattern we already have
introduced for downloading and installing other dependencies from
GitHub.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b6563783e2 tests: common: Add install_nydus()
This function will be used to download and install nydus, and it follows
the same pattern we already have introduced for downloading and
installing other dependencies from GitHub.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
72599f1911 clh: arm: Use static_sandbox_resource_mgmt=true
Users have noticed that this is needed, as CLH does not yet implement a
way to hotplug resources on aarch64.

With this patch, when building for x86_64, I can see that this is the
resulting config:
```
$ ARCH=amd64 make
...

$ cat config/configuration-clh.toml | grep static_sandbox_resource_mgmt
static_sandbox_resource_mgmt=false

```

And when building for aarch64:
```
$ ARCH=arm64 make
...

$ cat config/configuration-clh.toml | grep static_sandbox_resource_mgmt
static_sandbox_resource_mgmt=true
```
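
The per-architecture default boils down to a simple selection. A bash
sketch with a hypothetical helper name (the actual change lives in the
build system, not in a script like this):

```shell
# Hypothetical helper mirroring the per-architecture default shown above.
default_static_sandbox_resource_mgmt() {
    local arch="${1:-$(uname -m)}"
    case "${arch}" in
        # CLH cannot hotplug resources on Arm yet, so size the VM statically.
        arm64|aarch64) echo "true" ;;
        *) echo "false" ;;
    esac
}

# Usage:
#   default_static_sandbox_resource_mgmt arm64
```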

Fixes: #7941

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 14:14:10 +02:00
Jeremi Piotrowski
dfa6af54df Merge pull request #7806 from jongwu/clh_serial
clh:arm64: use arm AMBA UART for hypervisor debug
2023-09-18 12:29:07 +02:00
Greg Kurz
1f16b6627b runtime/qemu: Rework QMP/HMP support
PR #6146 added the possibility to control QEMU with an extra HMP socket
as an aid for debugging. This is great for development or bug chasing
but this raises some concerns in production.

The HMP monitor allows tampering with the VM state in a variety of ways.
This could be intentionally or mistakenly used to inject subtle bugs in
the VM that would be extremely hard if not even impossible to debug. We
definitely don't want that to be enabled by default.

The feature is currently wired to the `enable_debug` setting in the
`[hypervisor.qemu]` section of the configuration file. This setting has
historically been used to control "debug output" and it is used as such
by some downstream users (e.g. Openshift). Forcing people to have the
extra HMP backdoor at the same time is abusive and dangerous.

A new `extra_monitor_socket` is added to `[hypervisor.qemu]` to give
fine control on whether the HMP socket is wanted or not. This setting
is still gated by `enable_debug = true` to make it clear it is for
debug only. The default is to not have the HMP socket though. This
isn't backward compatible with #6416 but it is for the sake of "better
safe than sorry".

An extra monitor socket makes the QEMU instance untrusted. A warning is
thus logged to the journal when one is requested.

While here, also allow the user to choose between HMP and QMP for the
extra monitor socket. Motivation is that QMP offers way more options to
control or introspect the VM than HMP does. Users can also ask for
pretty json formatting well suited for human reading. This will improve
the debugging experience.

This feature is only made visible in the base and GPU configurations
of QEMU for now.
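
The gating described above can be sketched in bash. The helper name is
hypothetical, and the accepted values ("hmp", "qmp", "qmp-pretty") are an
assumption used for illustration, not a statement of the runtime's exact
parsing:

```shell
# Hypothetical helper: decide which extra monitor socket, if any, is created.
effective_monitor_socket() {
    local enable_debug="$1" extra_monitor_socket="$2"
    # The setting is ignored unless debug is enabled.
    if [ "${enable_debug}" != "true" ]; then
        echo "none"
        return 0
    fi
    case "${extra_monitor_socket}" in
        hmp|qmp|qmp-pretty) echo "${extra_monitor_socket}" ;;
        # Unset or unknown value: default to no extra socket.
        *) echo "none" ;;
    esac
}

# Usage:
#   effective_monitor_socket true hmp
```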

Fixes #7952

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-09-18 12:13:01 +02:00
Greg Kurz
cab46c9e23 Merge pull request #7973 from fidencio/topic/ci-use-bigger-machine-sizes-for-the-needed-tests-part-0
ci: Use variable size of VMs depending on the tests running
2023-09-18 12:06:44 +02:00
Fabiano Fidêncio
0e3bfac3b3 Merge pull request #7976 from fidencio/topic/ci-static-checks-rework-part-0
ci: Rework static checks
2023-09-18 11:01:18 +02:00
Peng Tao
6eedd9b0b9 Merge pull request #7738 from Xuanqing-Shi/7732/handle-non-empty-endpoints-in-RemoveEndpoints
runtime: incorrect handling of non-empty []Endpoint parameter in Remo…
2023-09-18 10:58:28 +08:00
Fabiano Fidêncio
8b1e9b0c75 ci: static-checks: Clean up static-checks job
Now that the static-checks job only takes care of running the
static-checks, let's clean it up, remove all the unneeded steps, make
sure that we're using the actions in their latest version, and have it
running in a cost free runner.

At some point I'd like to see those tests done in parallel, in the same
way that I've organised the build-checks, but that's something for
someone else, at some other time.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 14:23:02 +02:00
Fabiano Fidêncio
2c5ca2eaf8 ci: static-checks: Run tests depending on KVM
With this we're removing the dragonball static-checks CI, as the test is
running here now. :-)

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 14:22:38 +02:00
Fabiano Fidêncio
509c309ab2 ci: static-checks: Move "sudo make test" to the new test matrix
We're moving it out of the previous "static-checks" confusing matrix,
and adding it to the matrix that was currently being used for the `make
vendor` and `make check` checks.

This will allow us to have one job per component, and with that we can
easily run those in parallel and on the zero cost runners.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:53:23 +02:00
Fabiano Fidêncio
4e963cedf4 ci: static-checks: Move "make test" to the new test matrix
We're moving it out of the previous "static-checks" confusing matrix,
and adding it to the matrix that was currently being used for the `make
vendor` and `make check` checks.

This will allow us to have one job per component, and with that we can
easily run those in parallel and on the zero cost runners.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:53:17 +02:00
Fabiano Fidêncio
08f2e5ae0b runtime-rs: Ensure static-checks-build is a dep of make test
Otherwise `make test` will simply fail with:
```
error[E0583]: file not found for module `config`
```

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:53:13 +02:00
Fabiano Fidêncio
2bc3a616ae kata-ctl: Use loop instead of kvm module in tests
This makes it possible to run the tests in the cost free runners, which
are not KVM capable.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:53:08 +02:00
Fabiano Fidêncio
46daddc500 kata-ctl: Ensure GENERATED_CODE is a dep of make test
Otherwise `make test` will simply fail with:
```
error[E0583]: file not found for module `version`
```

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:53:01 +02:00
Fabiano Fidêncio
ec826f328f agent: Ensure GENERATED_CODE is a dep of make test
Otherwise `make test` will fail with:
```
error[E0583]: file not found for module `version`
```

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:57 +02:00
Fabiano Fidêncio
1d32410a83 ci: install_libseccomp: Do not depend on the tests repo
It makes things way simpler, waaaaay simpler.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:49 +02:00
Fabiano Fidêncio
bf888b9a5e ci: static-checks: Move "make check" to the new test matrix
We're moving it out of the previous "static-checks" confusing matrix,
and adding it to the matrix that was currently being used for the `make
vendor` checks.

This will allow us to have one job per component, and with that we can
easily run those in parallel and on the zero cost runners.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:45 +02:00
Fabiano Fidêncio
473ec87806 kata-ctl: Add kata-types to the Cargo.lock file
Commit message covered everything. :-)

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:40 +02:00
Fabiano Fidêncio
ea19549a99 kata-ctl: Ensure GENERATED_CODE is a dep of make check
Otherwise `make check` would fail with:
```
Error writing files: failed to resolve mod `version`:
/home/runner/work/kata-containers/kata-containers/src/tools/kata-ctl/src/ops/version.rs
does not exist make: *** [../../../utils.mk:176: standard_rust_check] Error 1
```

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:36 +02:00
Fabiano Fidêncio
e125775863 tests: install_rust: Also install clippy
clippy is used as part of our tests, so it's useful to have it installed
while we're already installing rust.

In case of developers, they also better be using it. :-)

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:31 +02:00
Fabiano Fidêncio
e2c61a152c ci: static-checks: Move vendor check to its own job
Similarly to the static-check jobs, those jobs can be run on the zero
cost runners.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:30 +02:00
Fabiano Fidêncio
6794d4c843 tests: Move install_rust.sh from the tests repo
We'll use it as part of the refactoring we're doing in the static check
tests.

I can see a lot of other uses of this, but changing all of them to this
one is out of the scope for this PR.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:29 +02:00
Fabiano Fidêncio
e64508c308 tests: install_go: Remove tests repo dependency
We can rely on the functions that are now part of the common.bash.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:28 +02:00
Fabiano Fidêncio
11dff731b7 tests: Move functions from kata_arch script here
We can use this a lot as part of our CI, but right now I'm just moving
those here with the intent to use later on in this series.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:28 +02:00
Fabiano Fidêncio
75c974c802 ci: static-checks: Move kernel config check to its own job
It doesn't make sense to run this for all the bits of the matrix,
nor is it demanding enough to require running this in one of our
Azure sponsored runners.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:25 +02:00
Archana Shinde
9c233bb9e0 test: Add test to verify try_from for clh Netconfig
Add tests to verify conversion from runtime NetworkConfig
to clh specific config.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-09-16 00:24:14 -07:00
Fabiano Fidêncio
c69a1e33bd ci: Use variable size of VMs depending on the tests running
Let me start with a fair warning that this commit is hard to split into
different parts that could be easily tested (or not tested, just
ignored) without breaking pieces.

Now, about the commit itself, as we're on the run to reduce costs
related to our sponsorship on Azure, we can split the k8s tests we run
in 2 simple groups:
* Tests that can be run in the smaller Azure instance (D2s_v5)
* Tests that require the normal Azure instance (D4s_v5)

With this in mind, we're now passing to the tests which type of host
we're using, which allows us to select to run either one of the two
types of tests, or even both in case of running the tests on a baremetal
system.
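
A bash sketch of that selection. The array names, test file names and the
"small" / "normal" values are placeholders chosen for illustration:

```shell
# Hypothetical test lists; the real ones are defined in the k8s test scripts.
K8S_TEST_SMALL_HOST_UNION=("k8s-small-a.bats" "k8s-small-b.bats")
K8S_TEST_NORMAL_HOST_UNION=("k8s-normal-a.bats")

# Print the tests to run for a given host type, one per line.
select_tests() {
    local host_type="$1"
    case "${host_type}" in
        small)  printf '%s\n' "${K8S_TEST_SMALL_HOST_UNION[@]}" ;;
        normal) printf '%s\n' "${K8S_TEST_NORMAL_HOST_UNION[@]}" ;;
        # Bare metal (or anything else): run both groups.
        *) printf '%s\n' "${K8S_TEST_SMALL_HOST_UNION[@]}" \
                         "${K8S_TEST_NORMAL_HOST_UNION[@]}" ;;
    esac
}

# Usage:
#   select_tests small
```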

Fixes: #7972

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 09:13:54 +02:00
Archana Shinde
9049d311df runtime-rs: Add network support for cloud-hypervisor
This PR adds support for adding a network device before starting the
cloud-hypervisor VM.

Support for adding and removing network devices is not really added to
the resource manager, so supporting this for cloud-hypervisor is not
scoped in this PR.

This also changes "pending_devices" for clh implementation from an
Option of vector to simply a vector. This simplifies the structure a bit
as we can simply iterate over the pending devices instead of having to
check for a "Some" value as this is not really required.

Fixes: #6333

Signed-off-by: Shuaiyi Zhang <zhang_syi@qq.com>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-09-15 23:25:20 -07:00
Greg Kurz
79c494eb4e Merge pull request #7969 from fidencio/topic/ci-cache-using-oras-part-3
ci: cache: Check the sha256sum of the components & fix ovmf-sev cache usage
2023-09-15 16:30:22 +02:00
Fabiano Fidêncio
eecd5bf2aa ci: cache: Fix ovmf-sev cache
The cached tarball is relying on the component name, thus it's important
to set it correctly, otherwise we'll end up always building it.

With this patch applied:
```
≡ ⨯ make ovmf-sev-tarball
make ovmf-sev-tarball-build
make[1]: Entering directory '/home/ffidenci/src/upstream/kata-containers/kata-containers'
/home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build//kata-deploy-binaries-in-docker.sh  --build=ovmf-sev
sha256:67cc94e393dc1d5bfc2b77a77e83c9b1c0833d0fbbebaa9e9e36f938bb841fcc
Build kata version 3.2.0-rc0: ovmf-sev
INFO: DESTDIR /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/ovmf-sev/destdir
Downloading a76f5522493f ovmf-sev-builder-image-version
Downloading 7e98c854bd94 kata-static-ovmf-sev.tar.xz
Downloading 559311973ff8 ovmf-sev-version
Downloaded  a76f5522493f ovmf-sev-builder-image-version
Downloading 353b655c2297 ovmf-sev-sha256sum
Downloaded  559311973ff8 ovmf-sev-version
Downloaded  353b655c2297 ovmf-sev-sha256sum
Downloaded  7e98c854bd94 kata-static-ovmf-sev.tar.xz
Pulled [registry] ghcr.io/kata-containers/cached-artefacts/ovmf-sev:latest-main-x86_64
Digest: sha256:933236c2c79e53be3ca7acc0b966d0ddac9c0335edcb1e8cad8b9bb3aaf508ce
kata-static-ovmf-sev.tar.xz: OK
INFO: Using cached tarball of ovmf-sev
drwxr-xr-x runner/runner     0 2023-09-15 10:34 ./
drwxr-xr-x runner/runner     0 2023-09-15 10:34 ./opt/
drwxr-xr-x runner/runner     0 2023-09-15 10:34 ./opt/kata/
drwxr-xr-x runner/runner     0 2023-09-15 10:34 ./opt/kata/share/
drwxr-xr-x runner/runner     0 2023-09-15 10:34 ./opt/kata/share/ovmf/
-rwxr-xr-x runner/runner 4194304 2023-09-15 10:34 ./opt/kata/share/ovmf/AMDSEV.fd
~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build ~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/ovmf-sev/builddir
~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/ovmf-sev/builddir
make[1]: Leaving directory '/home/ffidenci/src/upstream/kata-containers/kata-containers'
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 12:39:22 +02:00
Fabiano Fidêncio
86c41074b4 ci: cache: Check the sha256sum of the component
We removed this in part 2 of this effort, as we were not yet caching
the sha256sum of the component.  Now that the caching part has been
merged, let's get back to checking it.
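
A minimal sketch of such a check, in Python rather than the shell the
build scripts use.  The helper names are hypothetical, and the only
assumption about the cached `${component}-sha256sum` file is that it
follows the usual `sha256sum` output layout (hex digest first):

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_tarball_is_valid(tarball_path, sha256sum_path):
    """Compare the tarball digest with the first field of a sha256sum-style file.

    Hypothetical helper: the real check lives in the packaging scripts.
    """
    with open(sha256sum_path, encoding="utf-8") as fh:
        fields = fh.read().split()
    if not fields:
        return False
    return sha256_of(tarball_path) == fields[0].lower()
```

When the digests differ, the cached tarball would be discarded and the
component built from scratch.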

Fixes: #7834 -- part 3

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 12:34:30 +02:00
Fabiano Fidêncio
f5e52d02d3 Merge pull request #7964 from fidencio/topic/ci-cache-using-oras-part-2
ci: cache: Use the artefacts stored in ghcr.io/kata-containers/cached-artefacts/${component}
2023-09-15 12:29:28 +02:00
Fabiano Fidêncio
2fe0b494da Merge pull request #7959 from fidencio/topic/ci-run-on-smaller-garm-instances
ci: Run some of the GARM tests in smaller instances
2023-09-15 11:30:13 +02:00
Fabiano Fidêncio
460988c5f7 ci: cache: Remove the script used to cache artefacts on Jenkins
That's not needed anymore, as we've switched to using ORAS and an OCI
registry to cache the artefacts.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 10:27:55 +02:00
Fabiano Fidêncio
4533a7a416 ci: cache: Also store the ${component} sha256sum
This is something that was done by our Jenkins jobs, but that I ended up
missing when writing d0c257b3a7.

Now, let's also add the sha256sum to the cached artefact, and in a
coming up PR (after this one is merged) we will also start checking for
that.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 10:25:26 +02:00
Fabiano Fidêncio
eccc76df63 ci: cache: Use the cached artefacts from ORAS
In the previous series related to the artefacts we build, we switched
from storing the artefacts on Jenkins to storing them in
ghcr.io/kata-containers/cached-artefacts/${artefact_name}.

Now, let's take advantage of that and actually use the artefacts coming
from that "package" (as GitHub calls it).

NOTE: One thing I've noticed we're missing is storing and checking the
sha256sum of the artefact.  The storing part will be done in a
different commit, and checking the sha256sum will be done in a
different PR, as we need to ensure the checksums were pushed to the
registry before we actually start checking them.

Fixes: #7834 -- part 2

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 10:13:47 +02:00
Jeremi Piotrowski
6f30d00ae7 Merge pull request #7956 from fidencio/topic/ci-reduce-the-machine-size-used
ci: Reduce the size of the AKS VMs
2023-09-15 08:49:08 +02:00
Steve Horsman
1b8f3fa9ae Merge pull request #7957 from fidencio/topic/ci-cache-using-oras-part-1
ci: cache: Allow pushing our artefacts to an OCI registry
2023-09-15 07:45:24 +01:00
Jianyong Wu
7f5e77bcb8 kernel: enable Arm pl011 support
Enable pl011 (ttyAMA0) support in kernel for aarch64.

Fixes: #5080
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-09-15 01:45:16 +00:00
Jianyong Wu
241c355e07 clh:arm64: use arm AMBA uart for hypervisor debug
Cloud Hypervisor on arm64 only supports the Arm AMBA UART (pl011) as
tty. So, the console should be set to "ttyAMA0" instead of "ttyS0"
when hypervisor debug mode is enabled.

Fixes: #5080
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-09-15 01:44:23 +00:00
Fabiano Fidêncio
094b6b2cf8 ci: k8s: Temporarily disable tests that require a bigger VM instance
The list of tests which require a bigger VM instance is:
* k8s-number-cpus.bats -- failing on all CIs
* k8s-parallel.bats -- only failing on the cbl-mariner CI
* k8s-scale-nginx.bats -- only failing on the cbl-mariner CI

We'll keep those disabled while we re-work the logic to **only run
those** in a bigger (and more expensive) VM instance.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 01:33:19 +02:00
GabyCT
6fe5cd3bd5 Merge pull request #7937 from GabyCT/topic/iperfbandwidth
metrics: Add iperf value for cpu utilization
2023-09-14 16:47:19 -06:00
Fabiano Fidêncio
d0c257b3a7 ci: cache: Push cached artefacts to ghcr.io
Let's push the artefacts to ghcr.io and stop relying on jenkins for
that.

Fixes: #7834 -- part 1

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:39:57 +02:00
Fabiano Fidêncio
108f1b60dd kata-deploy: Generate latest_{artefact,image_builder} files
Right now this is not used, but it'll be used when we start caching the
artefacts using ORAS.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:39:57 +02:00
Fabiano Fidêncio
be2eb7b378 ci: cache: Install ORAS in the kata-deploy binaries builder container
ORAS is the tool which will help us to deal with our artefacts being
pushed to and pulled from a container registry.

As both the push to and the pull from will be done inside the
kata-deploy binaries builder container, we need it installed there.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:39:57 +02:00
Fabiano Fidêncio
fb24fb0dc1 ci: k8s: devmapper: Use a smaller / cheaper VM instance
We don't need to run on a D4s_v5, as those tests are not CPU / memory
intensive.  With this in mind, let's use a smaller version of the
instance, the D2s_v5 one.

Fixes: #7958

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:27:05 +02:00
Fabiano Fidêncio
1daf02f5d4 ci: nydus: Use a smaller / cheaper VM instance
We don't need to run on a D4s_v5, as those tests are not CPU / memory
intensive.  With this in mind, let's use a smaller version of the
instance, the D2s_v5 one.

Fixes: #7958

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:25:41 +02:00
Fabiano Fidêncio
e60d81f554 ci: nerdctl: Use a smaller / cheaper VM instance
We don't need to run on a D4s_v5, as those tests are not CPU / memory
intensive.  With this in mind, let's use a smaller version of the
instance, the D2s_v5 one.

Fixes: #7958

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:25:41 +02:00
Fabiano Fidêncio
4db416997c ci: docker: Use a smaller / cheaper VM instance
We don't need to run on a D4s_v5, as those tests are not CPU / memory
intensive.  With this in mind, let's use a smaller version of the
instance, the D2s_v5 one.

Fixes: #7958

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:25:41 +02:00
Fabiano Fidêncio
32841827b8 ci: cri-containerd: Use a smaller / cheaper VM instance
We don't need to run on a D4s_v5, as those tests are not CPU / memory
intensive.  With this in mind, let's use a smaller version of the
instance, the D2s_v5 one.

Fixes: #7958

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:25:35 +02:00
Fabiano Fidêncio
92fff129fd ci: k8s: Don't set cpu limit request for k8s-inotify test
Without setting the cpu limit / request to 1, we can make this test run
in a smaller VM instance without any issue.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-14 22:03:16 +02:00
Fabiano Fidêncio
faf98c0623 ci: Reduce the size of the AKS VMs
We do **not** need a very powerful machine for our tests, as we're not
building anything there.

The instance we switched to (Standard_D2s_v5) still has nested virt
available, as shown here[0], but has half the vCPUs / memory, which
should be fine for just running the tests, and costs us roughly half
the price[1].

[0]:
https://learn.microsoft.com/en-us/azure/virtual-machines/dv5-dsv5-series
[1]:
https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/#pricing

Fixes: #7955

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-14 22:03:16 +02:00
Fabiano Fidêncio
adc18ecdb1 ci: cache: For consistency, read all used env vars
Instead of having some of them only being considered if explicitly
passed to the script.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-14 20:24:48 +02:00
Fabiano Fidêncio
c7a851efd7 ci: cache: Pass the exposed env vars to the kata-deploy binaries in docker
As the environment variables are now being passed down from the GitHub
Actions, let's make sure they're exposed to the container used to build
the kata-deploy binaries, and during the build process we'll be able to
use those to log in and push the artefacts to the OCI registry, using
ORAS.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-14 20:24:48 +02:00
Fabiano Fidêncio
2e8b41f39c Merge pull request #7954 from fidencio/topic/ci-cache-using-oras-part-0
ci: cache: Export env vars needed to use ORAS
2023-09-14 20:23:55 +02:00
Fabiano Fidêncio
6bd15a85d5 ci: cache: Export env vars needed to use ORAS
We do the build of our artefacts inside a container image, and we need
to expose some env vars to the container so ORAS can be used there to
push the artefacts we want to cache to ghcr.io.

The env vars we're exposing are:
* ARTEFACT_REGISTRY: The registry where we're going to save the
  artefacts.
* ARTEFACT_REGISTRY_USERNAME: The username to log in to the registry, as
  ORAS does not use the same json file used by docker.
* ARTEFACT_REGISTRY_PASSWORD: The password to log in to the registry,
  as ORAS does not use the same json file used by docker.
* TARGET_BRANCH: The target branch, which will be part of the tag of the
  artefact, as we may end up caching the artefacts for both main and
  stable branches.
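
As a rough illustration of how these variables could end up in the
name of a cached artefact, here is a sketch in Python.  The helper is
hypothetical; the reference layout is inferred from the one pull line
visible elsewhere in this log
(ghcr.io/kata-containers/cached-artefacts/ovmf-sev:latest-main-x86_64),
and it assumes ARTEFACT_REGISTRY holds only the registry host:

```python
import os


def cached_artefact_ref(component, arch, env=os.environ):
    """Build the OCI reference of a cached artefact (illustrative only).

    Assumptions: ARTEFACT_REGISTRY is just the host (e.g. "ghcr.io") and
    the tag is "latest-<TARGET_BRANCH>-<arch>".
    """
    registry = env["ARTEFACT_REGISTRY"]
    branch = env["TARGET_BRANCH"]
    return f"{registry}/kata-containers/cached-artefacts/{component}:latest-{branch}-{arch}"
```

Keeping the branch in the tag is what lets main and the stable branches
each have their own cached copy of the same component.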

Fixes: #7834 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-14 19:36:33 +02:00
Gabriela Cervantes
cd4fd1292a metrics: Add iperf cpu utilization limit for qemu
This PR adds the iperf cpu utilization limit for qemu for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-14 17:17:47 +00:00
Gabriela Cervantes
df5cd10ea0 metrics: Add iperf value for cpu utilization
This PR adds the iperf value for cpu utilization for kata metrics.

Fixes #7936

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-14 16:06:49 +00:00
Jeremi Piotrowski
b54dd8cdf4 Merge pull request #7704 from jepio/vfio-part-1
gha: vfio: Import test script
2023-09-14 16:45:31 +02:00
Jeremi Piotrowski
a96050a7ad tests: Apply timeout to 'ctr t kill'
This task has been observed to hang at times.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
9d93036783 tests/vfio: Bump VM image to Fedora 38
We need a very recent L2 guest kernel to fix all the bugs that occur in nested
virtualization.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
faee59b520 tests/vfio: Accept single device in vfio group for CLH
cloud hypervisor does not emulate pcie switches or pci bridges, so we need to
accept a lonely device.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
df3dc1105c tests/vfio: Get rid of sync's
It is fine to start a VM with the disk image without syncing it as we now run
the test in an ephemeral Azure instance.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
7211c3dccc gha: vfio: Set test timeout to 15m
Sometimes the test gets stuck running commands in the container - need to
investigate why later.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
1b02f89e4f packaging: kernel: Enable VIRTIO_IOMMU on x86_64
Cloud Hypervisor exposes a VIRTIO_IOMMU device to the VM when IOMMU support is
enabled. We need to add it to the whitelist because dragonball uses kernel
v5.10 which restricted VIRTIO_IOMMU to ARM64 only.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
3a1db7a86b runtime: clh: Support enabling iommu
by enabling IOMMU on the default PCI segment. For hotplug to work we need a
virtualized iommu and clh exposes one if there is some device or PCI segment
that requests it. I would have preferred to add a separate PCI segment for
hotplugging vfio devices but unfortunately kata assumes there is only one
segment all over the place. See create_pci_root_bus_path(),
split_vfio_pci_option() and grep for '0000'.

Enabling the IOMMU on the default PCI segment requires enabling IOMMU on
every device that is attached to it, which is why it is sprinkled all over the
place.

CLH does not support IOMMU for VirtioFs, so I've added a non IOMMU segment for
that device.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
9f1a42c6cc tests/vfio: Give commands 30s to execute
This is to catch the case of the guest getting stuck.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
b46b0ecf8b tests/vfio: Configure a value for 'hot_plug_vfio' for both vmms
This shouldn't be hiding behind only a qemu check, we need this for clh as
well.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
bfc93927fb runtime: Remove redundant check in checkPCIeConfig
There is no way for this branch to be hit, as port is only set when it is
different than config.NoPort.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
7c4e73b609 runtime: Add test cases for checkPCIeConfig
These test cases show which options are valid for CLH/Qemu, and test that we
correctly catch unsupported combinations.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
fc51e4b9eb runtime: Check config for supported CLH (cold|hot)_plug_vfio values
The only supported options are hot_plug_vfio=root-port or no-port.
cold_plug_vfio is not supported yet.
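
The rule can be sketched as follows.  This is an illustration in
Python, not the Go code in the runtime; the function name is
hypothetical, and treating "no-port" as the "unset" value of
cold_plug_vfio is an assumption:

```python
NO_PORT = "no-port"
ROOT_PORT = "root-port"


def check_clh_vfio_config(hot_plug_vfio=NO_PORT, cold_plug_vfio=NO_PORT):
    """Raise ValueError for vfio port settings this commit describes as unsupported on CLH."""
    # Assumption: "no-port" means cold plug was not requested at all.
    if cold_plug_vfio != NO_PORT:
        raise ValueError(f"cold_plug_vfio={cold_plug_vfio} is not supported for CLH yet")
    if hot_plug_vfio not in (ROOT_PORT, NO_PORT):
        raise ValueError(f"hot_plug_vfio={hot_plug_vfio} is not supported for CLH")
```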

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
509771e6f5 runtime: clh: Add hot_plug_vfio entry to config
hot_plug_vfio needs to be set to root-port, otherwise attaching vfio devices to
CLH VMs fails. Either cold_plug_vfio or hot_plug_vfio is required, and we have
not implemented support for cold_plug_vfio in CLH yet.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
5f6475a28a tests/vfio: Gather debug info and disable tdp_mmu
tdp_mmu had some issues up until around Linux v6.3 that make it work
particularly badly when running nested on Hyper-V. Reload the module at the
start of the test and disable the tdp_mmu param.

Gather debug info at the end of the test to make it easier to figure out what
went wrong. This uses github actions group syntax so that each section can be
collapsed.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
8fffdc81c5 tests/vfio: Capture journal from vm
For debugging (though this doesn't get exposed yet).

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
df815087e7 tests/vfio: Change to get the test working in GHA
- reduce memory and cpu usage to fit in a D4s_v5
- source correct lib
- mount workspace from 9p
- disable cpu mitigations for speed
- drop unused commands and variables
- install containerd
- install kata from built artifacts

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
a92ddeea15 tests/vfio: Move dependency installation to gha-run.sh
To match the flow of other github actions workflows.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
5a551a85b1 gha: vfio: Import jobs scripts from tests repo
This imports the vfio test scripts from github.com/kata-containers/tests. The
test case doesn't work yet but doing the changes in a separate commit will make it
easier to track the changes. The only change in this commit is renaming
vfio_jenkins_job_build.sh -> vfio_fedora_vm_wrapper.sh

Fixes: #6555
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Fabiano Fidêncio
a1e3fa7ac4 Merge pull request #7905 from microsoft/danmihai1/mariner-annotations
tests: fix kernel and initrd annotations
2023-09-14 10:37:42 +02:00
GabyCT
1d331124ad Merge pull request #7925 from GabyCT/topic/bandwidthlimit
metrics: Add iperf bandwidth value for kata metrics
2023-09-13 17:43:55 -06:00
Gabriela Cervantes
49e2fa189c metrics: Increase jitter value for qemu
This PR increases the jitter value for qemu for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-13 22:36:09 +00:00
Gabriela Cervantes
49234433a7 metrics: Increase value limit for jitter in clh
This PR increases the value limit for jitter in clh.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-13 21:27:08 +00:00
David Esparza
0a24d3f718 Merge pull request #7923 from GabyCT/topic/addcassandradoc
metrics: Add Cassandra Metrics documentation
2023-09-13 10:17:00 -06:00
GabyCT
c565053bac Merge pull request #7895 from GabyCT/topic/removewarning
metrics: Remove warning from metrics documentation
2023-09-13 10:16:38 -06:00
Fabiano Fidêncio
8b9df1d32e Merge pull request #7929 from fidencio/topic/use-tcp-port-ping-on-docker-nerdctl-tests
ci: docker: nerdctl: Switch to tcp port 80 ping
2023-09-13 15:46:31 +02:00
Peng Tao
55ca7e8aec Merge pull request #7907 from Xuanqing-Shi/7876/network-devices-naming-conflict
runtime: Naming conflict of network devices
2023-09-13 19:29:41 +08:00
Fabiano Fidêncio
813bfdec01 ci: docker: nerdctl: Use io.containerd.kata-${KATA_HYPERVISOR}.io
This will ensure that we're calling the correct binary for the
hypervisor.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:10:14 +02:00
Fabiano Fidêncio
46bc0b1c01 ci: nerdctl: Create the containerd config
Otherwise we'll fail to configure kata-containers in the `install-kata`
step.

This is mostly needed because the nerdctl-full tarball doesn't provide a
containerd configuration, just the binary, as containerd does not
actually require a configuration file to run with the default config.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:00:57 +02:00
Fabiano Fidêncio
13968aa7f6 ci: nerdctl: Switch to tcp port 80 ping
TIL that the Azure VMs we use are created without explicit outbound
connectivity defined.

This leads to issues using `ping ...` as part of our tests, and when
consulting Jeremi Piotrowski about the issue he pointed me to two
interesting links:
* https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access
* https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity

For your own sanity, do not read the comments, after all this is
internet. :-)

Anyways, the suggestion is to use nping instead, which is provided by
the nmap package, so we can explicitly switch to using the tcp port 80
for the ping.  With this in mind, I'm switching the image we use for the
test to one that provides nping as a possible entry point, and
from now on (this part of) the tests should work.
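
To illustrate the idea of a "port ping", here is a minimal Python
sketch that checks reachability with a TCP connect to port 80 instead
of ICMP.  It is not what the tests run (they use nping from a container
image); the function name and the default timeout are made up for the
example:

```python
import socket


def tcp_port_reachable(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # A completed TCP handshake is enough; no data is exchanged.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Unlike ICMP echo, this only depends on outbound TCP, which is what the
links above recommend relying on for Azure VMs.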

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:00:57 +02:00
Fabiano Fidêncio
e0c811678b ci: docker: Switch to tcp port 80 ping
TIL that the Azure VMs we use are created without explicit outbound
connectivity defined.

This leads to issues using `ping ...` as part of our tests, and when
consulting Jeremi Piotrowski about the issue he pointed me to two
interesting links:
* https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access
* https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity

For your own sanity, do not read the comments, after all this is
internet. :-)

Anyways, the suggestion is to use nping instead, which is provided by
the nmap package, so we can explicitly switch to using the tcp port 80
for the ping.  With this in mind, I'm switching the image we use for the
test to one that provides nping as a possible entry point, and
from now on (this part of) the tests should work.

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:00:57 +02:00
shixuanqing
1636abbe1c runtime: issue with non-empty []Endpoint in RemoveEndpoints
In RemoveEndpoints(), when the endpoints parameter isn't empty, using
the loop index (idx) to index into n.eps may remove the wrong
endpoints.  To fix this, pass the endpoint itself so that the correct
element within n.eps can be located.
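
The difference can be illustrated with a small Python sketch (the real
code is Go, in src/runtime/virtcontainers/network_linux.go; the function
below and the use of plain strings as endpoints are simplifications for
the example):

```python
def remove_endpoints(eps, endpoints):
    """Remove each requested endpoint from `eps` by looking up the endpoint itself.

    Using the position of an endpoint inside `endpoints` as an index into
    `eps` would delete unrelated entries whenever the two lists differ.
    """
    for ep in endpoints:
        for i, current in enumerate(eps):
            if current == ep:
                del eps[i]
                break
    return eps
```

For example, asking to remove only the third of three endpoints leaves
the first two in place, whereas indexing with the position in the
request list (0) would have removed the first one.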

Fixes: #7732

Signed-off-by: shixuanqing <1356292400@qq.com>

Update src/runtime/virtcontainers/network_linux.go

Co-authored-by: Xuewei Niu <justxuewei@apache.org>
2023-09-13 09:47:18 +00:00
Peng Tao
9766f9090c Merge pull request #7719 from beraldoleal/nullable
Remove gogoproto.nullable extension
2023-09-13 15:11:56 +08:00
David Esparza
c2b2a00ad9 Merge pull request #7899 from GabyCT/topic/startdocker
metrics: Ensure docker is running in init_env
2023-09-12 23:01:26 -06:00
Gabriela Cervantes
0aa073967d metrics: Add iperf bandwidth value for qemu
This PR adds the iperf bandwidth value for qemu for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-12 20:57:14 +00:00
Dan Mihai
c0ad914766 tests: fix kernel and initrd annotations
Fix kernel and initrd annotations in the k8s tests on Mariner. These
annotations must be applied to the spec.template for Deployment, Job
and ReplicationController resources.

Fixes: #7764

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-09-12 20:15:25 +00:00
Gabriela Cervantes
615c1cbf19 metrics: Add iperf bandwidth value for kata metrics
This PR adds the iperf bandwidth value for kata metrics.

Fixes #7924

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-12 19:30:24 +00:00
Gabriela Cervantes
d53eb73eec metrics: Ensure docker is running in init_env
This PR ensures that docker is running as part of the init_env function
in kata metrics, to avoid failures where docker is not running and the
kata metrics CI fails as a result.

Fixes #7898

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-12 19:13:09 +00:00
GabyCT
c0d502493e Merge pull request #7921 from dborquez/metrics_disable_fio_test
metrics: this PR skips the FIO test temporarily to fix issues
2023-09-12 12:08:48 -06:00
Gabriela Cervantes
ad08321b83 metrics: Add Cassandra Metrics documentation
This PR adds the Cassandra Metrics documentation for kata metrics.

Fixes #7922

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-12 16:30:35 +00:00
David Esparza
a58ea66592 metrics: this PR skips the FIO test temporarily to fix issues
FIO test is showing ongoing issues when running in k8s.
Working on running FIO on the ctr client which has been
shown to be stable.

Fixes: #7920

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-09-12 10:23:57 -06:00
Fabiano Fidêncio
2d8447fc6b Merge pull request #7916 from fidencio/topic/add-functional-nerdctl-tests
ci: Add a very basic nerdctl sanity test
2023-09-12 17:47:08 +02:00
James O. D. Hunt
7feb8de9dc Merge pull request #7887 from jodh-intel/hypervisor-remove-debug-kernel-options
runtime-rs: hypervisor: Remove debug kernel options
2023-09-12 16:31:48 +01:00
Fabiano Fidêncio
f536ef5ce1 ci: docker: Also run the smoke test with runc
This will help us to make sure that the failure is actually related to
Kata Containers.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-12 16:54:02 +02:00
Fabiano Fidêncio
c83f167c59 ci: docker: Run the tests after the kata-static is created
There's no reason to wait till the payload is created to run the tests,
as we rely on the tarball, not on the kata-deploy payload.

That was a mistake on my side, and that's already fixed for the nerdctl
tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-12 16:53:47 +02:00
Fabiano Fidêncio
12d833d07d ci: Add a very basic nerdctl sanity test
Let's add a very basic sanity test to check that we can spawn a
container using nerdctl + Kata Containers.

This will ensure that, at least, we don't regress to the point where
this feature doesn't work at all.

In the future, we should also test all the VMMs with devmapper, but
that's for a follow-up PR after this test is working as expected.

Fixes: #7911

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-12 16:52:55 +02:00
Greg Kurz
be71a0ab4e Merge pull request #7811 from stevenhorsman/bump-rust-to-1.72
versions: Bump rust version
2023-09-12 15:30:35 +02:00
Fabiano Fidêncio
b020912629 Merge pull request #7913 from fidencio/topic/add-functional-docker-tests
ci: Add a very basic docker sanity test
2023-09-12 15:28:49 +02:00
Fabiano Fidêncio
348b8644d6 ci: Add a very basic docker sanity test
Let's add a very basic sanity test to check that we can spawn a
container using docker + Kata Containers.

This will ensure that, at least, we don't regress to the point where
this feature doesn't work at all.

For now we're running this test against Cloud Hypervisor and QEMU only,
due to an already reported issue with dragonball:
https://github.com/kata-containers/kata-containers/issues/7912

In the future, we should also test all the VMMs with devmapper, but
that's for a follow-up PR after this test is working as expected.

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-12 15:15:26 +02:00
stevenhorsman
a75fd5eb81 runk: Fix rust unnecessary mut error
- Fix `error: variable does not need to be mutable`
in rust 1.72

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
a31c145172 kata-ctl: useless-vec warning
- Fix clippy::useless-vec warning

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
c8419fc3bb kata-ctl: Resolve non-minimal-cfg warning
- In rust 1.72, clippy warned clippy::non-minimal-cfg
as the cfg has only one condition, so doesn't
need to be wrapped in the any combinator.

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
3eaf68d954 agent-ctl: Allow clippy lint
- Allow `clippy::redundant-closure-call`
which has issues with the guard function passed into
the `run_if_auto_values` macro

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
1d8b78959d runtime-rs: Fix useless-vec warning
Fix clippy::useless-vec warning

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
99f3d69e94 runtime-rs: Remove mut
Fix `error: variable does not need to be mutable`

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
16fbc27b09 dragonball: Allow ambiguous-glob-reexports
The bindgen generated code is triggering lots of
ambiguous-glob-reexports warnings in rust 1.70+

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
bbf1919516 dragonball: Resolve non-minimal-cfg warning
- In rust 1.72, clippy warned clippy::non-minimal-cfg
as the cfg has only one condition, so doesn't
need to be wrapped in the all combinator.

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
75cfdd5d59 agent: config: Allow clippy lint
- Allow `clippy::redundant-closure-call` in `from_cmdline`
which has issues with the guard function passed into
the `parse_cmdline_param` macro

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
f3a0fd5907 agent: config: Fix useless-vec warning
Fix clippy::useless-vec warning

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
9e423bd3d6 libs: Fix clippy unnecessary hashes error
- Fix error: unnecessary hashes around raw string literal

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
444395050a versions: Bump rust version
Bump rust to 1.72.0 to test what extra warnings/issues we get

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
Yipeng Yin
a16b0962b5 chore(cargo): update cargo lock
Update cargo lock for runtime-rs, agent and kata-ctl.

Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
2023-09-12 15:27:38 +08:00
Chao Wu
c800d0739f Merge pull request #7889 from UiPath/fix-dragonball-build
dragonball: fix for non-deterministic builds
2023-09-12 14:06:18 +08:00
shixuanqing
ca4b6b051d runtime: Naming conflict of network devices
When creating a new endpoint, we check existing endpoint names and automatically adjust the naming of the new endpoint to ensure uniqueness.
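
One way to picture this is the Python sketch below.  It is only an
illustration: the function name, the "prefix plus index" naming scheme
and the "first free index" policy are assumptions, not a description of
the Go implementation in the runtime:

```python
def unique_endpoint_name(prefix, existing_names):
    """Return the first "<prefix><n>" that is not already in use."""
    taken = set(existing_names)
    idx = 0
    # Bump the index until the candidate name no longer collides.
    while f"{prefix}{idx}" in taken:
        idx += 1
    return f"{prefix}{idx}"
```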

Fixes: #7876

Signed-off-by: shixuanqing <1356292400@qq.com>
2023-09-12 04:29:51 +00:00
Guixiong Wei
202049f35e feat(runtime-rs): introduce huge page type to select VM RAM's backend
This commit allows us to specify the huge page backend when enabling huge
page. Currently, we support two backends: thp and hugetlbfs, the default
is hugetlbfs.

To ensure backward compatibility, we introduce another configuration item
"hugepage_type" to select the memory backend, which is available only when
"enable_hugepages" is true. Besides, we add an annotation
"io.katacontainers.config.hypervisor.hugepage_type" to configure huge page
type per pod.

Fixes: #6703

Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
2023-09-12 11:28:27 +08:00
Zhongtao Hu
e1f54f96d0 Merge pull request #7766 from Apokleos/wrap-vsock-virtiofs
runtime-rs: bring hybrid vsock devices in manager.
2023-09-12 09:27:34 +08:00
GabyCT
af29eeb8b1 Merge pull request #7901 from fidencio/topic/ci-target-branch-fixes-follow-up-3
ci: use github.ref_name instead of $GITHUB_REF_NAME
2023-09-11 15:31:29 -06:00
Fabiano Fidêncio
f811b064ca ci: use github.ref_name instead of $GITHUB_REF_NAME
As, regardless of what's mentioned in the documentation, it seems that
$GITHUB_REF_NAME is passed down as a literal string.

Fixes: #7414

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-11 22:14:55 +02:00
Fabiano Fidêncio
dc0b350e49 Merge pull request #7900 from fidencio/topic/ci-target-branch-fixes-follow-up-2
ci: Add more target-branch related fixes
2023-09-11 21:26:26 +02:00
Fabiano Fidêncio
6d795c089e ci: Add more target-branch related fixes
The ones for payload-after-push.yaml and ci-nightly.yaml are not that
important right now, but they're needed for when we start running
those on stable branches as well.

The other ones were missed during
bd24afcf73.

Fixes: #7414

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-11 20:42:57 +02:00
Fabiano Fidêncio
07d0ad0ad7 Merge pull request #7897 from fidencio/topic/ci-devmapper-do-the-rebase-as-well
ci: Fix target-branch usage
2023-09-11 20:30:53 +02:00
Fabiano Fidêncio
d7f991d139 Merge pull request #7151 from Yuan-Zhuo/fix-systemd-cgroup
agent: optimize the code of systemd cgroup manager
2023-09-11 20:15:51 +02:00
Fabiano Fidêncio
8509c31870 ci: Fix target-branch usage
We missed these as part of bd24afcf73.

Fixes: #7414

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-11 20:10:27 +02:00
Gabriela Cervantes
060499dcae metrics: Remove warning from metrics documentation
Now that the metrics migration from the tests to kata containers has been completed, this PR removes the warning from the main metrics documentation.

Fixes #7894

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-11 16:41:48 +00:00
GabyCT
b384757ac7 Merge pull request #7874 from fidencio/topic/manually-rebase-branches-atop-of-the-target-one
gha: Manually rebase PR atop of the target branch before testing
2023-09-11 10:35:01 -06:00
Fabiano Fidêncio
46e73cf7a2 Merge pull request #7884 from fidencio/topic/update-kernel-to-the-latest-lts-plus-bring-in-erofs-patches
Update kernel to the latest LTS release (v6.1.52) and bring in erofs patches needed for the CC work
2023-09-11 13:58:43 +02:00
James O. D. Hunt
c0f697fcc5 runtime: Allow kernel_params annotation
To support the removal of the `initcall_debug` and `earlyprintk=`
options from the default guest kernel cmdline, add `kernel_params` to the list
of enabled annotations to allow those kernel options (or others) to be
set using `kata-deploy` for either runtime.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-11 12:12:12 +01:00
Alexandru Matei
b03e49794e dragonball: fix for non-deterministic builds
Fixes: #7888

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2023-09-11 14:07:10 +03:00
Fabiano Fidêncio
93bad13769 Merge pull request #7875 from fidencio/topic/kata-deploy-fix-arm64-image-build
kata-deploy: Fix aarch64 image build
2023-09-11 11:36:52 +02:00
James O. D. Hunt
976d10150c runtime-rs: hypervisor: Remove debug kernel options
Removed the following kernel command line options:

- `earlyprintk=ttyS0`
- `initcall_debug`

Both these options are only useful when debugging a guest kernel failure
which is not a common occurrence.

Further, the `earlyprintk=` option can have a large negative performance
impact (it can increase the VM boot time significantly).

If the user wishes to use either of these options, they can add them to the
`kernel_params=` setting in the Kata configuration file's hypervisor
stanza.

Fixes: #7886.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-11 09:43:39 +01:00
Fabiano Fidêncio
fde34610cd kernel: Add erofs patches needed for CC related work
All the patches have already been merged upstream and they've just been
cherry-picked to this branch.

Fixes: #7885

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-11 10:39:37 +02:00
Fabiano Fidêncio
dc6a4588a2 versions: Bump kernel to the latest LTS release (6.1.52)
We're bumping here in order to make our lives easier backporting EROFS
patches needed for the CC related work.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-11 10:32:16 +02:00
James O. D. Hunt
52f6449b70 kata-manager: Remove initcall_debug kernel option
Removed the addition of the `initcall_debug` kernel option when agent
debugging enabled. This option has nothing to do with the agent.

If the user wishes to use this option, they can add it to the
`kernel_params=` setting in the Kata configuration file's hypervisor
stanza.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-11 09:31:44 +01:00
Fabiano Fidêncio
6cd5d83a37 Merge pull request #7865 from gkurz/fix-more-virtiofs-args
runtime: Fix more virtiofs args
2023-09-09 21:30:16 +02:00
Fabiano Fidêncio
8b4a0b368f kata-deploy: Remove curl after it's used
There's no need to keep curl there after the kubectl binary has already
been downloaded.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-09 10:52:05 +02:00
Fabiano Fidêncio
139c7f03ab kata-deploy: Fix aarch64 image build
Similarly to what's been done for x86_64 -> amd64, we need to do an
aarch64 -> arm64 change in order to be able to download the kubectl
binary.
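
A minimal sketch of the mapping described above. The `kubectl_arch` helper is hypothetical and is not taken from the kata-deploy sources; it only illustrates translating `uname -m` output into the architecture names used for kubectl downloads.

```sh
# Hypothetical helper (not from the kata-deploy sources): translate the
# machine name reported by `uname -m` into the architecture name used in
# kubectl download URLs.
kubectl_arch() {
    case "$1" in
        x86_64)  echo "amd64" ;;
        aarch64) echo "arm64" ;;
        *)       echo "$1" ;;   # pass other names (e.g. s390x) through unchanged
    esac
}

# Print the mapped name for the machine running this script.
kubectl_arch "$(uname -m)"
```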

Fixes: #7861

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-09 10:51:52 +02:00
Fabiano Fidêncio
94f5a69346 Merge pull request #7862 from fidencio/topic/kata-deploy-use-alpine-as-base-image
kata-deploy: Switch to an alpine image
2023-09-09 09:02:13 +02:00
Yuan-Zhuo
470d065415 agent: optimize the code of systemd cgroup manager
1. Directly support CgroupManager::freeze through systemd API.
2. Avoid always passing unit_name by storing it into DBusClient.
3. Realize CgroupManager::destroy more accurately by killing the systemd unit rather than stopping it.
4. Ignore no such unit error when destroying systemd unit.
5. Update zbus version and corresponding interface file.

Acknowledgement: error handling for no such systemd unit error refers to

Fixes: #7080, #7142, #7143, #7166

Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
2023-09-09 13:56:43 +08:00
GabyCT
fa818bfad1 Merge pull request #7867 from GabyCT/topic/optimizedimage
metrics: Use TensorFlow optimized image
2023-09-08 11:34:21 -06:00
Fabiano Fidêncio
bd24afcf73 gha: Manually rebase PR atop of the target branch before testing
We're changing what's been done as part of ac939c458c, as we've
noticed issues using `github.event.pull_request.merge_commit_sha`.

Basically, whenever a force-push would happen, the reference of
merge_commit_sha wouldn't be updated, leading us to test PRs with the
old code. :-/

In order to get the rebase properly working, we need to ensure we pull
the hash of the commit as part of checkout action, and ensure
fetch-depth is set to 0.

Fixes: #7414

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 18:56:31 +02:00
GabyCT
dc7414f5c1 Merge pull request #7870 from dborquez/metrics_fio_fix_clean_env_order
metrics: fix FIO test initialization
2023-09-08 10:28:10 -06:00
Greg Kurz
72c510d057 runtime/virtiofsd: Drop all references to "--cache=none"
This syntax belongs to the legacy C virtiofsd implementation that
we don't support anymore since kata-containers 3.1.3 because
of other API breaking changes.

People have been warned to switch from "none" to "never" since
kata-containers 2.5.2. Let's officially do that.

The compat code that would convert "none" to "never" isn't
needed anymore. Just drop it.

Fixes #7864

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-09-08 17:57:30 +02:00
Beraldo Leal
ead724bec1 protocol: removing gogo.nullable feature
gogo.nullable is the main gogo.protobuf feature used here. Since we are
trying to remove gogo.protobuf, the first reasonable step seems to be
removing this feature. This is a core update, and it will change how the
structs are defined. I could spot only a few places using those structs,
based on make check/build.

Fixes #7723.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-08 11:49:01 -04:00
Beraldo Leal
d8e4bb9859 protocol: remove unused PROTO_FILE env
There is no reference to PROTO_FILE and this is not working. Also we are
not inside a Makefile, so it makes sense to adapt the usage to reflect the
script instead of a make command.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-08 11:49:01 -04:00
Beraldo Leal
5e1106a770 protocol: remove unused import_path
import_path is used as the default package when no input files specify
go_package. However, all the files we are currently building already
have a go_package definition, making this behavior both redundant and
error-prone.

Additionally, one of our files (types.pb.go) resides outside the grpc
directory, indicating that it's indeed ignored but also inconsistent.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-08 11:49:01 -04:00
Beraldo Leal
87accaaecb protocol: use workdir during build
Currently, the script searches for .proto files within $GOPATH/.
Consequently, modifications to a definition file in the current working
directory won't influence the output .pb.go if the directory is outside
of $GOPATH. For developers, it's more intuitive to alter the local
codebase than the version stored in $GOPATH.

With this modification, the generated .pb.go files will be relative to
the current working directory, removing the need to clone this project
under $GOPATH/src/github.com/kata-containers.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-08 11:49:01 -04:00
Beraldo Leal
711a7ed965 protocol: remove mapping definitions
The definitions are already specified in the .proto files using the
go_package option. Centralizing them in one location reduces the
potential for errors and simplifies the script.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-08 11:49:01 -04:00
Beraldo Leal
8db84c1bd2 protocol: force GOPATH to be set
Currently, if GOPATH is not set, errors will be raised since protoc is using
GOPATH to find packages.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-08 11:49:01 -04:00
Beraldo Leal
68156d77ac protocol: breaking lines to improve readability
Just a small change to improve the readability of modules before the
actual changes.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-08 11:49:01 -04:00
Fabiano Fidêncio
670a8e9c73 kata-deploy: Switch to an alpine image
This will make our image smaller, while still keeping its multi-arch
support.

Fixes: #7861

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 17:39:51 +02:00
Fabiano Fidêncio
0b26a5d053 Merge pull request #7871 from fidencio/topic/ci-add-k8s-devmapper-tests-follow-up-3
ci: k8s: Add clean-up-garm argument for gha-run.sh
2023-09-08 17:27:57 +02:00
Fabiano Fidêncio
9d74b7ccc9 k8s: ci: Skip "Pod quota" test with firecracker
The test is failing, and an issue has been opened to track it.
For now, let's skip it.

Issue:
https://github.com/kata-containers/kata-containers/issues/7873

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 15:51:46 +02:00
Fabiano Fidêncio
f6cd3930c5 ci: k8s: Remove useless skip statement from tests
There's absolutely no need to have the skip check as part of the test
itself when it's already done as part of the setup function.

We're only touching the files here that were touched in the previous
commit.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 14:25:29 +02:00
Fabiano Fidêncio
3cc20b47a6 ci: k8s: Also check for "fc" (for firecracker)
Let's keep both checks for now, but in the future we'll be able to
remove the check for "firecracker", as the hypervisor name used as part
of the GitHub Actions has to match what's used as part of the
kata-deploy stuff, which is `fc` (as in `kata-fc` for the runtime class)
instead of `firecracker`.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 14:25:24 +02:00
Fabiano Fidêncio
b5bad3cb0f ci: k8s: Add clean-up-garm argument for gha-run.sh
The tests are failing to finish as the argument is invalid.

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 14:04:50 +02:00
Fabiano Fidêncio
05e2e7636e Merge pull request #7868 from fidencio/topic/ci-add-k8s-devmapper-tests-follow-up-2
ci: k8s: Second round of fix-ups with the devmapper CI
2023-09-08 11:02:20 +02:00
Fabiano Fidêncio
aaec5a09f3 ci: k8s: devmapper tests should be using ubuntu 20.04
That's what we've been using as part of Jenkins, so let's ensure things
will work as they did before, and only after that consider upgrading the
base OS used for the tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 10:09:04 +02:00
Fabiano Fidêncio
27fa7d828d ci: k8s: Add a kata-deploy-garm target
We've been using the `kata-deploy-tdx` target as that also uses k3s as
base, but it's better to just have a specific garm target.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 10:09:04 +02:00
Fabiano Fidêncio
fa62a4c01b ci: k8s: Export KUBERNETES env var
So we have better control over which flavour of kubernetes kata-deploy
is expected to be targeting.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 10:09:04 +02:00
Fabiano Fidêncio
8c9380a798 ci: k8s: Install bats on GARM runners
GARM runners do not come with the whole set of tools we need, or are
used to when it comes to the GHA runners, so we need to manually install
bats on those.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 10:09:04 +02:00
Fabiano Fidêncio
3de23034f8 ci: k8s: Wait some time after restarting k3s
Let's put a 1 minute sleep, just to make sure everything is back up
again.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:46:58 +02:00
David Esparza
adfea55b8f metrics: fix FIO test initialization
This PR changes the order in which the FIO test first
cleans the environment and then checks if the environment
is indeed clean.

Fixes: #7869

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-09-07 15:41:59 -06:00
Fabiano Fidêncio
2df183fd99 ci: k8s: Append, instead of overwrite, the devmapper config
As we were using `tee` without the `-a` (or `--append`) option, the
containerd config would be overwritten, leading to a NotReady state of
the Node.
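
The difference can be illustrated on a scratch file. This is only a sketch: the file and its contents are made up for the demo, whereas the real target in this commit is the containerd configuration.

```sh
# Hypothetical demo on a temporary file standing in for a config file.
config="$(mktemp)"
echo "existing = true" > "${config}"

# Without -a, tee truncates the file before writing: the original line is lost.
echo "devmapper = true" | tee "${config}" > /dev/null
overwritten="$(cat "${config}")"

# With -a (--append), tee adds to the end and keeps what was already there.
echo "existing = true" > "${config}"
echo "devmapper = true" | tee -a "${config}" > /dev/null
appended="$(cat "${config}")"

rm -f "${config}"
```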

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
369a8af8f7 ci: k8s: Decrease k3s sleep from 4 to 2 minutes
It should be plenty, and worked well in local tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
ada65b988a ci: k8s: Use vanilla kubectl with k3s
Let's download the vanilla kubectl binary into `/usr/bin/`, as we need
to avoid hitting issues like:
```sh
error: open /etc/rancher/k3s/k3s.yaml.lock: permission denied
```

The issue basically happens because k3s links `/usr/local/bin/kubectl`
to `/usr/local/bin/k3s`, and that does extra stuff that vanilla
`kubectl` doesn't do.

Also, in order to properly use the k3s.yaml config with the vanilla
kubectl, we're copying it to ~/.kube/config.
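
A rough sketch of the copy step, assuming the default k3s kubeconfig location mentioned above. The `install_kubeconfig` helper and its optional second argument are hypothetical and only exist to keep the example self-contained.

```sh
# Hypothetical helper: copy a kubeconfig to the place vanilla kubectl reads
# by default (~/.kube/config), or to another directory given as $2.
install_kubeconfig() {
    kubeconfig_src="$1"
    kubeconfig_dir="${2:-${HOME}/.kube}"
    mkdir -p "${kubeconfig_dir}"
    cp "${kubeconfig_src}" "${kubeconfig_dir}/config"
}

# On a k3s host this would be called as follows (it may need elevated
# privileges if the file is not world-readable):
#   install_kubeconfig /etc/rancher/k3s/k3s.yaml
```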

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
ad45ab5d33 ci: k8s: Ensure k3s is deployed with --write-kubeconfig-mode=644
Otherwise the /etc/rancher/k3s/k3s.yaml is not readable by other users
than root.

As --write-kubeconfig-mode is being passed, and that's an option that has to
be passed to the `server`, -s is also added to the command line.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
028a97e0d5 ci: k8s: Use the proper command for sleep
`wait` waits for a job to complete, not a number of seconds.  Not sure
how I got that wrong in the first place, but it is what it is.
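
For reference, a small sketch of the two commands side by side. The background job and the temporary file are made up for the demo.

```sh
# `sleep N` pauses the script for N seconds.
sleep 1

# `wait` blocks until background jobs of this shell have finished; it takes
# job or process IDs, not a duration.
result_file="$(mktemp)"
( sleep 1; echo "done" > "${result_file}" ) &
wait "$!"

# By the time `wait` returns, the background job has written its result.
job_output="$(cat "${result_file}")"
rm -f "${result_file}"
```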

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
David Esparza
34f580901f Merge pull request #7824 from dborquez/fix_memory_usage_initialization
metrics: re-enable memory-usage initialization step
2023-09-07 14:24:27 -06:00
Gabriela Cervantes
3a427795ea metrics: Use TensorFlow optimized image
This PR replaces the ubuntu image for one which has TensorFlow optimized
for kata metrics.

Fixes #7866

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-07 15:38:51 +00:00
Chao Wu
cd8c217ee1 Merge pull request #6879 from openanolis/chao/update_upstream_upcall_feature
Dragonball: optimize the placement of dbs-upcall features
2023-09-07 18:07:53 +08:00
Fabiano Fidêncio
dfa1cce916 Merge pull request #7860 from fidencio/topic/ci-add-k8s-devmapper-tests-follow-up-1
ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml
2023-09-07 11:48:30 +02:00
Fabiano Fidêncio
8d99972a8a ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml
integrations -> integration
integrtion -> integration

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 11:31:30 +02:00
Fabiano Fidêncio
0483d3d16d Merge pull request #7841 from fidencio/topic/ci-add-k8s-devmapper-tests
ci: k8s: Add k8s devmapper tests (part 0)
2023-09-07 10:53:09 +02:00
Jeremi Piotrowski
f6cc01d77c Merge pull request #7833 from jepio/kata-static-fix-ownership
kata-deploy: Create kata-static.tar with correct ownership
2023-09-07 10:16:23 +02:00
Peng Tao
435e890cd9 Merge pull request #7703 from bergwolf/github/nerdctl-fc
runtime: run prestart hooks before starting VM for FC
2023-09-07 10:55:31 +08:00
Chao Wu
deed1b927d Dragonball: optimize the placement of dbs-upcall features
Currently, the dbs-upcall features have two problems that need to be
fixed:

- There are redundant dbs-upcall features that need to be removed.
- Some places should be controlled by dbs-upcall, but that is not implemented yet.

This commit will fix those two problems.

fixes: #6878

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-09-07 10:27:29 +08:00
Fabiano Fidêncio
0e8bd50cbb ci: k8s: Add k8s devmapper tests (part 0)
Let's enable the devmapper kubernetes tests to match exactly what's been
tested as part of the Jenkins CI.

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-06 23:08:38 +02:00
Fabiano Fidêncio
b28b54df04 ci: k8s: Add a function to configure devmapper for containerd
This function right now is completely based on what's part of the tests
repo[0], and that's the reason I'm keeping the `Signed-off-by` of all
the contributors to that file.

This is not perfect, though, as it changes the default snapshotter to
devmapper, instead of only doing so for the Kata Containers specific
runtime handlers.  OTOH, this is exactly what we've always been doing as
part of the tests.

We'll improve it, soon enough, when we get to also add a way for
kata-deploy to set up different snapshotters for different handlers.
But, for now, this is as good (or as bad) as it's always been.

It's important to note that the devmapper setup doesn't take into
consideration a BM machine, and this is not suitable for that.  We're
really only targeting GHA runners which will be thrown away after the
run is over.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-06 23:08:17 +02:00
Fabiano Fidêncio
54f7117212 ci: k8s: Add a function to deploy k3s
One can use different kubernetes flavours for getting a kubernetes
cluster up and running.

As part of our CI, though, I really would like to avoid contributors
spending time maintaining and updating kubernetes dependencies, as done
with the tests repo, and which has been proven to be really good on
getting things rotten.

With this in mind, I'm taking the bullet and using "k3s" as the way to
deploy kubernetes for the devmapper related tests, and that's the reason
I'm adding a function to do so, and this will be used later on as part
of this series.

It's important to note that the k3s setup doesn't take into
consideration a BM machine, and this is not suitable for that.  We're
really only targeting GHA runners which will be thrown away after the
run is over.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-06 23:07:41 +02:00
David Esparza
cf258090aa Merge pull request #7843 from GabyCT/topic/ffiolimit
metrics: Add write 95 percentile FIO value
2023-09-06 14:52:00 -06:00
Fabiano Fidêncio
c5e1e7ddc3 Merge pull request #7854 from fidencio/topic/runtime-allow-virtio_fs_extra_args-annotation
runtime: Allow virtio_fs_extra_args annotation
2023-09-06 19:20:40 +02:00
Greg Kurz
81536f21af runtime/qemu: Pass "--xattr" to virtiofsd instead of "-o xattr"
The "-o" syntax belongs to the legacy C virtiofsd. It is deprecated
with the rust implementation.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-09-06 17:50:35 +02:00
Fabiano Fidêncio
b1dd09a4d3 runtime: Allow virtio_fs_extra_args annotation
Some use cases may just require passing extra arguments to virtiofsd,
and having this disabled by default makes it impossible to set when
using kata-deploy, as changes in the configuration file would be
overwritten by the daemon-set.

With this in mind, let's allow users to pass whatever they need (and
here I'm specifically looking at `--xattr`) as a virtio_fs_extra_arg.

Fixes: #7853

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-06 17:11:16 +02:00
Hyounggyu Choi
d27fe18167 Merge pull request #7849 from BbolroC/hot-fix-dockerbuild
packaging: do not install docker-compose-plugin for s390x|ppc64le
2023-09-06 13:13:25 +02:00
Hyounggyu Choi
2efda20c77 packaging: do not install docker-compose-plugin for s390x|ppc64le
This PR is to skip installing docker-compose-plugin while building a `build-kata-deploy` image for s390x|ppc64le.
It is a temporary solution to fix current CI failures for s390x regarding `hash sum mismatch`.

Fixes: #7848
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-09-06 11:12:03 +02:00
Zhongtao Hu
aa85e0b3ec Merge pull request #7714 from justxuewei/volumes-cleanup
runtime-rs: Fix volumes and rootfs cleanup issues
2023-09-06 10:13:55 +08:00
Gabriela Cervantes
438fbf9669 metrics: Add write 95 percentile for FIO for qemu
This PR adds the write 95 percentile for FIO for qemu for
checkmetrics for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 22:50:31 +00:00
Gabriela Cervantes
024b4d2ffe metrics: Add write 95 percentile FIO value
This PR adds the write 95 percentile FIO value for checkmetrics
for kata metrics.

Fixes #7842

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 21:00:05 +00:00
GabyCT
3e3a91fd2c Merge pull request #7577 from GabyCT/topic/enableiperfm
metrics: Enable iperf benchmark on gha for kata metrics
2023-09-05 14:53:47 -06:00
Gabriela Cervantes
e98e5cdea2 metrics: Add checkmetrics to gha run script
This PR adds the checkmetrics to gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 17:05:03 +00:00
Gabriela Cervantes
c1edfe5511 metrics: Add checkmetrics value for qemu for iperf
This PR adds the checkmetrics value for qemu for iperf benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 16:04:52 +00:00
Gabriela Cervantes
6a79ecedf9 metrics: Add jitter value for clh
This PR adds jitter value for clh for iperf metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 16:04:52 +00:00
Gabriela Cervantes
f609a9a754 metrics: Add test selector to iperf metrics
This PR adds test selector to iperf metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 16:04:52 +00:00
Gabriela Cervantes
5b8db30422 metrics: Enable iperf benchmark on gha for kata metrics
This PR enables the iperf benchmark to run on the gha for kata metrics.

Fixes #7575

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 16:04:52 +00:00
Jeremi Piotrowski
cf46b056fd Merge pull request #7839 from openanolis/chao/switch_to_azure
CI: switch static-checks-dragonball CI machines to Azure
2023-09-05 10:59:02 +02:00
Chao Wu
60f733d301 CI: switch static-checks-dragonball CI machines to Azure
Previously, static-checks-dragonball is using machines from Alibaba
Cloud to run all the CI jobs.

Currently, we are going through an internal process to apply for the new
machines for Dragonball CI. Before the internal process is over, we will
temporarily use Azure VM to run static-checks-dragonball jobs.

fixes: #7838

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-09-05 15:19:07 +08:00
alex.lyn
7870b33a2d runtime-rs: bring hybridVsock devices in manager.
Currently, virtio_vsock devices are still outside of the device
manager. This causes some management issues, such as the
inability to unify PCI address management.

Just do some work for hybrid vsock.

Fixes: #7655

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-09-05 08:46:56 +08:00
Jeremi Piotrowski
18c94ebbe3 kata-deploy: Create kata-static.tar with correct ownership
Pass --owner and --group to the tar invocation to prevent the GitHub runner user
from leaking into release artifacts.
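
A minimal sketch of the idea using GNU tar. The values passed in the actual change are not shown in this log, so root (uid/gid 0) is an assumption, and the directory layout is made up for the demo.

```sh
# Hypothetical demo tree; the real artifact is the kata-static tarball.
workdir="$(mktemp -d)"
mkdir -p "${workdir}/opt/kata"
echo "payload" > "${workdir}/opt/kata/file"

# Record uid/gid 0 in the archive instead of the ids of the user running tar
# (assumed values, see above).
tar --owner=0 --group=0 --numeric-owner \
    -C "${workdir}" -cf "${workdir}/kata-static.tar" opt

# List with numeric ids to inspect the ownership stored in the archive.
listing="$(tar --numeric-owner -tvf "${workdir}/kata-static.tar")"
echo "${listing}"

rm -rf "${workdir}"
```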

Fixes: #7832
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-04 17:24:00 +02:00
Fabiano Fidêncio
b663ec21ac Merge pull request #7803 from GabyCT/topic/readmereportdoc
metrics: Add README for kata metrics report
2023-09-03 21:57:13 +02:00
Fabiano Fidêncio
e490b0bc76 Merge pull request #7808 from ManaSugi/fix/remove-manual-chcon
osbuilder: Remove chcon operation for guest SELinux
2023-09-03 21:55:02 +02:00
Fabiano Fidêncio
27dab249a0 Merge pull request #7800 from jodh-intel/kata-sys-util-update-tdx-protection-checks
kata-sys-util: protection: Update TDX checks
2023-09-02 14:47:51 +02:00
Jiang Liu
d5729e818c Merge pull request #7819 from jiangliu/storage-cleanup
Improve the way to clean up storage devices for sandbox
2023-09-02 17:02:51 +08:00
Jiang Liu
57e7bf14a6 agent: refine StorageDeviceGeneric::cleanup()
Refine StorageDeviceGeneric::cleanup() to improve safety.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-09-02 14:22:21 +08:00
Jiang Liu
53edb19374 agent: implement StorageDeviceGeneric::cleanup()
Refactor cleanup_sandbox_storage as StorageDeviceGeneric::cleanup().

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-09-02 14:00:26 +08:00
Jiang Liu
0c63453e28 types: make StorageDevice::cleanup() return possible error code
Make StorageDevice::cleanup() return possible error code.

Fixes: #7818

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-09-02 13:27:06 +08:00
Jiang Liu
3a3d77b3b5 agent: move StorageDeviceGeneric from kata-types into agent
Move StorageDeviceGeneric from kata-types into agent, so we can
refactor code later.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-09-02 13:12:17 +08:00
Jiang Liu
d848126b61 Merge pull request #7821 from jiangliu/storage-leak
agent: avoid possible leakage of storage device
2023-09-02 12:40:40 +08:00
Fabiano Fidêncio
4f92e6df90 Merge pull request #7683 from microsoft/danmihai1/policy-tests
tests: add policy to existing tests
2023-09-01 23:52:15 +02:00
David Esparza
b151cfd140 metrics: re-enable memory-usage initialization step
This PR re-enables the initialization step disabled
on 538c965c2b.

Fixes: #7804

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-09-01 14:29:34 -06:00
Fabiano Fidêncio
f3e1a6a94f osbuilder: alpine: Change mirror
As we're hitting a lot of:
```
ERROR: https://dl-5.alpinelinux.org/alpine/v3.18/main: operation timed
out
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-01 16:01:42 +00:00
Fabiano Fidêncio
ac612aef5e osbuilder: alpine: Match the version on versions.yaml
We've switched to 3.18 as part of
82cd14ba39.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-01 16:01:33 +00:00
Jiang Liu
9cd706d1c9 agent: avoid possible leakage of storage device
When a storage device is used by more than one container, the second
and subsequent instances will leak the storage device reference count,
and thus leak the storage device. The reason is:
add_storages() will increase reference count of existing storage device,
but forget to add the device to the `mount_list` array, thus leak the
reference count.

Fixes: #7820

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-09-01 22:52:42 +08:00
Dan Mihai
bf21411e90 tests: add policy to k8s tests
Use AGENT_POLICY=yes when building the Guest images, and add a
permissive test policy to the k8s tests for:
- CBL-Mariner
- SEV
- SNP
- TDX

Also, add an example of policy rejecting ExecProcessRequest.

Fixes: #7667

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-09-01 14:28:08 +00:00
Dan Mihai
d0e0610679 runtime: config: use the SEV initrd for SNP
Thanks Unmesh Deodhar!

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-09-01 14:28:08 +00:00
Fabiano Fidêncio
67fed26f18 runtime: Use TDX image with in the qemu-tdx config
Let's make sure we use the TDX image as part of the QEMU TDX
configuration, which will help us to have the policies tested here.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-01 14:28:08 +00:00
Fabiano Fidêncio
f65ffb23da Merge pull request #7814 from fidencio/topic/gha-rebase-prs-atop-of-main-for-the-tests
gha: Rebase PR atop of the target branch before testing
2023-09-01 16:26:32 +02:00
Fabiano Fidêncio
ef70aeb6b8 Merge pull request #7817 from fidencio/topic/update-alpine-to-its-latest-release
versions: Update alpine to its 3.18 version
2023-09-01 14:51:58 +02:00
Fabiano Fidêncio
ac939c458c gha: Rebase atop of the target branch
We have two scenarios we care about here: jobs triggered by `pull_request`
events and jobs triggered by `pull_request_target` events.

`pull_request` event:
When using the checkout action, it'll already provide a "rebased atop of
main" repo for us, nothing else is needed, and that's basically what we
already have as part of the jobs in our CI.

`pull_request_target` event:
This one is a little bit tricky, as the checkout action, unless passed
a specific repo, gives us the PR checked out rebased atop of the HEAD of
the PR branch.  Jeremi Piotrowski nicely pointed out that we could use
github.event.pull_request.merge_commit_sha instead, which is the result
of the PR's branch with the official repo target branch.

Now, the only case where the contributor's rebase would still be needed
is when the action itself has been changed.

Fixes: #7414

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-01 11:23:31 +02:00
Jeremi Piotrowski
bde06758b1 Merge pull request #7761 from jepio/iocopy-fix-race
runtime: Fix data race in ioCopy
2023-09-01 09:30:54 +02:00
Fabiano Fidêncio
82cd14ba39 versions: Update alpine to its 3.18 version
3.15 will reach end of life 2 months from now.

Fixes: #7816

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-31 23:02:54 +02:00
GabyCT
d75c7b5f9c Merge pull request #7813 from GabyCT/topic/genreport
metrics: Add grabdata script for metrics report
2023-08-31 13:33:38 -06:00
Gabriela Cervantes
6668825752 metrics: Add grabdata script for metrics report
This PR adds the grabdata script so it can be used for the metrics report
for kata metrics.

Fixes #7812

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-31 16:17:29 +00:00
James O. D. Hunt
c290eaed8c kata-sys-util: protection: Update TDX checks
Update the protection checking code to detect newer versions of Intel
TDX (whose userland interface has now stabilised).

> **Note:** that we don't need to retain the existing behaviour since:
>
> - We haven't yet landed the TDX feature (#6448).
> - Systems wishing to use TDX will need to use the latest available
>   system components (such as firmware and host kernel).

Also added an explicit TDX unit test.

Fixes: #7384.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-08-31 16:15:15 +01:00
Fabiano Fidêncio
d7a996c686 gha: Update to checkout@v3 action
At this point we should always be using the latest checkout action.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-31 16:02:31 +02:00
Jeremi Piotrowski
d7612440b8 Merge pull request #7789 from beraldoleal/tests/amd
Fixes tests on AMD machines
2023-08-31 11:23:51 +02:00
Jeremi Piotrowski
c2ba29c15b runtime: Fix data race in ioCopy
IoCopy is a tricky function (I don't claim to fully understand its contract),
but here is what I see: The goroutine that runs it spawns 3 goroutines - one
for each stream to handle (stdin/stdout/stderr). The goroutine then waits for
the stream goroutines to exit. The idea is that when the process exits and is
closed, the stdout goroutine will be unblocked and close stdin - this should
unblock the stdin goroutine. The stderr goroutine will exit at the same time as
the stdout goroutine. The iocopy routine then closes all tty.io streams.

The problem is that the stdout goroutine decrements the WaitGroup before
closing the stdin stream, which causes the iocopy goroutine to race to close
the streams. Move the wg.Done() of the stdout routine past the close so that
*this* race becomes impossible. I can't guarantee that this doesn't affect some
unspecified behavior.

Fixes: #5031
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-08-31 10:17:38 +02:00
Manabu Sugimoto
211de08d9e osbuilder: Remove chcon operation for guest SELinux
Remove the `chcon` operation which adds `container_runtime_exec_t` label to
the `kata-agent` binary because the container-selinux package including
the 39f83cc74d
commit has been released officially.
Ref. https://centos.pkgs.org/9-stream/centos-appstream-x86_64/container-selinux-2.221.0-1.el9.noarch.rpm.html

The container-selinux package is installed in a guest rootfs when we create it with `SELinux = yes`,
and `restorecon` sets `container_runtime_exec_t` to the `kata-agent`.

Fixes: #7807

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-08-31 16:44:32 +09:00
GabyCT
b467f2ef68 Merge pull request #7772 from GabyCT/topic/fiolimit
metrics: Enable FIO limits for kata metrics
2023-08-30 14:49:04 -06:00
Gabriela Cervantes
9f21fa9b39 metrics: Add report generator link to general documentation
This PR adds the report generator link to general documentation.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 16:55:14 +00:00
Gabriela Cervantes
c0ed5ea0ad metrics: Add README for kata metrics report
This PR adds the README for kata metrics report.

Fixes #7802

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 16:36:08 +00:00
Fabiano Fidêncio
aa2b51a831 Merge pull request #7783 from GabyCT/topic/makereport
metrics: Add metrics report script
2023-08-30 17:11:39 +02:00
Gabriela Cervantes
a7b59a5bf9 metrics: Add limit for 90 percentile for qemu value
This PR adds the limit for 90 percentile for qemu value for
FIO kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 13:53:38 +00:00
Gabriela Cervantes
99db6568e9 metrics: Add limit for write 90 percentile value for clh
This PR adds the limit for write 90 percentile value for clh for
FIO metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 13:53:38 +00:00
Gabriela Cervantes
6e06392c55 metrics: Enable FIO limits for kata metrics
This PR enables the FIO limits for kata metrics.

Fixes #7771

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 13:53:38 +00:00
David Esparza
924d06a7f5 Merge pull request #7787 from GabyCT/topic/fixmemoryinsidelimit
metrics: Fix memory inside limits for kata metrics
2023-08-30 07:45:17 -06:00
Peng Tao
2e4c874726 runtime/vc: runPrestartHooks should ignore GetHypervisorPid failure
If we are running FC hypervisor, it is not started when prestart hooks
are executed. So we should ignore such an error and go ahead and
run the hooks.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-08-30 03:06:11 +00:00
Peng Tao
21204caf20 runtime: fail early when starting docker container with FC
FC does not support network device hotplug. Let's add a check to fail
early when starting containers created by docker.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-08-30 02:52:01 +00:00
Peng Tao
32fd013716 runtime: run prestart hooks before starting VM for FC
Add a new hypervisor capability to tell if it supports device hotplug.
If not, we should run prestart hooks before starting new VMs as nerdctl
is using the prestart hooks to set up netns. To make nerdctl + FC
work, we need to run the prestart hooks before starting new VMs.
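
A minimal Go sketch of the capability-driven ordering, assuming a boolean
hotplug capability. `hypervisorCaps` and `startupSteps` are hypothetical names
used only for illustration, not the runtime's API.

```go
package main

import "fmt"

// hypervisorCaps is a hypothetical capability set; only the hotplug bit
// matters for this sketch.
type hypervisorCaps struct {
	deviceHotplug bool
}

// startupSteps returns the order in which the two steps would run.
func startupSteps(caps hypervisorCaps) []string {
	if !caps.deviceHotplug {
		// No hotplug (e.g. FC): the netns must be set up before the VM boots.
		return []string{"prestart hooks", "start VM"}
	}
	// With hotplug, devices can still be attached after the VM has started.
	return []string{"start VM", "prestart hooks"}
}

func main() {
	fmt.Println("no hotplug:", startupSteps(hypervisorCaps{deviceHotplug: false}))
	fmt.Println("hotplug:   ", startupSteps(hypervisorCaps{deviceHotplug: true}))
}
```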

Fixes: #6384
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-08-30 02:52:01 +00:00
Beraldo Leal
00e7ffd988 tests: check vmx only on Intel machines
When running on amd machines, those tests will fail because there is no
vmx flag. Following other tests that check for cpuType, let's adapt
them to restrict vmx only on Intel machines.

Fixes #7788.
Related #5066

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-08-29 20:04:31 -04:00
Gabriela Cervantes
c8dd3c0737 metrics: Fix memory footprint qemu limit
This PR fixes the memory footprint qemu limit for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 22:51:21 +00:00
Gabriela Cervantes
8877ec62fb metrics: Fix memory inside limits for kata metrics
This PR fixes the memory inside limit for clh for kata metrics due
to the recent changes that we had in the script, which affected
the performance measurement.

Fixes #7786

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 21:38:18 +00:00
Beraldo Leal
80146f2078 tests: Fixes cpuType check on AMD machines
cpuType is not initialized yet, so it gets 0 (Intel) by default, failing on
AMD machines.
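
The zero-value pitfall can be reproduced with a short Go sketch. The names
`cpuKind` and `detect` are hypothetical; only the "0 means Intel" default comes
from the description above.

```go
package main

import "fmt"

type cpuKind int

const (
	cpuIntel cpuKind = iota // 0, which is also the zero value
	cpuAMD
)

// cpuType is package-level and starts out as the zero value, i.e. cpuIntel.
var cpuType cpuKind

// detect is a hypothetical initializer mapping a CPU vendor string to a kind.
func detect(vendor string) cpuKind {
	if vendor == "AuthenticAMD" {
		return cpuAMD
	}
	return cpuIntel
}

func main() {
	// Before initialization every machine looks like Intel.
	fmt.Println("looks like Intel before init:", cpuType == cpuIntel)

	cpuType = detect("AuthenticAMD")
	fmt.Println("is AMD after init:", cpuType == cpuAMD)
}
```

Any check that reads `cpuType` before it has been set therefore takes the Intel
path, whatever the host is.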

Fixes #7785

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-08-29 17:04:07 -04:00
Gabriela Cervantes
7e364716dd metrics: Add test setup details to metrics report
This PR adds test setup details to metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:56:53 +00:00
Gabriela Cervantes
17dc1b9760 metrics: Add boot lifecycle times to metrics report
This PR adds the boot lifecycle times to metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:55:44 +00:00
Gabriela Cervantes
3b0d6538f2 metrics: Add memory inside container to metrics report
This PR adds memory inside container to metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:53:17 +00:00
Gabriela Cervantes
79fbb9d243 metrics: Add scaling system footprint in metrics report
This PR adds scaling system footprint in metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:51:27 +00:00
Gabriela Cervantes
8e6d4e6f3d metrics: Add metrics reportgen
This PR adds metrics reportgen for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:45:36 +00:00
Gabriela Cervantes
139ffd4f75 metrics: Add report file titles
This PR adds report file titles for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:43:06 +00:00
GabyCT
8f2dae7b53 Merge pull request #7775 from dborquez/fix_memory_usage_parsing_results
metrics: fix parsing issue on memory-usage test
2023-08-29 11:26:13 -06:00
Gabriela Cervantes
878d1a2e7d metrics: Generate PNGs alongside the PDF report
This PR generates the PNGs for the kata metrics PDF report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:50:32 +00:00
Gabriela Cervantes
fce2487971 metrics: Add metrics report R files
This PR adds the metrics report R files.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:45:22 +00:00
Gabriela Cervantes
08812074d1 metrics: Add report dockerfile
This PR adds the report dockerfile for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:28:32 +00:00
Gabriela Cervantes
69781fc027 metrics: Add metrics report script
This PR adds metrics report script for kata metrics.

Fixes #7782

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:25:14 +00:00
Chao Wu
e4fb20c74a Merge pull request #7585 from lifupan/main
dragonball: vsock add fifo/pipe stream support for passed fd hybridSt…
2023-08-29 23:39:21 +08:00
Fabiano Fidêncio
50e51bcafe Merge pull request #7185 from UnmeshDeodhar/add-cc-sev-test
tests: Add confidential test
2023-08-29 15:32:25 +02:00
Fabiano Fidêncio
e286e842c1 tests: Expand confidential test to support TDX
Let's expand the confidential test to also support TDX.

The main difference on the test, though, is that we're not grepping for
a string in the `dmesg` output, but rather relying on `cpuid` to detect
a TDX guest.

Fixes: #7184

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-29 14:10:47 +02:00
Unmesh Deodhar
e31f099be1 tests: Expand confidential test to support SNP
Let's expand the confidential test to also support SNP.

Fixes: #7184

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
2023-08-29 14:10:47 +02:00
Unmesh Deodhar
c3b9d4945e tests: Add confidential test for SEV
Add a test case for the launch of unencrypted confidential
container, verifying that we are running inside a TEE.

Right now the test only works with SEV, but it'll be expanded in the
coming commits, as part of this very same series.

Fixes: #7184

Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-29 14:10:34 +02:00
David Esparza
538c965c2b metrics: fix parsing issue on memory-usage test
This PR fixes an issue in the parsing results stage,
by collecting just the n-results from the n-running
containers, discarding irrelevant data.

Fixes: #7774

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-28 23:39:46 -06:00
Fabiano Fidêncio
708b0a3052 Merge pull request #7768 from fidencio/topic/update-tdx-to-the-6.2-kernel-based-stack
tdx: Update the components needed for using the 6.2 kernel stack
2023-08-28 19:27:15 +02:00
Fabiano Fidêncio
3818bf3311 local-build: Remove $HOME/.docker/buildx/activity/default
The file can be removed between builds without causing any issue, and
leaving it around has been causing us some headache due to:
```
ERROR: open /home/runner/.docker/buildx/activity/default: permission denied
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:41:36 +02:00
Fabiano Fidêncio
d1b54ede29 qemu: tdx: Workaround SMP issue with TDX 1.5
`...,sockets=1,cores=numvcpus,threads=1,...` must be used.
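
As an illustration only, the topology fragment could be built like this in Go.
`smpTopology` is a hypothetical helper, and the elided parts of the option
string are deliberately left out.

```go
package main

import "fmt"

// smpTopology returns only the topology fragment named above: a single
// socket, one core per vCPU and one thread per core.
func smpTopology(numVCPUs int) string {
	return fmt.Sprintf("sockets=1,cores=%d,threads=1", numVCPUs)
}

func main() {
	fmt.Println(smpTopology(4))
}
```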

Fixes: #7770

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:41:36 +02:00
Archana Shinde
1e34220c41 qemu: tdx: Adapt to the TDX 1.5 stack
QEMU for TDX 1.5 makes use of private memory map/unmap.
Make changes to govmm to support this. Support for private backing fd
for memory is added as a knob to the qemu config.

Userspace's map/unmap operations are done by fallocate() ioctl on the
backing store fd.
Reference:
https://lore.kernel.org/linux-mm/20220519153713.819591-1-chao.p.peng@linux.intel.com/

Fixes: #7770

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:41:36 +02:00
Fabiano Fidêncio
8115a0522d versions: tdx: Update Kernel to 6.2 + TDX
This is the version that's been used and tested inside Intel, and it
matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15.

Fixes: #7770

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:11:34 +02:00
Fabiano Fidêncio
ec18180f34 versions: tdx: Update TDVF to the "edk2-stable202302"
This is the version that's been used and tested inside Intel, and it
matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15.

Fixes: #7770

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:11:34 +02:00
Fabiano Fidêncio
9803b24286 versions: tdx: Update QEMU to v7.2 + TDX v1.10
This is the version that's been used and tested inside Intel, and it
matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15.

Fixes: #7770

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:11:27 +02:00
Fabiano Fidêncio
02a08c956b Merge pull request #7754 from microsoft/danmihai1/pod-quota-deployment
tests: delete k8s deployment at the test's end
2023-08-27 17:52:00 +02:00
Fabiano Fidêncio
98037ced52 Merge pull request #7755 from microsoft/danmihai1/unique-test-name
tests: use unique test name
2023-08-27 17:27:40 +02:00
Zhongtao Hu
f0440a9cfe Merge pull request #7742 from frezcirno/fix-log-forwarder-loop
runtime-rs: check peer close in log_forwarder
2023-08-26 10:44:09 +08:00
Fabiano Fidêncio
16a610d788 Merge pull request #7758 from fidencio/topic/gha-avoid-fail-fast-till-everything-is-ultra-stable
gha: Avoid "fail-fast" in tests that are known to be flaky
2023-08-25 16:49:26 +02:00
Jiang Liu
91db888d83 Merge pull request #7602 from jiangliu/agent-storage
Refine storage device management for kata-agent
2023-08-25 22:20:18 +08:00
Zixuan Tan
dffc16e5b3 runtime-rs: check peer close in log_forwarder
The log_forwarder task does not check if the peer has closed, causing a
meaningless loop between “kata vm exit”, when the peer closes, and
“ShutdownContainer RPC received”, which aborts the log forwarder.

This patch fixes the problem.

Fixes: #7741

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2023-08-25 19:00:07 +08:00
Jiang Liu
aaa5ab1264 agent: simplify storage device by removing StorageDeviceObject
Simplify storage device implementation by removing StorageDeviceObject.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-25 17:23:16 +08:00
Fabiano Fidêncio
fb49d5d7ce gha: Avoid "fail-fast" in tests that are known to be flaky
Otherwise we'll have to re-run all the tests due to a flaky behaviour in
one of the parts.

Fixes: #7757

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-25 10:00:17 +02:00
Dan Mihai
183f51d6f6 tests: use unique test name
k8s-pid-ns.bats was already using the test name from
k8s-kill-all-process-in-container.bats - probably a copy/paste bug.

Fixes: #7753

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-25 03:41:06 +00:00
Dan Mihai
6a974679f2 tests: delete k8s deployment at the test's end
At the end of k8s-kill-all-process-in-container.bats, delete the
deployment it created.

Fixes: #7752

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-25 03:34:37 +00:00
David Esparza
686eb3878b Merge pull request #7751 from GabyCT/topic/unusednhwc
metrics: Remove unused variable in tensorflow nhwc script
2023-08-24 18:34:06 -06:00
Fabiano Fidêncio
f1d8e1f513 Merge pull request #7747 from fidencio/topic/kata-deploy-dont-try-to-remove-opt-kata
kata-deploy: Don't try to remove /opt/kata
2023-08-24 18:56:52 +02:00
Gabriela Cervantes
32a778b6da metrics: Remove unused variable in tensorflow nhwc script
This PR removes unused variable in tensorflow nhwc script.

Fixes #7750

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-24 15:54:27 +00:00
David Esparza
875a85ee14 Merge pull request #7736 from GabyCT/topic/tensorflowfp32
metrics: Add TensorFlow ResNet50 FP32 benchmark
2023-08-24 08:56:24 -06:00
Fabiano Fidêncio
d8f3ce6497 kata-deploy: Don't try to remove /opt/kata
The directory is a host path mount and cannot be removed from within the
container.  What we actually want to remove is whatever is inside that
directory.

This may raise errors like:
```
rm: cannot remove '/opt/kata/': Device or resource busy
```

Fixes: #7746

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-24 13:57:36 +02:00
Jeremi Piotrowski
71c90b994a Merge pull request #7745 from jepio/vfio-part-0
gha: vfio: Run on Ubuntu 23.04 runner
2023-08-24 12:15:19 +02:00
Greg Kurz
9991772b26 Merge pull request #7718 from littlejawa/fix_filemode_when_zero
kata-agent: use default filemode for block device when it is set to 0
2023-08-24 11:40:28 +02:00
Jeremi Piotrowski
936e8091a7 gha: vfio: Run on Ubuntu 23.04 runner
The vfio test requires nested-nested virtualization:

L0 Azure host
-> L1 Ubuntu VM
  -> L2 Fedora VM
    -> L3 Kata

This hits a kernel bug on v5.15 but works quite nicely on the v6.2 kernel
included in Ubuntu 23.04. We can switch back to Ubuntu 22.04 when they roll out
v6.2.

Fixes: #6555
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-08-24 10:10:02 +02:00
Jiang Liu
0e7248264d agent: move storage device related code into dedicated files
Move storage device related code into dedicated files.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 13:48:51 +08:00
Xuewei Niu
268e846558 runtime-rs: Fix volumes and rootfs cleanup issues
There are several processes for container exit:

- Non-detach mode: `Wait` request is sent by containerd, then
  `wait_process()` will be called eventually.
- Detach mode: `Wait` request is not sent, the `wait_process()` won’t be
  called.
    - Killed by ctr: For example, a container runs `tail -f /dev/null`, and
      is killed by `sudo ctr t kill -a -s SIGTERM <CID>`. Kill request is
      sent, then `kill_process()` will be called. User executes `sudo ctr c
      rm <CID>`, `Delete` request is sent, then `delete_process()` will be
      called.
    - Exited on its own: For example, a container runs `sleep 1s`. The
      container’s state goes to `Stopped` after 1 second. User executes
      the delete command as below.

Where do we do container cleanup things?

- `wait_process()`: No, because it won’t be called in detach mode.
- `delete_process()`: No, because it depends on when the user executes the
  delete command.
- `run_io_wait()`: Yes. A container is considered exited once its IO has ended,
  and this is always called once a container is launched.

Fixes: #7713

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-08-24 13:23:47 +08:00
Jiang Liu
8f49ee33b2 agent: refine storage related code a bit
Refine storage related code by:
- remove the STORAGE_HANDLER_LIST
- define type alias
- move code near to its caller

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 13:09:10 +08:00
Jiang Liu
60ca12ccb0 agent: switch to new storage subsystem
Switch to new storage subsystem to create a StorageDevice for each
storage object.

Fixes: #7614

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 13:09:09 +08:00
Jiang Liu
fcbda0b419 kata-types: introduce StorageDevice and StorageHandlerManager
Introduce StorageDevice and StorageHandlerManager, which will be used
to refine storage device management for kata-agent.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 13:08:55 +08:00
Jiang Liu
b03b1f6134 agent: simplify the way to manage storage object
Simplify the way to manage storage objects, and introduce
StorageStateCommon structures for coming extensions.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 12:58:24 +08:00
Jiang Liu
8392c71bf2 sys-util: support more mount flags in parse_mount_options()
Support more mount flags in parse_mount_options().

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 12:17:39 +08:00
Jiang Liu
c00d8f3d48 agent: use create_mount_destination() from kata-sys-util
Use create_mount_destination() from kata-sys-util crate to reduce
redundant code.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 12:17:38 +08:00
Jiang Liu
5e867f0538 types: add more mount related constants
Add more mount related constants.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 12:17:36 +08:00
Jiang Liu
880e6c9a76 agent: use function from kata-sys-utils to reduce code
Use function get_linux_mount_info() from kata-sys-util crate to share
common code.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 12:17:34 +08:00
QuanweiZhou
a6921dd837 Merge pull request #7698 from jiangliu/virtual-volume
kata-types: introduce KataVirtualVolume to support nydus, direct volume and image pull
2023-08-24 11:50:39 +08:00
Fabiano Fidêncio
7705c5962e Merge pull request #7728 from ManaSugi/fix/typo-test-toml
libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml
2023-08-23 23:55:41 +02:00
GabyCT
c1712e1930 Merge pull request #7737 from jepio/fix-local-build
local-build: Remove GID before creating group
2023-08-23 12:26:39 -06:00
Jeremi Piotrowski
3b881fbc0e local-build: Remove GID before creating group
docker install now creates a group with gid 999 which happens to match what we
need to get docker-in-docker to work. Remove the group first as we don't need
it.

Fixes: #7726
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-08-23 18:58:38 +02:00
David Esparza
ebce5d25a9 Merge pull request #7734 from fidencio/topic/kata-deploy-fix-removal
kata-deploy: Avoid failing on content removal
2023-08-23 10:29:57 -06:00
Gabriela Cervantes
959ca49447 metrics: Add TensorFlow ResNet50 fp32 Dockerfile
This PR adds the TensorFlow ResNet50 fp32 Dockerfile for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-23 16:24:58 +00:00
Gabriela Cervantes
4b7d72c4a8 metrics: Add TensorFlow ResNet50 FP32 benchmark
This PR adds TensorFlow ResNet50 FP32 benchmark for kata metrics.

Fixes #7735

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-23 16:21:09 +00:00
Fabiano Fidêncio
e7e4cc2182 Merge pull request #7716 from bergwolf/github/image-initrd-assets
runtime: fix image and initrd assets handling
2023-08-23 18:02:15 +02:00
Fabiano Fidêncio
5cba38c175 kata-deploy: Avoid failing on content removal
We can simply use `rm -f` all over the place and avoid the container
returning any error.

Fixes: #7733

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-23 16:49:26 +02:00
Peng Tao
18d42da21e runtime/fc: fix image/initrd annotation handling
Right now if we configure an image annotation and have a config file
setting initrd, the initrd config would override the image annotation.

Make sure annotations are preferred over config options in image and initrd
path handling.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-08-23 03:47:28 +00:00
Peng Tao
9fda7059a5 runtime/clh: fix image/initrd annotation handling
We should make sure annotations are preferred over
config options in image and initrd path handling.

Fixes: #7705
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-08-23 03:47:28 +00:00
Peng Tao
1a0092d631 runtime/qemu: fix image/initrd annotation handling
Right now if we configure an image annotation and have a config file
setting initrd, the initrd config would override the image annotation.

Add a helper function ImageOrInitrdAssetPath to make sure annotations
are preferred over config options in image and initrd path handling.
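
The precedence rule can be sketched like this. The real helper's signature is
not shown here, so `assetPaths` and `pickBootAsset` are hypothetical names used
only to illustrate "annotations first, config second".

```go
package main

import "fmt"

// assetPaths holds the optional image and initrd paths from one source.
type assetPaths struct {
	image  string
	initrd string
}

// pickBootAsset prefers anything set via annotations over the config file.
// Within one source it looks at the image before the initrd.
func pickBootAsset(annotations, config assetPaths) (kind, path string) {
	for _, src := range []assetPaths{annotations, config} {
		if src.image != "" {
			return "image", src.image
		}
		if src.initrd != "" {
			return "initrd", src.initrd
		}
	}
	return "", ""
}

func main() {
	// An image annotation must win over an initrd from the config file.
	kind, path := pickBootAsset(
		assetPaths{image: "/anno/kata.img"},
		assetPaths{initrd: "/etc/kata.initrd"},
	)
	fmt.Println(kind, path)
}
```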

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-08-23 03:47:27 +00:00
Manabu Sugimoto
22d8f335d6 libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml
Change `pdisable_guest_seccomp` to `disable_guest_seccomp`

Fixes: #7727

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-08-23 12:08:18 +09:00
GabyCT
b8990c0490 Merge pull request #7722 from GabyCT/topic/adddiskreadme
metrics: Add disk link to README
2023-08-22 12:29:54 -06:00
GabyCT
514d3d42b8 Merge pull request #7712 from GabyCT/topic/fixfiopath
metrics: Fix FIO path
2023-08-22 12:28:28 -06:00
Gabriela Cervantes
8afd158cef metrics: Add disk link to README
This PR adds disk link to README documentation for kata metrics.

Fixes #7721

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-22 16:20:31 +00:00
Julien Ropé
40914b25d4 kata-agent: use default filemode for block device when it is set to 0
When the FileMode field for the device is unset (0), use a default value instead
to allow the use of the device from the container.
This behaviour is seen from cri-o typically.

Note: this is what runc is doing, which is why regular containers don't have an
issue. This change makes sure kata behaves the same as runc.
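
A minimal sketch of the fallback, written in Go for illustration although the
agent is Rust. The default of 0666 is an assumption taken from runc's
behaviour, and `effectiveMode` is a hypothetical name.

```go
package main

import (
	"fmt"
	"os"
)

// defaultDeviceMode is an assumed fallback mirroring what runc applies when
// no mode is given.
const defaultDeviceMode os.FileMode = 0o666

// effectiveMode returns the requested mode, or the default when it is unset.
func effectiveMode(requested os.FileMode) os.FileMode {
	if requested == 0 {
		return defaultDeviceMode
	}
	return requested
}

func main() {
	fmt.Printf("unset -> %o, explicit -> %o\n", effectiveMode(0), effectiveMode(0o640))
}
```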

Fixes: #7717

Signed-off-by: Julien Ropé <jrope@redhat.com>
2023-08-22 16:08:14 +02:00
Fabiano Fidêncio
8032797418 Merge pull request #7708 from microsoft/danmihai1/kata-deploy-log
gha: capture additional kata-deploy output
2023-08-21 23:43:51 +02:00
David Esparza
d2c130ea69 Merge pull request #7710 from GabyCT/topic/fixpytorch1
metrics: Use function from metrics common in pytorch script
2023-08-21 15:31:24 -06:00
Gabriela Cervantes
eee2ee6eeb metrics: Fix FIO path
This PR fixes the FIO path for the FIO files.

Fixes #7711

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-21 21:06:04 +00:00
David Esparza
9347051592 Merge pull request #7666 from dborquez/metrics_improve_fio_test
metrics: Enable kata runtime in K8s for FIO test.
2023-08-21 13:51:57 -06:00
Gabriela Cervantes
39bc3488f5 metrics: Use function from metrics common in pytorch script
This PR uses a common function in the pytorch script.

Fixes #7709

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-21 16:12:35 +00:00
Dan Mihai
400eb88743 gha: capture additional kata-deploy output
10 lines can be insufficient for diagnostics.

Fixes: #7707

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-21 15:58:57 +00:00
GabyCT
700759232f Merge pull request #7690 from GabyCT/topic/fixpytorch
metrics: Fix README for pytorch
2023-08-21 09:50:14 -06:00
Jiang Liu
6e038e66e4 Merge pull request #7680 from GabyCT/topic/removetime
metrics: Remove unused variable in tensorflow mobilenet script
2023-08-21 23:39:07 +08:00
Jiang Liu
4aee3eade0 kata-types: implement serde methods for KataVirtualVolume
Implement serialization/deserialization methods for KataVirtualVolume.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-21 16:46:56 +08:00
Jiang Liu
b875e39323 kata-types: validate KataVirtualVolume object
Implement method validate() for KataVirtualVolume to validate message
format.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-21 16:42:07 +08:00
Jiang Liu
fa2fdc1057 kata-types: implement two conversion helpers for KataVirtualVolume
Enable conversions from NydusExtraOptions/DirectVolumeMountInfo to
KataVirtualVolume.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-21 16:35:26 +08:00
Jiang Liu
6326af20e3 kata-types: introduce KataVirtualVolume
Introduce structure KataVirtualVolume to encapsulate information
for extra mount options and direct volumes, so we could build a common
infrastructure to handle these cases.

Fixes: #7699

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-21 16:19:47 +08:00
Gabriela Cervantes
c8b43f8b3e metrics: Fix README for pytorch
This PR fixes the pytorch reference in the README file.

Fixes #7689

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-18 20:14:49 +00:00
Aurélien
fa34d61805 Merge pull request #7664 from microsoft/danmihai1/agent-init-policy
rootfs: agent: Policy support with AGENT_INIT=yes
2023-08-18 10:51:55 -07:00
Fabiano Fidêncio
7e66d1f6b5 Merge pull request #7649 from fidencio/topic/k8s-tests-remove-kata-deploy-tests
gha: k8s: kata-deploy: Move kata-deploy specific tests from integration/kubernetes to functional/kata-deploy
2023-08-18 07:47:26 +02:00
David Esparza
fb571f8be9 metrics: Enable kata runtime in K8s for FIO test.
This PR configures the corresponding kata runtime in K8s
based on the tested hypervisor.

This PR also enables FIO metrics test in the kata metrics-ci.

Fixes: #7665

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-17 17:11:27 -06:00
Dan Mihai
cb056f8cb3 rootfs: agent: Policy support with AGENT_INIT=yes
When building with AGENT_POLICY=yes and AGENT_INIT=yes:
1. Include OPA and the Policy settings in rootfs.
2. Start OPA from the kata agent.

Before these changes, building with both AGENT_POLICY=yes and
AGENT_INIT=yes was unsupported.

Starting OPA from systemd (when AGENT_INIT=no) was already supported.

Fixes: #7615

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-17 22:37:58 +00:00
GabyCT
c358056a3f Merge pull request #7685 from GabyCT/topic/changename
metrics: Fix check results for tensorflow benchmark
2023-08-17 15:39:43 -06:00
Gabriela Cervantes
85c02828e1 metrics: Update tensorflow name in gha run script
This PR updates the tensorflow name in the gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-17 20:17:48 +00:00
Gabriela Cervantes
e8a5119343 metrics: Fix check results for tensorflow benchmark
This PR fixes the check results for tensorflow benchmark now
that we changed the name of the test.

Fixes #7684

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-17 19:52:45 +00:00
Fabiano Fidêncio
2d896ad12f gha: kata-deploy: Do the runtime class cleanup as part of the cleanup
Instead of doing this as part of the test itself, let's ensure it's done
before running the tests and during the tests cleanup.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 18:54:46 +02:00
Fabiano Fidêncio
4ffc2c86f3 gha: kata-deploy: Add the first kata-deploy test
This test, at least for now, only checks whether the runtimeclasses
have been properly created.

This is just a migration from a test we had as part of the k8s suite.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 18:54:46 +02:00
GabyCT
4ba684e6e4 Merge pull request #7653 from GabyCT/topic/tensorflowfp32
metrics: Add Tensorflow ResNet50 int8 benchmark
2023-08-17 10:44:25 -06:00
Gabriela Cervantes
8616c050ae metrics: Remove unused variable in tensorflow mobilenet script
This PR removes unused variable in tensorflow mobilenet script.

Fixes #7679

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-17 16:04:18 +00:00
Fabiano Fidêncio
285e616b5e tests: common: Ensure test_type is used as part of the cluster's name
By doing this we can make sure there won't be any clash on the cluster
name created for either the k8s or the kata-deploy tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 14:22:16 +02:00
Fabiano Fidêncio
790bd3548d tests: common: Don't fail if yq is not part of the cache
This may happen on external runners.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 14:22:14 +02:00
Fabiano Fidêncio
ce6adecd0a gha: kata-deploy: Add run-kata-deploy-tests.sh
This will have the same function as run-k8s-tests.sh has, but for
kata-deploy.

Right now it doesn't have any tests, and the command to actually run the
tests is commented out. This is just a placeholder that will be
populated sooner rather than later.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 09:49:03 +02:00
Fabiano Fidêncio
cfc29c11a3 gha: k8s: Stop running kata-deploy tests as part of the k8s suite
In a follow-up series, we'll add a whole suite for the kata-deploy
tests.  With this in mind, let's already get rid of this one and avoid
more kata-deploy tests landing here.

Fixes: #7642

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 09:48:54 +02:00
Fabiano Fidêncio
e470a650e0 Merge pull request #7654 from sprt/ci-fixes
kata-deploy: Properly create default runtime class
2023-08-17 09:43:34 +02:00
Wedson Almeida Filho
962378606e Merge pull request #7627 from wedsonaf/error-conv
agent: simplify error handling
2023-08-16 21:02:38 -03:00
Aurélien Bombo
f4dd152863 tests: k8s: Call ensure_yq() in setup.sh
It wasn't the `common.bash` import in `run_kubernetes_tests.sh` causing
the yq error so let's try this instead.

Reference: https://github.com/kata-containers/kata-containers/actions/runs/5674941359/job/15379797568#step:10:341

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-08-16 14:13:56 -07:00
GabyCT
3d0cfc88c9 Merge pull request #7662 from GabyCT/topic/fixhelptensorflow
metrics: Fix MobileNet help me description
2023-08-16 14:13:39 -06:00
Aurélien Bombo
339569b69c kata-deploy: Properly create default runtime class
The default `kata` runtime class would get created with the `kata`
handler instead of `kata-$KATA_HYPERVISOR`. This made Kata use the wrong
hypervisor and broke CI.

Fixes: #7663

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-08-16 11:04:44 -07:00
Gabriela Cervantes
2a491e9b1f metrics: Fix MobileNet help me description
This PR fixes MobileNet help me description in the
tensorflow script.

Fixes #7661

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-16 15:25:39 +00:00
Fabiano Fidêncio
606e419fac Merge pull request #7660 from fidencio/topic/add-kata-deploy-tests-as-part-of-the-ci
gha: ci: Start running kata-deploy tests
2023-08-16 16:44:08 +02:00
Fabiano Fidêncio
d19a75e80c gha: ci: Start running kata-deploy tests
Let's add the tests as part of the ci.yaml, so they can be triggered as
part of each PR.

For this PR those tests won't be triggered, courtesy of the
`pull_request_target` event we rely on.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-16 16:08:05 +02:00
Fabiano Fidêncio
4adcf2192e Merge pull request #7651 from ManaSugi/runk/containerd-test
runk: Modify kill command's error message for containerd tests
2023-08-16 15:37:48 +02:00
Zhongtao Hu
5c8a61a4c8 Merge pull request #7558 from openanolis/fix/driver_option
runtime-rs: add driver option
2023-08-16 13:56:29 +08:00
Zhongtao Hu
d90f7ac689 runtime-rs: add unit test for block driver
add unit test for block driver

Fixes: #7539
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-08-16 11:45:27 +08:00
Zhongtao Hu
e44919f0da runtime-rs: add load_test_config for unit test
add load_test_config for unit test

Fixes: #7539
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-08-16 11:32:56 +08:00
Zhongtao Hu
7f48a69379 runtime-rs: add driver option
add driver option when handle linux devices

Fixes:#7539
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-08-16 11:32:49 +08:00
Gabriela Cervantes
bade6a5c3b docs: Fix TensorFlow word across the document
This PR fixes the spelling of TensorFlow so that it is uniform across
the whole document.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 20:13:05 +00:00
Fabiano Fidêncio
0bc48eab60 Merge pull request #7640 from fidencio/topic/gha-cri-containerd-enable-tests
gha: cri-containerd: Enable tests
2023-08-15 21:18:28 +02:00
Gabriela Cervantes
1a1b207760 docs: Add Tensorflow Resnet50 documentation
This PR adds the Tensorflow Resnet50 documentation.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 17:46:44 +00:00
Gabriela Cervantes
24baededc0 metrics: Add Dockerfile for ResNet50 int8
This PR adds the dockerfile for ResNet50 int8 benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 17:38:26 +00:00
Gabriela Cervantes
6d971ba8df metrics: Add Tensorflow ResNet50 int8 benchmark
This PR adds the Tensorflow ResNet50 int8 script for kata metrics.

Fixes #7652

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 17:30:22 +00:00
Manabu Sugimoto
25d151bd1b runk: Modify kill command's error message for containerd tests
The error message when the kill command is executed with the container's
state == Stopped should be "container not running" because the containerd
tests expect that OCI runtimes return the error message and compare it.
If the error message is different from the expected one, the tests fail.

Fixes: #7650

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-08-16 00:39:50 +09:00
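The behaviour described in commit 25d151bd1b can be pictured with the following shell sketch. It is not runk code (runk is written in Rust): `kill_container` is a hypothetical stand-in, and the lowercase state names are assumptions. The only detail taken from the commit message is the exact string `container not running`, which the containerd tests compare against.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a runtime's kill command; takes the container
# state as its only argument.
kill_container() {
    local state="$1"
    case "${state}" in
        stopped)
            # Exact wording matters: the caller compares this string.
            echo "container not running" >&2
            return 1
            ;;
        created|running)
            return 0
            ;;
        *)
            echo "unknown state: ${state}" >&2
            return 1
            ;;
    esac
}
```

A test harness in the style the commit describes would capture stderr and compare it verbatim, so any rewording of the message counts as a failure.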
GabyCT
0bbabeaaf8 Merge pull request #7644 from GabyCT/topic/renametensorflow
metrics: Rename tensorflow scripts
2023-08-15 09:23:24 -06:00
Fabiano Fidêncio
46d25d908d Merge pull request #7643 from fidencio/topic/add-functional-kata-deploy-tests
gha: tests: Add kata-deploy functional tests -- Part 1
2023-08-15 15:23:48 +02:00
Fabiano Fidêncio
b3592ab25c gha: cri-containerd: Enable tests
As the cri-containerd tests have been fully migrated to GHA, let's make
sure we get them running.

Fixes: #6543

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:32:42 +02:00
Fabiano Fidêncio
84dd02e0f9 gha: cri-containerd: Add timeout to the crictl calls on testContainerStop
As part of the runners, we're hitting a timeout that I cannot reproduce,
at all, when allocating the same instance and running the tests
manually.

The default timeout to connect to the server is 2s when using `crictl`.
Let's increase this to 20s.

It's fairly important to mention that in the first tests I used a
timeout of 10s, and that helped but we still hit issues every now and
then.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
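A minimal sketch of the timeout change from commit 84dd02e0f9, assuming the tests call `crictl` through a small wrapper. The wrapper name `crictl_with_timeout` and the variables `CRICTL_TIMEOUT` and `CRICTL_BIN` are hypothetical; `--timeout` is the `crictl` global option for the server connection timeout, whose default is 2s.

```shell
#!/usr/bin/env bash
# Assumption: 20s, the value the commit message settles on.
CRICTL_TIMEOUT="${CRICTL_TIMEOUT:-20s}"

# Hypothetical wrapper: run crictl with a longer connection timeout.
# CRICTL_BIN can be overridden (e.g. CRICTL_BIN=echo) for a dry run.
crictl_with_timeout() {
    "${CRICTL_BIN:-crictl}" --timeout "${CRICTL_TIMEOUT}" "$@"
}
```

With `CRICTL_BIN=echo`, `crictl_with_timeout stop abc` prints the command line that would be run, which is a cheap way to check the flag placement without a CRI runtime.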
Fabiano Fidêncio
b29782984a gha: cri-containerd: Show pod before deleting it
It'll help us to debug failures with the pod stop / pod delete.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
ae0930824a gha: cri-containerd: Print kata logs in case of error
We need this to fully understand the issues we're facing.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
6c8b2ffa60 gha: cri-containerd: Group containerd logs
This improves readability in case of failures by a lot.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
9e898701f5 gha: cri-containerd: Ensure RUNTIME takes KATA_HYPERVISOR into account
Short commit log says it all.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Wedson Almeida Filho
76dac8f22c agent: simplify error handling
We extend the `Result` and `Option` types with associated types that
allow converting a `Result<T, E>` and `Option<T>` into
`ttrpc::Result<T>`.

This allows the elimination of many `match` statements in favor of
calling the map function plus the `?` operator. This transformation
simplifies the code.

Fixes: #7624

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-08-15 06:55:27 -03:00
Fabiano Fidêncio
e107d1d94e Merge pull request #7574 from microsoft/danmihai1/policy
agent: runtime: add Agent Policy feature
2023-08-15 11:29:13 +02:00
Bin Liu
ea81eb6c2e Merge pull request #7169 from chethanah/runk/support-no-pid-ns
runk: Support without pid ns
2023-08-15 13:00:40 +08:00
Gabriela Cervantes
18a7fd8e4e metrics: Rename tensorflow scripts
This PR renames the tensorflow scripts to include the data format
that is being used as we will have multiple tests with different
data and model formats for tensorflow so this will help us to
distinguish them.

Fixes #7645

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-14 20:40:35 +00:00
GabyCT
a740c80251 Merge pull request #7626 from GabyCT/topic/cassandrak
metrics: Add Cassandra Kubernetes benchmark for kata metrics
2023-08-14 14:22:52 -06:00
GabyCT
4e5e39e8b3 Merge pull request #7618 from GabyCT/topic/addfunctionscommon
metrics: Add common functions to the common script
2023-08-14 14:22:30 -06:00
GabyCT
a19d471c01 Merge pull request #7629 from dborquez/metrics_improve_stopping_kata_components
metrics: fix the loop used to stop kata components
2023-08-14 14:22:06 -06:00
Fabiano Fidêncio
e55fa93db9 tests: kata-deploy: Add placeholder for kata-deploy-tests-on-tdx
This will not be tested as part of the PR, thanks to the
`pull_request_target` event, but we want it to be added so we can build
on top of it in an upcoming series.

Fixes: #7642

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-14 21:38:00 +02:00
Fabiano Fidêncio
d9ee17aaec tests: kata-deploy: Add placeholder for kata-deploy-tests-on-aks
This will not be tested as part of the PR, thanks to the
`pull_request_target` event, but we want it to be added so we can build
on top of it in an upcoming series.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-14 21:37:52 +02:00
Chelsea Mafrica
22465d22f0 Merge pull request #7638 from ManaSugi/fix/virtcontainers-doc
docs: Remove installation step in virtcontainers doc
2023-08-14 10:21:57 -07:00
Dan Mihai
ab829d1038 agent: runtime: add the Agent Policy feature
Fixes: #7573

To enable this feature, build your rootfs using AGENT_POLICY=yes. The
default is AGENT_POLICY=no.

Building rootfs using AGENT_POLICY=yes has the following effects:

1. The kata-opa service gets included in the Guest image.

2. The agent gets built using AGENT_POLICY=yes.

After this patch, the shim calls SetPolicy if and only if a Policy
annotation is attached to the sandbox/pod. When creating a sandbox/pod
that doesn't have an attached Policy annotation:

1. If the agent was built using AGENT_POLICY=yes, the new sandbox uses
   the default agent settings, that might include a default Policy too.

2. If the agent was built using AGENT_POLICY=no, the new sandbox is
   executed the same way as before this patch.

Any SetPolicy calls from the shim to the agent fail if the agent was
built using AGENT_POLICY=no.

If the agent was built using AGENT_POLICY=yes:

1. The agent reads the contents of a default policy file during sandbox
   start-up.

2. The agent then connects to the OPA service on localhost and sends
   the default policy to OPA.

3. If the shim calls SetPolicy:

   a. The agent checks if SetPolicy is allowed by the current
      policy (the current policy is typically the default policy
      mentioned above).

   b. If SetPolicy is allowed, the agent deletes the current policy
      from OPA and replaces it with the new policy it received from
      the shim.

   A typical new policy from the shim doesn't allow any future SetPolicy
   calls.

4. For every agent rpc API call, the agent asks OPA if that call
   should be allowed. OPA allows or denies a call based on the current
   policy, the name of the agent API, and the API call's inputs. The
   agent rejects any calls that are rejected by OPA.
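
The SetPolicy rules above can be summarised as a small decision table. The sketch below is illustrative only: the function name `policy_outcome`, its yes/no arguments, and the outcome labels are made up for this example and do not appear in the agent or the shim.

```shell
#!/usr/bin/env bash
# Hypothetical decision table. Arguments (each "yes" or "no"):
#   $1  agent built with AGENT_POLICY=yes
#   $2  sandbox/pod carries a Policy annotation
#   $3  current (typically default) policy allows SetPolicy
policy_outcome() {
    local built="$1" annotated="$2" allowed="${3:-no}"

    if [ "${annotated}" != "yes" ]; then
        # No annotation: the shim never calls SetPolicy.
        if [ "${built}" = "yes" ]; then
            echo "default-agent-settings"
        else
            echo "same-as-before-patch"
        fi
        return 0
    fi

    if [ "${built}" != "yes" ]; then
        # SetPolicy always fails on an agent built with AGENT_POLICY=no.
        echo "setpolicy-fails"
        return 1
    fi

    if [ "${allowed}" = "yes" ]; then
        # The current policy is deleted from OPA and replaced.
        echo "policy-replaced"
        return 0
    fi

    echo "setpolicy-rejected"
    return 1
}
```

For example, `policy_outcome yes yes yes` reports `policy-replaced`, while `policy_outcome no yes` reports `setpolicy-fails`.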

When building using AGENT_POLICY_DEBUG=yes, additional Policy logging
gets enabled in the agent. In particular, information about the inputs
for agent rpc API calls is logged in /tmp/policy.txt, on the Guest VM.
These inputs can be useful for investigating API calls that might have
been rejected by the Policy. Examples:

1. Load a failing policy file test1.rego on a different machine:

opa run --server --addr 127.0.0.1:8181 test1.rego

2. Collect the API inputs from Guest's /tmp/policy.txt and test on the
   machine where the failing policy has been loaded:

curl -X POST http://localhost:8181/v1/data/agent_policy/CreateContainerRequest \
--data-binary @test1-inputs.json

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-14 17:07:35 +00:00
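When replaying captured inputs with the `curl` command from commit ab829d1038, the reply from OPA's Data API is a JSON document whose `result` field carries the decision; the field is absent when the rule is undefined. The helper below is a rough sketch for reading that reply in a shell session. The name `opa_decision` is hypothetical, it assumes the queried rule evaluates to a boolean, and it uses a simple pattern match rather than a real JSON parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read an OPA Data API response on stdin and print
# "allow" only when it contains "result": true; anything else is "deny".
# Assumption: the rule is boolean and the response is a small, flat object.
opa_decision() {
    if grep -Eq '"result"[[:space:]]*:[[:space:]]*true'; then
        echo "allow"
    else
        echo "deny"
    fi
}

# Possible usage with the command from the commit message (not run here):
#   curl -s -X POST http://localhost:8181/v1/data/agent_policy/CreateContainerRequest \
#       --data-binary @test1-inputs.json | opa_decision
```

Treating a missing `result` as a denial keeps the helper conservative: an undefined rule is reported the same way as an explicit `false`.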
Fabiano Fidêncio
831e73ff91 tests: kata-deploy: Add functional/kata-deploy/gha-run.sh placeholder
Right now this file does nothing, as it's not even called by any GHA.
However, it'll be populated later on as part of a different series,
where we'll have kata-deploy specific tests running here.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-14 17:46:10 +02:00
Fabiano Fidêncio
af1b46bbf2 tests: Add gha-run-k8s-common.sh
Let's split a good portion of `tests/integration/kubernetes/gha-run.sh`
out, and put it in a place where it can be used by the soon-to-come
kata-deploy specific tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-14 17:45:58 +02:00
Jeremi Piotrowski
a57e7ffe14 Merge pull request #7211 from stevenhorsman/propogate-secrets
Propogate secrets, config maps etc into guest if sharedFS not available
2023-08-14 11:24:47 +02:00
Manabu Sugimoto
416445e7eb docs: Remove installation step in virtcontainers doc
Remove the installation step in the virtcontainers doc
because the virtcontainers install/uninstall targets have
been removed by 86723b51ae
and they are not used anymore.

Fixes: #7637

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-08-14 15:15:24 +09:00
David Esparza
767434d50a metrics: fix the loop used to stop kata components #7629
This PR fixed the loop that stops the kata-shim and the
hypervisors used in metrics checks.

Fixes: #7628

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-11 12:32:41 -06:00
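The commit message for 767434d50a does not say what was wrong with the loop, so the following is only a generic sketch of a loop that stops a list of components. The names `stop_kata_components` and `KILL_CMD` are hypothetical and are not taken from the metrics scripts.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: stop every component named on the command line.
# KILL_CMD is injectable so the loop can be dry-run (e.g. KILL_CMD=echo).
stop_kata_components() {
    local component
    for component in "$@"; do
        # pkill exits non-zero when no process matched; that is not an
        # error here, so keep going with the next component.
        ${KILL_CMD:-pkill -9 -f} "${component}" || true
    done
}
```

Running `KILL_CMD=echo stop_kata_components containerd-shim-kata-v2 qemu` lists the patterns that would be signalled, one per line, without killing anything.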
Gabriela Cervantes
5d0f0d43c7 metrics: Add cassandra statefulset yaml
This PR adds cassandra statefulset yaml for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:39 +00:00
Gabriela Cervantes
c1dcc1396f metrics: Add cassandra service yaml
This PR adds the cassandra service yaml for the benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:36 +00:00
Gabriela Cervantes
2297a0d1c5 metrics: Add block loop pvc yaml for cassandra
This PR adds block loop pvc yaml for cassandra test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:33 +00:00
Gabriela Cervantes
e3d511946f metrics: Add block loop pv yaml for cassandra test
This PR adds the block loop pv yaml for cassandra test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:29 +00:00
Gabriela Cervantes
9890271594 metrics: Add block loop pvc for cassandra test
This PR adds the block loop pvc for cassandra test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:19 +00:00
Gabriela Cervantes
349b89969a metrics: Add Cassandra Kubernetes benchmark for kata metrics
This PR adds Cassandra Kubernetes benchmark for kata metrics tests.

Fixes #7625

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:21:48 +00:00
stevenhorsman
8815ed0665 runtime: Remove config warnings
Remove the configuration file warnings about shared_fs = none,
now that there is a solution for updating ConfigMaps, secrets, etc.

Fixes: #7210
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-08-11 16:31:08 +01:00
Yohei Ueda
afe1a6ac5a agent: support copying of directories and symlinks
This patch allows copying of directories and symlinks when
static file copying is used between host and guest. This change is
necessary to support recursive file copying between shim and agent.

Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
(cherry picked from commit de232b8030)
2023-08-11 16:31:08 +01:00
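To illustrate why directories and symlinks need separate handling in a recursive copy such as the one in commit afe1a6ac5a, the sketch below walks a source tree and reports the kind of each entry. It is not the agent's implementation (the agent is written in Rust); `classify_tree` is a hypothetical name and the output format is invented for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: walk a tree and print how each entry would have to
# be copied. The extra glob patterns pick up hidden entries as well.
classify_tree() {
    local entry
    for entry in "$1"/* "$1"/.[!.]* "$1"/..?*; do
        if [ -L "${entry}" ]; then
            # Test for a symlink first: -d and -f follow links, so a link
            # to a directory would otherwise be walked as a directory.
            printf 'symlink %s -> %s\n' "${entry}" "$(readlink "${entry}")"
        elif [ -d "${entry}" ]; then
            printf 'dir %s\n' "${entry}"
            classify_tree "${entry}"
        elif [ -f "${entry}" ]; then
            printf 'file %s\n' "${entry}"
        fi
        # Unmatched glob patterns stay literal and fail all three tests.
    done
}
```

A copy routine built on this walk would recreate each symlink with its original target rather than copying what it points to, and would create each directory before descending into it.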
Pradipta Banerjee
ab13ef87ee runtime: propagate configmap/secrets etc changes for remote-hyp
For remote hypervisor, the configmap, secrets, downward-api or project-volumes are
copied from host to guest. This patch watches for changes to the host files
and copies the changes to the guest.

Note that configmap updates take significantly longer than updates via downward-api.
This is similar across runc and Kata runtimes.

Fixes: #7210

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Signed-off-by: Julien Ropé <jrope@redhat.com>
(cherry picked from commit 3081cd5f8e)
(cherry picked from commit 68ec673bc4d9cd853eee51b21a0e91fcec149aad)
2023-08-11 16:31:08 +01:00
Yohei Ueda
c074ec4df1 runtime: Copy shared files recursively
This patch enables recursive file copying
when filesystem sharing is not used.

Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
(cherry picked from commit 5422a056f2)
(cherry picked from commit 16055ce040bbd724be2916bc518d89b69c9e0ca5)

Fixes: #7210
2023-08-11 16:16:52 +01:00
Gabriela Cervantes
fdcd52ff78 metrics: Add check containers are running in tensorflow mobilenet
This PR adds the check that containers are running, which is defined in
the common script, to the tensorflow mobilenet script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:17:20 +00:00
Gabriela Cervantes
36337ee146 metrics: Add check containers are up in tensorflow script
This PR adds the check containers are up function from common
in tensorflow script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:15:18 +00:00
Gabriela Cervantes
f700f9b0ba metrics: Remove unused variable in tensorflow script
This PR removes an unused variable in tensorflow script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:13:37 +00:00
Gabriela Cervantes
833cf7a684 metrics: Add check containers are running function
This PR adds the check containers are running function to the common metrics
script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:12:22 +00:00
Gabriela Cervantes
918c783084 metrics: Add check containers are up in tensorflow mobilenet script
This PR adds the check containers are up function from the common script
to the tensorflow mobilenet script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:06:40 +00:00
Gabriela Cervantes
9d57a1fab4 metrics: Use check containers are up in tensorflow script
This PR uses the check containers are up from the common script
in the tensorflow script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:42:09 +00:00
Gabriela Cervantes
1c84680d8c metrics: Add check containers are up in common script
This PR adds check containers are up in common script for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:39:24 +00:00
Gabriela Cervantes
d3e57cf454 metrics: Use collect_results function in tensorflow mobilenet test
This PR uses the collect results function defined in common for
the tensorflow mobilenet test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:34:30 +00:00
Gabriela Cervantes
286de046af metrics: Remove collect results function definition
This PR removes the collect results function from tensorflow script
as it is going to be referenced in the common metrics script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:31:23 +00:00
Gabriela Cervantes
9879709aae metrics: Add common functions to the common script
This PR adds the collect results function to the common metrics
script.

Fixes #7617

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:27:11 +00:00
Fupan Li
39e67b06e9 dragonball: vsock add fifo/pipe stream support for passed fd hybridStream
Since the fd passed through the unix socket can be any
stream fd, such as a pipe/fifo fd or another socket
fd, we should treat it as a normal hybrid
stream instead of a unix stream.

Fixes:#7584

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2023-08-10 11:07:10 +08:00
Manabu Sugimoto
f1d8de9be6 runk: Allow runk to launch a container without pid namespace
Allow runk to launch a container even when users don't specify the
pid namespace in `config.json`, because general container runtimes
such as runc can also launch a container without the namespace.
Kata Containers, on the other hand, doesn't allow it for security reasons,
so this feature should be enabled only in runk.

Fixes: #7168

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-07-16 23:31:14 +05:30
685 changed files with 24953 additions and 35096 deletions

@@ -21,7 +21,7 @@ jobs:
steps:
- name: Checkout code to allow hub to communicate with the project
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
-uses: actions/checkout@v3
+uses: actions/checkout@v4
- name: Install hub extension script
run: |

@@ -39,7 +39,7 @@ jobs:
popd &>/dev/null
- name: Checkout code to allow hub to communicate with the project
-uses: actions/checkout@v2
+uses: actions/checkout@v4
- name: Add issue to issue backlog
env:

@@ -21,7 +21,16 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v1
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ github.event.pull_request.base.ref }}
- name: Install PR sizing label script
run: |

.github/workflows/basic-ci-amd64.yaml (new file, 200 lines)

@@ -0,0 +1,200 @@
name: CI | Basic amd64 tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-cri-containerd:
strategy:
# We can set this to true whenever we're 100% sure that
# the all the tests are not flaky, otherwise we'll fail
# all the tests due to a single flaky instance.
fail-fast: false
matrix:
containerd_version: ['lts', 'active']
vmm: ['clh', 'qemu']
runs-on: garm-ubuntu-2204-smaller
env:
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/cri-containerd/gha-run.sh install-kata kata-artifacts
- name: Run cri-containerd tests
run: bash tests/integration/cri-containerd/gha-run.sh run
run-containerd-stability:
strategy:
fail-fast: false
matrix:
containerd_version: ['lts', 'active']
vmm: ['clh', 'qemu']
runs-on: garm-ubuntu-2204-smaller
env:
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/stability/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/stability/gha-run.sh install-kata kata-artifacts
- name: Run containerd-stability tests
run: bash tests/stability/gha-run.sh run
run-nydus:
strategy:
# We can set this to true whenever we're 100% sure that
# the all the tests are not flaky, otherwise we'll fail
# all the tests due to a single flaky instance.
fail-fast: false
matrix:
containerd_version: ['lts', 'active']
vmm: ['clh', 'qemu', 'dragonball']
runs-on: garm-ubuntu-2204-smaller
env:
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/nydus/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/nydus/gha-run.sh install-kata kata-artifacts
- name: Run nydus tests
run: bash tests/integration/nydus/gha-run.sh run
run-runk:
runs-on: garm-ubuntu-2204-smaller
env:
CONTAINERD_VERSION: lts
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/runk/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/runk/gha-run.sh install-kata kata-artifacts
- name: Run tracing tests
run: bash tests/integration/runk/gha-run.sh run
run-vfio:
strategy:
fail-fast: false
matrix:
vmm: ['clh', 'qemu']
runs-on: garm-ubuntu-2304
env:
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/functional/vfio/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Run vfio tests
timeout-minutes: 15
run: bash tests/functional/vfio/gha-run.sh run

@@ -16,6 +16,10 @@ on:
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
build-asset:
@@ -23,9 +27,13 @@ jobs:
strategy:
matrix:
asset:
- agent
- agent-opa
- agent-ctl
- cloud-hypervisor
- cloud-hypervisor-glibc
- firecracker
- kata-ctl
- kernel
- kernel-sev
- kernel-dragonball-experimental
@@ -33,6 +41,7 @@ jobs:
- kernel-nvidia-gpu
- kernel-nvidia-gpu-snp
- kernel-nvidia-gpu-tdx-experimental
- log-parser-rs
- nydus
- ovmf
- ovmf-sev
@@ -44,12 +53,18 @@ jobs:
- rootfs-initrd
- rootfs-initrd-mariner
- rootfs-initrd-sev
- runk
- shim-v2
- tdvf
- trace-forwarder
- virtiofsd
stage:
- ${{ inputs.stage }}
exclude:
- asset: agent
stage: release
- asset: agent-opa
stage: release
- asset: cloud-hypervisor-glibc
stage: release
steps:
@@ -61,11 +76,17 @@ jobs:
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build ${{ matrix.asset }}
run: |
make "${KATA_ASSET}-tarball"
@@ -76,6 +97,10 @@ jobs:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v3
@@ -89,9 +114,15 @@ jobs:
runs-on: ubuntu-latest
needs: build-asset
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v3
with:

@@ -16,10 +16,14 @@ on:
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
build-asset:
runs-on: arm64
runs-on: arm64-builder
strategy:
matrix:
asset:
@@ -48,10 +52,17 @@ jobs:
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build ${{ matrix.asset }}
run: |
make "${KATA_ASSET}-tarball"
@@ -62,6 +73,10 @@ jobs:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v3
@@ -72,16 +87,22 @@ jobs:
if-no-files-found: error
create-kata-tarball:
runs-on: arm64
runs-on: arm64-builder
needs: build-asset
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v3
with:

@@ -16,6 +16,10 @@ on:
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
build-asset:
@@ -44,10 +48,17 @@ jobs:
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build ${{ matrix.asset }}
run: |
make "${KATA_ASSET}-tarball"
@@ -59,6 +70,10 @@ jobs:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v3
@@ -76,9 +91,15 @@ jobs:
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v3
with:

@@ -19,7 +19,7 @@ jobs:
steps:
- name: Checkout Code
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
-uses: actions/checkout@v3
+uses: actions/checkout@v4
- name: Generate Action
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: bash cargo-deny-generator.sh

@@ -1,170 +0,0 @@
name: CI | Publish CC runtime payload for amd64
on:
workflow_call:
inputs:
target-arch:
required: true
type: string
jobs:
build-asset:
runs-on: ubuntu-latest
strategy:
matrix:
measured_rootfs:
- no
asset:
- cloud-hypervisor
- qemu
- virtiofsd
- kernel-sev
- ovmf-sev
- ovmf
- qemu-snp-experimental
- qemu-tdx-experimental
- rootfs-initrd-sev
- cc-tdx-td-shim
- tdvf
include:
- measured_rootfs: yes
asset: kernel
- measured_rootfs: yes
asset: kernel-tdx-experimental
- measured_rootfs: yes
asset: cc-rootfs-image
- measured_rootfs: yes
asset: rootfs-image-tdx
steps:
- name: Login to Kata Containers quay.io
uses: docker/login-action@v2
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v3
with:
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Build ${{ matrix.asset }}
run: |
USE_CACHE="no" make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: yes
MEASURED_ROOTFS: ${{ matrix.measured_rootfs }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v3
with:
name: kata-artifacts
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 1
if-no-files-found: error
- name: store-artifact root_hash_tdx.txt
uses: actions/upload-artifact@v3
with:
name: root_hash_tdx.txt
path: tools/osbuilder/root_hash_tdx.txt
retention-days: 1
if-no-files-found: ignore
- name: store-artifact root_hash_vanilla.txt
uses: actions/upload-artifact@v3
with:
name: root_hash_vanilla.txt
path: tools/osbuilder/root_hash_vanilla.txt
retention-days: 1
if-no-files-found: ignore
build-asset-cc-shim-v2:
runs-on: ubuntu-latest
needs: build-asset
steps:
- name: Login to Kata Containers quay.io
uses: docker/login-action@v2
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v3
- name: Get root_hash_tdx.txt
uses: actions/download-artifact@v3
with:
name: root_hash_tdx.txt
path: tools/osbuilder/
- name: Get root_hash_vanilla.txt
uses: actions/download-artifact@v3
with:
name: root_hash_vanilla.txt
path: tools/osbuilder/
- name: Build cc-shim-v2
run: |
USE_CACHE="no" make cc-shim-v2-tarball
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
env:
PUSH_TO_REGISTRY: yes
MEASURED_ROOTFS: yes
- name: store-artifact cc-shim-v2
uses: actions/upload-artifact@v3
with:
name: kata-artifacts
path: kata-build/kata-static-cc-shim-v2.tar.xz
retention-days: 1
if-no-files-found: error
create-kata-tarball:
runs-on: ubuntu-latest
needs: [build-asset, build-asset-cc-shim-v2]
steps:
- uses: actions/checkout@v3
- name: get-artifacts
uses: actions/download-artifact@v3
with:
name: kata-artifacts
path: kata-artifacts
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts
- name: store-artifacts
uses: actions/upload-artifact@v3
with:
name: kata-static-tarball
path: kata-static.tar.xz
retention-days: 1
if-no-files-found: error
kata-payload:
needs: create-kata-tarball
runs-on: ubuntu-latest
steps:
- name: Login to Confidential Containers quay.io
uses: docker/login-action@v2
with:
registry: quay.io
username: ${{ secrets.COCO_QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.COCO_QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v3
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball
- name: build-and-push-kata-payload
id: build-and-push-kata-payload
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
$(pwd)/kata-static.tar.xz "quay.io/confidential-containers/runtime-payload-ci" \
"kata-containers-${{ inputs.target-arch }}"

View File

@@ -1,207 +0,0 @@
name: CI | Publish CC runtime payload for s390x
on:
workflow_call:
inputs:
target-arch:
required: true
type: string
jobs:
build-asset:
runs-on: s390x
strategy:
matrix:
measured_rootfs:
- no
asset:
- qemu
- cc-rootfs-initrd
- virtiofsd
include:
- measured_rootfs: yes
asset: kernel
- measured_rootfs: yes
asset: cc-rootfs-image
steps:
- name: Login to Kata Containers quay.io
uses: docker/login-action@v2
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- uses: actions/checkout@v3
with:
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Build ${{ matrix.asset }}
run: |
USE_CACHE="no" make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
sudo chown -R $(id -u):$(id -g) "kata-build"
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: yes
MEASURED_ROOTFS: ${{ matrix.measured_rootfs }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v3
with:
name: kata-artifacts-s390x
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 1
if-no-files-found: error
- name: store-artifact root_hash_vanilla.txt
uses: actions/upload-artifact@v3
with:
name: root_hash_vanilla.txt-s390x
path: tools/osbuilder/root_hash_vanilla.txt
retention-days: 1
if-no-files-found: ignore
build-asset-cc-shim-v2:
runs-on: s390x
needs: build-asset
steps:
- name: Login to Kata Containers quay.io
uses: docker/login-action@v2
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- uses: actions/checkout@v3
- name: Get root_hash_vanilla.txt
uses: actions/download-artifact@v3
with:
name: root_hash_vanilla.txt-s390x
path: tools/osbuilder/
- name: Build cc-shim-v2
run: |
USE_CACHE="no" make cc-shim-v2-tarball
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
env:
PUSH_TO_REGISTRY: yes
MEASURED_ROOTFS: yes
- name: store-artifact cc-shim-v2
uses: actions/upload-artifact@v3
with:
name: kata-artifacts-s390x
path: kata-build/kata-static-cc-shim-v2.tar.xz
retention-days: 1
if-no-files-found: error
build-asset-cc-se-image:
runs-on: s390x
needs: build-asset
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- uses: actions/checkout@v3
- name: get-artifacts
uses: actions/download-artifact@v3
with:
name: kata-artifacts-s390x
path: kata-artifacts
- name: Place a host key document
run: |
mkdir -p "host-key-document"
cp "${CI_HKD_PATH}" "host-key-document"
env:
CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}
- name: Build cc-se-image
run: |
base_dir=tools/packaging/kata-deploy/local-build/
cp -r kata-artifacts ${base_dir}/build
# Skip building dependant artifacts of cc-se-image-tarball
# because we already have them from the previous build
sed -i 's/\(^cc-se-image-tarball:\).*/\1/g' ${base_dir}/Makefile
USE_CACHE="no" make cc-se-image-tarball
build_dir=$(readlink -f build)
sudo cp -r "${build_dir}" "kata-build"
sudo chown -R $(id -u):$(id -g) "kata-build"
env:
HKD_PATH: "host-key-document"
- name: store-artifact cc-se-image
uses: actions/upload-artifact@v3
with:
name: kata-artifacts-s390x
path: kata-build/kata-static-cc-se-image.tar.xz
retention-days: 1
if-no-files-found: error
create-kata-tarball:
runs-on: s390x
needs: [build-asset, build-asset-cc-shim-v2, build-asset-cc-se-image]
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- uses: actions/checkout@v3
- name: get-artifacts
uses: actions/download-artifact@v3
with:
name: kata-artifacts-s390x
path: kata-artifacts
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts
- name: store-artifacts
uses: actions/upload-artifact@v3
with:
name: kata-static-tarball-s390x
path: kata-static.tar.xz
retention-days: 1
if-no-files-found: error
kata-payload:
needs: create-kata-tarball
runs-on: s390x
steps:
- name: Login to Confidential Containers quay.io
uses: docker/login-action@v2
with:
registry: quay.io
username: ${{ secrets.COCO_QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.COCO_QUAY_DEPLOYER_PASSWORD }}
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- uses: actions/checkout@v3
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball-s390x
- name: build-and-push-kata-payload
id: build-and-push-kata-payload
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
$(pwd)/kata-static.tar.xz "quay.io/confidential-containers/runtime-payload-ci" \
"kata-containers-${{ inputs.target-arch }}"

View File

@@ -1,47 +0,0 @@
name: CI | Publish Kata Containers payload for Confidential Containers
on:
push:
branches:
- CCv0
workflow_dispatch:
jobs:
build-assets-amd64:
uses: ./.github/workflows/cc-payload-after-push-amd64.yaml
with:
target-arch: amd64
secrets: inherit
build-assets-s390x:
uses: ./.github/workflows/cc-payload-after-push-s390x.yaml
with:
target-arch: s390x
secrets: inherit
publish:
runs-on: ubuntu-latest
needs: [build-assets-amd64, build-assets-s390x]
steps:
- name: Checkout repository
uses: actions/checkout@v3
- name: Login to Confidential Containers quay.io
uses: docker/login-action@v2
with:
registry: quay.io
username: ${{ secrets.COCO_QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.COCO_QUAY_DEPLOYER_PASSWORD }}
- name: Push commit multi-arch manifest
run: |
docker manifest create quay.io/confidential-containers/runtime-payload-ci:kata-containers-${GITHUB_SHA} \
--amend quay.io/confidential-containers/runtime-payload-ci:kata-containers-${GITHUB_SHA}-amd64 \
--amend quay.io/confidential-containers/runtime-payload-ci:kata-containers-${GITHUB_SHA}-s390x
docker manifest push quay.io/confidential-containers/runtime-payload-ci:kata-containers-${GITHUB_SHA}
- name: Push latest multi-arch manifest
run: |
docker manifest create quay.io/confidential-containers/runtime-payload-ci:kata-containers-latest \
--amend quay.io/confidential-containers/runtime-payload-ci:kata-containers-amd64 \
--amend quay.io/confidential-containers/runtime-payload-ci:kata-containers-s390x
docker manifest push quay.io/confidential-containers/runtime-payload-ci:kata-containers-latest

View File

@@ -1,155 +0,0 @@
name: Publish Kata Containers payload for Confidential Containers (amd64)
on:
workflow_call:
inputs:
target-arch:
required: true
type: string
jobs:
build-asset:
runs-on: ubuntu-latest
strategy:
matrix:
measured_rootfs:
- no
asset:
- cloud-hypervisor
- qemu
- virtiofsd
- kernel-sev
- kernel-snp-experimental
- ovmf-sev
- ovmf
- qemu-snp-experimental
- qemu-tdx-experimental
- rootfs-initrd-sev
- cc-tdx-td-shim
- tdvf
include:
- measured_rootfs: yes
asset: kernel
- measured_rootfs: yes
asset: kernel-tdx-experimental
- measured_rootfs: yes
asset: cc-rootfs-image
- measured_rootfs: yes
asset: rootfs-image-tdx
steps:
- uses: actions/checkout@v3
- name: Build ${{ matrix.asset }}
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
MEASURED_ROOTFS: ${{ matrix.measured_rootfs }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v3
with:
name: kata-artifacts
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 1
if-no-files-found: error
- name: store-artifact root_hash_tdx.txt
uses: actions/upload-artifact@v3
with:
name: root_hash_tdx.txt
path: tools/osbuilder/root_hash_tdx.txt
retention-days: 1
if-no-files-found: ignore
- name: store-artifact root_hash_vanilla.txt
uses: actions/upload-artifact@v3
with:
name: root_hash_vanilla.txt
path: tools/osbuilder/root_hash_vanilla.txt
retention-days: 1
if-no-files-found: ignore
build-asset-cc-shim-v2:
runs-on: ubuntu-latest
needs: build-asset
steps:
- uses: actions/checkout@v3
- name: Get root_hash_tdx.txt
uses: actions/download-artifact@v3
with:
name: root_hash_tdx.txt
path: tools/osbuilder/
- name: Get root_hash_vanilla.txt
uses: actions/download-artifact@v3
with:
name: root_hash_vanilla.txt
path: tools/osbuilder/
- name: Build cc-shim-v2
run: |
make cc-shim-v2-tarball
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
env:
MEASURED_ROOTFS: yes
- name: store-artifact cc-shim-v2
uses: actions/upload-artifact@v3
with:
name: kata-artifacts
path: kata-build/kata-static-cc-shim-v2.tar.xz
retention-days: 1
if-no-files-found: error
create-kata-tarball:
runs-on: ubuntu-latest
needs: [build-asset, build-asset-cc-shim-v2]
steps:
- uses: actions/checkout@v3
- name: get-artifacts
uses: actions/download-artifact@v3
with:
name: kata-artifacts
path: kata-artifacts
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts
- name: store-artifacts
uses: actions/upload-artifact@v3
with:
name: kata-static-tarball
path: kata-static.tar.xz
retention-days: 1
if-no-files-found: error
kata-payload:
needs: create-kata-tarball
runs-on: ubuntu-latest
steps:
- name: Login to quay.io
uses: docker/login-action@v2
with:
registry: quay.io
username: ${{ secrets.COCO_QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.COCO_QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v3
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball
- name: build-and-push-kata-payload
id: build-and-push-kata-payload
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
$(pwd)/kata-static.tar.xz \
"quay.io/confidential-containers/runtime-payload" \
"kata-containers-${{ inputs.target-arch }}"

View File

@@ -1,142 +0,0 @@
name: Publish Kata Containers payload for Confidential Containers (s390x)
on:
workflow_call:
inputs:
target-arch:
required: true
type: string
jobs:
build-asset:
runs-on: s390x
strategy:
matrix:
measured_rootfs:
- no
asset:
- qemu
- virtiofsd
include:
- measured_rootfs: yes
asset: kernel
- measured_rootfs: yes
asset: cc-rootfs-image
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- uses: actions/checkout@v3
- name: Build ${{ matrix.asset }}
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
MEASURED_ROOTFS: ${{ matrix.measured_rootfs }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v3
with:
name: kata-artifacts-s390x
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 1
if-no-files-found: error
- name: store-artifact root_hash_vanilla.txt
uses: actions/upload-artifact@v3
with:
name: root_hash_vanilla.txt-s390x
path: tools/osbuilder/root_hash_vanilla.txt
retention-days: 1
if-no-files-found: ignore
build-asset-cc-shim-v2:
runs-on: s390x
needs: build-asset
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- uses: actions/checkout@v3
- name: Get root_hash_vanilla.txt
uses: actions/download-artifact@v3
with:
name: root_hash_vanilla.txt-s390x
path: tools/osbuilder/
- name: Build cc-shim-v2
run: |
make cc-shim-v2-tarball
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
env:
MEASURED_ROOTFS: yes
- name: store-artifact cc-shim-v2
uses: actions/upload-artifact@v3
with:
name: kata-artifacts-s390x
path: kata-build/kata-static-cc-shim-v2.tar.xz
retention-days: 1
if-no-files-found: error
create-kata-tarball:
runs-on: s390x
needs: [build-asset, build-asset-cc-shim-v2]
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- uses: actions/checkout@v3
- name: get-artifacts
uses: actions/download-artifact@v3
with:
name: kata-artifacts-s390x
path: kata-artifacts
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts
- name: store-artifacts
uses: actions/upload-artifact@v3
with:
name: kata-static-tarball-s390x
path: kata-static.tar.xz
retention-days: 1
if-no-files-found: error
kata-payload:
needs: create-kata-tarball
runs-on: s390x
steps:
- name: Login to quay.io
uses: docker/login-action@v2
with:
registry: quay.io
username: ${{ secrets.COCO_QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.COCO_QUAY_DEPLOYER_PASSWORD }}
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- uses: actions/checkout@v3
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball-s390x
- name: build-and-push-kata-payload
id: build-and-push-kata-payload
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
$(pwd)/kata-static.tar.xz \
"quay.io/confidential-containers/runtime-payload" \
"kata-containers-${{ inputs.target-arch }}"

View File

@@ -1,46 +0,0 @@
name: Publish Kata Containers payload for Confidential Containers
on:
push:
tags:
- 'CC\-[0-9]+.[0-9]+.[0-9]+'
jobs:
build-assets-amd64:
uses: ./.github/workflows/cc-payload-amd64.yaml
with:
target-arch: amd64
secrets: inherit
build-assets-s390x:
uses: ./.github/workflows/cc-payload-s390x.yaml
with:
target-arch: s390x
secrets: inherit
publish:
runs-on: ubuntu-latest
needs: [build-assets-amd64, build-assets-s390x]
steps:
- name: Checkout repository
uses: actions/checkout@v3
- name: Login to Confidential Containers quay.io
uses: docker/login-action@v2
with:
registry: quay.io
username: ${{ secrets.COCO_QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.COCO_QUAY_DEPLOYER_PASSWORD }}
- name: Push commit multi-arch manifest
run: |
docker manifest create quay.io/confidential-containers/runtime-payload:kata-containers-${GITHUB_SHA} \
--amend quay.io/confidential-containers/runtime-payload:kata-containers-${GITHUB_SHA}-amd64 \
--amend quay.io/confidential-containers/runtime-payload:kata-containers-${GITHUB_SHA}-s390x
docker manifest push quay.io/confidential-containers/runtime-payload:kata-containers-${GITHUB_SHA}
- name: Push latest multi-arch manifest
run: |
docker manifest create quay.io/confidential-containers/runtime-payload:kata-containers-latest \
--amend quay.io/confidential-containers/runtime-payload:kata-containers-amd64 \
--amend quay.io/confidential-containers/runtime-payload:kata-containers-s390x
docker manifest push quay.io/confidential-containers/runtime-payload:kata-containers-latest
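
Both manifest steps above follow the same pattern: create a multi-arch manifest from per-architecture tags, then push it. A minimal dry-run sketch of that pattern follows. The helper name `print_manifest_cmds` and the placeholder commit id are assumptions made for illustration, and the function only prints the `docker manifest` commands rather than running them.

```shell
#!/bin/sh
# Dry-run sketch: print the docker manifest commands for a multi-arch tag.
# print_manifest_cmds is a hypothetical helper, not part of the repository.
# Arguments: <repo> <manifest-tag> <per-arch-tag-prefix> <arch>...
print_manifest_cmds() (
    repo="$1"
    manifest_tag="$2"
    arch_tag_prefix="$3"
    shift 3
    cmd="docker manifest create ${repo}:${manifest_tag}"
    for arch in "$@"; do
        # Each architecture-specific image is amended into the manifest list.
        cmd="${cmd} --amend ${repo}:${arch_tag_prefix}-${arch}"
    done
    echo "${cmd}"
    echo "docker manifest push ${repo}:${manifest_tag}"
)

repo="quay.io/confidential-containers/runtime-payload"
sha="${GITHUB_SHA:-0123abc}"   # placeholder value when run outside GitHub Actions

# Per-commit manifest, then the "latest" manifest.
print_manifest_cmds "${repo}" "kata-containers-${sha}" "kata-containers-${sha}" amd64 s390x
print_manifest_cmds "${repo}" "kata-containers-latest" "kata-containers" amd64 s390x
```

Printing the commands first makes it easy to confirm that the per-architecture tags referenced by `--amend` match the tags the build jobs actually pushed.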

View File

@@ -15,4 +15,5 @@ jobs:
commit-hash: ${{ github.sha }}
pr-number: "nightly"
tag: ${{ github.sha }}-nightly
target-branch: ${{ github.ref_name }}
secrets: inherit

View File

@@ -28,4 +28,5 @@ jobs:
commit-hash: ${{ github.event.pull_request.head.sha }}
pr-number: ${{ github.event.pull_request.number }}
tag: ${{ github.event.pull_request.number }}-${{ github.event.pull_request.head.sha }}
target-branch: ${{ github.event.pull_request.base.ref }}
secrets: inherit

View File

@@ -11,6 +11,10 @@ on:
tag:
required: true
type: string
target-branch:
required: false
type: string
default: ""
jobs:
build-kata-static-tarball-amd64:
@@ -18,6 +22,7 @@ jobs:
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
publish-kata-deploy-payload-amd64:
needs: build-kata-static-tarball-amd64
@@ -28,15 +33,23 @@ jobs:
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
build-and-publish-tee-confidential-unencrypted-image:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v3
uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
@@ -60,6 +73,54 @@ jobs:
platforms: linux/amd64, linux/s390x
file: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/Dockerfile
run-docker-tests-on-garm:
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/run-docker-tests-on-garm.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
run-nerdctl-tests-on-garm:
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/run-nerdctl-tests-on-garm.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
run-kata-deploy-tests-on-aks:
needs: publish-kata-deploy-payload-amd64
uses: ./.github/workflows/run-kata-deploy-tests-on-aks.yaml
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
run-kata-deploy-tests-on-garm:
needs: publish-kata-deploy-payload-amd64
uses: ./.github/workflows/run-kata-deploy-tests-on-garm.yaml
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
run-kata-monitor-tests:
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/run-kata-monitor-tests.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
run-k8s-tests-on-aks:
needs: publish-kata-deploy-payload-amd64
uses: ./.github/workflows/run-k8s-tests-on-aks.yaml
@@ -69,37 +130,43 @@ jobs:
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
run-k8s-tests-on-sev:
needs: [publish-kata-deploy-payload-amd64, build-and-publish-tee-confidential-unencrypted-image]
uses: ./.github/workflows/run-k8s-tests-on-sev.yaml
run-k8s-tests-on-garm:
needs: publish-kata-deploy-payload-amd64
uses: ./.github/workflows/run-k8s-tests-on-garm.yaml
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
run-k8s-tests-on-snp:
needs: [publish-kata-deploy-payload-amd64, build-and-publish-tee-confidential-unencrypted-image]
uses: ./.github/workflows/run-k8s-tests-on-snp.yaml
run-k8s-tests-with-crio-on-garm:
needs: publish-kata-deploy-payload-amd64
uses: ./.github/workflows/run-k8s-tests-with-crio-on-garm.yaml
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
run-k8s-tests-on-tdx:
run-kata-coco-tests:
needs: [publish-kata-deploy-payload-amd64, build-and-publish-tee-confidential-unencrypted-image]
uses: ./.github/workflows/run-k8s-tests-on-tdx.yaml
uses: ./.github/workflows/run-kata-coco-tests.yaml
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
run-metrics-tests:
needs: build-kata-static-tarball-amd64
@@ -107,24 +174,12 @@ jobs:
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
run-cri-containerd-tests:
run-basic-amd64-tests:
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/run-cri-containerd-tests.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
run-nydus-tests:
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/run-nydus-tests.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
run-vfio-tests:
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/run-vfio-tests.yaml
uses: ./.github/workflows/basic-ci-amd64.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}

View File

@@ -51,7 +51,7 @@ jobs:
uses: tim-actions/commit-message-checker-with-regex@v0.3.1
with:
commits: ${{ steps.get-pr-commits.outputs.commits }}
pattern: '^.{0,75}(\n.*)*$|^Merge pull request (?:kata-containers)?#[\d]+ from.*'
pattern: '^.{0,75}(\n.*)*$'
error: 'Subject too long (max 75)'
post_error: ${{ env.error_msg }}
@@ -102,6 +102,6 @@ jobs:
uses: tim-actions/commit-message-checker-with-regex@v0.3.1
with:
commits: ${{ steps.get-pr-commits.outputs.commits }}
pattern: '^[\s\t]*[^:\s\t]+[\s\t]*:|^Merge pull request (?:kata-containers)?#[\d]+ from.*'
pattern: '^[\s\t]*[^:\s\t]+[\s\t]*:'
error: 'Failed to find subsystem in subject'
post_error: ${{ env.error_msg }}
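
Assuming the second `pattern:` line in each hunk above is the new value, the checks no longer exempt `Merge pull request ...` subjects. The sketch below approximates the two remaining checks with POSIX tools so they can be tried locally. The function names are hypothetical, and `grep -E` character classes stand in for the `\s`/`\t` escapes used by the action's regex engine, so edge cases may differ.

```shell
#!/bin/sh
# Sketch of the two subject checks; the function names are hypothetical.

# Succeeds when the first line of the message is at most 75 characters,
# approximating '^.{0,75}(\n.*)*$'.
subject_length_ok() {
    printf '%s\n' "$1" | head -n 1 | grep -Eq '^.{0,75}$'
}

# Succeeds when the first line starts with a "subsystem:" prefix,
# approximating '^[\s\t]*[^:\s\t]+[\s\t]*:'.
has_subsystem() {
    printf '%s\n' "$1" | head -n 1 | grep -Eq '^[[:space:]]*[^:[:space:]]+[[:space:]]*:'
}

# Try the checks on two subjects taken from the commit list.
for msg in \
    "release: Always use actions/checkout to ensure we're in a git repo" \
    "Merge pull request #8292 from fidencio/topic/release-ensure-gh-is-used-from-a-git-repo"
do
    if subject_length_ok "$msg" && has_subsystem "$msg"; then
        echo "accepted: $msg"
    else
        echo "rejected: $msg"
    fi
done
```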

View File

@@ -21,6 +21,6 @@ jobs:
with:
go-version: 1.19.3
- name: Checkout code
uses: actions/checkout@v2
uses: actions/checkout@v4
- name: Build utils
run: ./ci/darwin-test.sh

View File

@@ -1,124 +0,0 @@
on:
issue_comment:
types: [created, edited]
name: deploy-ccv0-demo
jobs:
check-comment-and-membership:
runs-on: ubuntu-latest
if: |
github.event.issue.pull_request
&& github.event_name == 'issue_comment'
&& github.event.action == 'created'
&& startsWith(github.event.comment.body, '/deploy-ccv0-demo')
steps:
- name: Check membership
uses: kata-containers/is-organization-member@1.0.1
id: is_organization_member
with:
organization: kata-containers
username: ${{ github.event.comment.user.login }}
token: ${{ secrets.GITHUB_TOKEN }}
- name: Fail if not member
run: |
result=${{ steps.is_organization_member.outputs.result }}
if [ $result == false ]; then
user=${{ github.event.comment.user.login }}
echo Either ${user} is not part of the kata-containers organization
echo or ${user} has its Organization Visibility set to Private at
echo https://github.com/orgs/kata-containers/people?query=${user}
echo
echo Ensure you change your Organization Visibility to Public and
echo trigger the test again.
exit 1
fi
build-asset:
runs-on: ubuntu-latest
needs: check-comment-and-membership
strategy:
matrix:
asset:
- cloud-hypervisor
- firecracker
- kernel
- qemu
- rootfs-image
- rootfs-initrd
- shim-v2
steps:
- uses: actions/checkout@v2
- name: Prepare confidential container rootfs
if: ${{ matrix.asset == 'rootfs-initrd' }}
run: |
pushd include_rootfs/etc
curl -LO https://raw.githubusercontent.com/confidential-containers/documentation/main/demos/ssh-demo/aa-offline_fs_kbc-keys.json
mkdir kata-containers
envsubst < docs/how-to/data/confidential-agent-config.toml.in > kata-containers/agent.toml
popd
env:
AA_KBC_PARAMS: offline_fs_kbc::null
- name: Build ${{ matrix.asset }}
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
env:
AA_KBC: offline_fs_kbc
INCLUDE_ROOTFS: include_rootfs
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v2
with:
name: kata-artifacts
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
if-no-files-found: error
create-kata-tarball:
runs-on: ubuntu-latest
needs: build-asset
steps:
- uses: actions/checkout@v2
- name: get-artifacts
uses: actions/download-artifact@v2
with:
name: kata-artifacts
path: kata-artifacts
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts
- name: store-artifacts
uses: actions/upload-artifact@v2
with:
name: kata-static-tarball
path: kata-static.tar.xz
kata-deploy:
needs: create-kata-tarball
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: get-kata-tarball
uses: actions/download-artifact@v2
with:
name: kata-static-tarball
- name: build-and-push-kata-deploy-ci
id: build-and-push-kata-deploy-ci
run: |
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
pushd $GITHUB_WORKSPACE
git checkout $tag
pkg_sha=$(git rev-parse HEAD)
popd
mv kata-static.tar.xz $GITHUB_WORKSPACE/tools/packaging/kata-deploy/kata-static.tar.xz
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t quay.io/confidential-containers/runtime-payload:$pkg_sha $GITHUB_WORKSPACE/tools/packaging/kata-deploy
docker login -u ${{ secrets.QUAY_DEPLOYER_USERNAME }} -p ${{ secrets.QUAY_DEPLOYER_PASSWORD }} quay.io
docker push quay.io/confidential-containers/runtime-payload:$pkg_sha
mkdir -p packaging/kata-deploy
ln -s $GITHUB_WORKSPACE/tools/packaging/kata-deploy/action packaging/kata-deploy/action
echo "::set-output name=PKG_SHA::${pkg_sha}"
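
The `tag=$(echo $GITHUB_REF | cut -d/ -f3-)` idiom used above drops the first two `/`-separated fields of the ref and keeps the rest. A small sketch of that behaviour follows. The function name `ref_to_tag` is hypothetical, and the example ref reuses the `3.3.0-alpha0` tag quoted in the commit log.

```shell
#!/bin/sh
# Sketch: derive a tag (or branch) name from a full Git ref.
# ref_to_tag is a hypothetical helper name.
ref_to_tag() {
    # Fields 1 and 2 are e.g. "refs" and "tags"; -f3- keeps everything
    # after them, including any further slashes.
    printf '%s\n' "$1" | cut -d/ -f3-
}

tag=$(ref_to_tag "refs/tags/3.3.0-alpha0")
tarball="kata-static-${tag}-amd64.tar.xz"
echo "${tarball}"
```

Because `-f3-` keeps all remaining fields, a ref such as `refs/heads/topic/foo` yields `topic/foo` rather than only `topic`.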

View File

@@ -22,7 +22,7 @@ jobs:
echo "GOPATH=${{ github.workspace }}" >> $GITHUB_ENV
echo "${{ github.workspace }}/bin" >> $GITHUB_PATH
- name: Checkout code
uses: actions/checkout@v2
uses: actions/checkout@v4
with:
fetch-depth: 0
path: ./src/github.com/${{ github.repository }}

View File

@@ -15,7 +15,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v3
uses: actions/checkout@v4
- name: Ensure the split out runtime classes match the all-in-one file
run: |
pushd tools/packaging/kata-deploy/runtimeclasses/

View File

@@ -38,7 +38,17 @@ jobs:
- name: Checkout code to allow hub to communicate with the project
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/checkout@v2
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ github.event.pull_request.base.ref }}
- name: Move issue to "In progress"
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

View File

@@ -4,6 +4,7 @@ on:
branches:
- main
- stable-*
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
@@ -15,6 +16,7 @@ jobs:
with:
commit-hash: ${{ github.sha }}
push-to-registry: yes
target-branch: ${{ github.ref_name }}
secrets: inherit
build-assets-arm64:
@@ -22,6 +24,7 @@ jobs:
with:
commit-hash: ${{ github.sha }}
push-to-registry: yes
target-branch: ${{ github.ref_name }}
secrets: inherit
build-assets-s390x:
@@ -29,6 +32,7 @@ jobs:
with:
commit-hash: ${{ github.sha }}
push-to-registry: yes
target-branch: ${{ github.ref_name }}
secrets: inherit
publish-kata-deploy-payload-amd64:
@@ -39,6 +43,7 @@ jobs:
registry: quay.io
repo: kata-containers/kata-deploy-ci
tag: kata-containers-amd64
target-branch: ${{ github.ref_name }}
secrets: inherit
publish-kata-deploy-payload-arm64:
@@ -49,6 +54,7 @@ jobs:
registry: quay.io
repo: kata-containers/kata-deploy-ci
tag: kata-containers-arm64
target-branch: ${{ github.ref_name }}
secrets: inherit
publish-kata-deploy-payload-s390x:
@@ -59,6 +65,7 @@ jobs:
registry: quay.io
repo: kata-containers/kata-deploy-ci
tag: kata-containers-s390x
target-branch: ${{ github.ref_name }}
secrets: inherit
publish-manifest:
@@ -66,7 +73,7 @@ jobs:
needs: [publish-kata-deploy-payload-amd64, publish-kata-deploy-payload-arm64, publish-kata-deploy-payload-s390x]
steps:
- name: Checkout repository
uses: actions/checkout@v3
uses: actions/checkout@v4
- name: Login to Kata Containers quay.io
uses: docker/login-action@v2

View File

@@ -17,14 +17,25 @@ on:
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
kata-payload:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-kata-tarball
uses: actions/download-artifact@v3

View File

@@ -17,18 +17,29 @@ on:
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
kata-payload:
runs-on: arm64
runs-on: arm64-builder
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-kata-tarball
uses: actions/download-artifact@v3

View File

@@ -17,6 +17,10 @@ on:
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
kata-payload:
@@ -26,9 +30,16 @@ jobs:
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-kata-tarball
uses: actions/download-artifact@v3

View File

@@ -29,7 +29,7 @@ jobs:
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:

View File

@@ -14,7 +14,7 @@ jobs:
kata-deploy:
needs: build-kata-static-tarball-arm64
runs-on: arm64
runs-on: arm64-builder
steps:
- name: Login to Kata Containers docker.io
uses: docker/login-action@v2
@@ -29,7 +29,7 @@ jobs:
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:

View File

@@ -29,7 +29,7 @@ jobs:
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:

View File

@@ -32,7 +32,7 @@ jobs:
needs: [build-and-push-assets-amd64, build-and-push-assets-arm64, build-and-push-assets-s390x]
steps:
- name: Checkout repository
uses: actions/checkout@v3
uses: actions/checkout@v4
- name: Login to Kata Containers docker.io
uses: docker/login-action@v2
@@ -73,11 +73,7 @@ jobs:
needs: publish-multi-arch-images
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: install hub
run: |
wget -q -O- https://github.com/mislav/hub/releases/download/v2.14.2/hub-linux-amd64-2.14.2.tgz | \
tar xz --strip-components=2 --wildcards '*/bin/hub' && sudo mv hub /usr/local/bin/hub
- uses: actions/checkout@v4
- name: download-artifacts-amd64
uses: actions/download-artifact@v3
@@ -90,7 +86,7 @@ jobs:
mv kata-static.tar.xz "$GITHUB_WORKSPACE/${tarball}"
pushd $GITHUB_WORKSPACE
echo "uploading asset '${tarball}' for tag: ${tag}"
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} gh release upload "${tag}" "${tarball}"
popd
- name: download-artifacts-arm64
@@ -104,7 +100,7 @@ jobs:
mv kata-static.tar.xz "$GITHUB_WORKSPACE/${tarball}"
pushd $GITHUB_WORKSPACE
echo "uploading asset '${tarball}' for tag: ${tag}"
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} gh release upload "${tag}" "${tarball}"
popd
- name: download-artifacts-s390x
@@ -118,13 +114,13 @@ jobs:
mv kata-static.tar.xz "$GITHUB_WORKSPACE/${tarball}"
pushd $GITHUB_WORKSPACE
echo "uploading asset '${tarball}' for tag: ${tag}"
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} gh release upload "${tag}" "${tarball}"
popd
upload-versions-yaml:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: upload versions.yaml
env:
GITHUB_TOKEN: ${{ secrets.GIT_UPLOAD_TOKEN }}
@@ -133,28 +129,28 @@ jobs:
pushd $GITHUB_WORKSPACE
versions_file="kata-containers-$tag-versions.yaml"
cp versions.yaml ${versions_file}
hub release edit -m "" -a "${versions_file}" "${tag}"
gh release upload "${tag}" "${versions_file}"
popd
upload-cargo-vendored-tarball:
needs: upload-multi-arch-static-tarball
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: generate-and-upload-tarball
run: |
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
tarball="kata-containers-$tag-vendor.tar.gz"
pushd $GITHUB_WORKSPACE
bash -c "tools/packaging/release/generate_vendor.sh ${tarball}"
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} gh release upload "${tag}" "${tarball}"
popd
upload-libseccomp-tarball:
needs: upload-cargo-vendored-tarball
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: download-and-upload-tarball
env:
GITHUB_TOKEN: ${{ secrets.GIT_UPLOAD_TOKEN }}
@@ -174,6 +170,6 @@ jobs:
# "-m" option should be empty to re-use the existing release title
# without opening a text editor.
# For the details, check https://hub.github.com/hub-release.1.html.
hub release edit -m "" -a "${tarball}" "${tag}"
hub release edit -m "" -a "${asc}" "${tag}"
gh release upload "${tag}" "${tarball}"
gh release upload "${tag}" "${asc}"
popd
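
The upload steps above all derive the release tag from `GITHUB_REF` with `cut -d/ -f3-` and then pass the tag and asset name to `gh release upload`. A minimal sketch of that derivation, assuming a ref of the form `refs/tags/<tag>`; the function names `tag_from_ref` and `static_tarball_name` are hypothetical helpers for illustration (the workflow inlines this logic), and the final line only prints the command rather than running it:

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the workflow itself inlines these expressions.

# Drop the first two "/"-separated fields ("refs" and "tags"),
# keeping everything after them, including any further slashes.
tag_from_ref() {
    echo "$1" | cut -d/ -f3-
}

# Build the per-architecture static tarball name used by the upload steps.
static_tarball_name() {
    local tag="$1" arch="$2"
    echo "kata-static-${tag}-${arch}.tar.xz"
}

tag="$(tag_from_ref "refs/tags/3.3.0-alpha0")"
tarball="$(static_tarball_name "${tag}" "amd64")"

# Print the command instead of executing it, so the sketch has no side effects.
echo "would run: gh release upload ${tag} ${tarball}"
```

Using `-f3-` rather than `-f3` matters only when a tag itself contains a slash: the open-ended field range keeps the remainder of the ref intact.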

@@ -36,7 +36,17 @@ jobs:
- name: Checkout code to allow hub to communicate with the project
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/checkout@v2
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ github.event.pull_request.base.ref }}
- name: Install porting checker script
run: |

@@ -1,42 +0,0 @@
name: CI | Run cri-containerd tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
jobs:
run-cri-containerd:
strategy:
fail-fast: true
matrix:
containerd_version: ['lts', 'active']
vmm: ['clh', 'qemu']
runs-on: garm-ubuntu-2204
env:
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v3
with:
ref: ${{ inputs.commit-hash }}
- name: Install dependencies
run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/cri-containerd/gha-run.sh install-kata kata-artifacts
- name: Run cri-containerd tests
run: bash tests/integration/cri-containerd/gha-run.sh run

@@ -0,0 +1,56 @@
name: CI | Run docker integration tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-docker-tests:
strategy:
# We can set this to true whenever we're 100% sure that
# all the tests are not flaky, otherwise we'll fail them
# all due to a single flaky instance.
fail-fast: false
matrix:
vmm:
- clh
- qemu
runs-on: garm-ubuntu-2304-smaller
env:
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/docker/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/docker/gha-run.sh install-kata kata-artifacts
- name: Run docker smoke test
timeout-minutes: 5
run: bash tests/integration/docker/gha-run.sh run

@@ -17,6 +17,10 @@ on:
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-k8s-tests:
@@ -29,6 +33,9 @@ jobs:
- clh
- dragonball
- qemu
instance-type:
- small
- normal
include:
- host_os: cbl-mariner
vmm: clh
@@ -40,11 +47,20 @@ jobs:
GH_PR_NUMBER: ${{ inputs.pr-number }}
KATA_HOST_OS: ${{ matrix.host_os }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: "vanilla"
USING_NFD: "false"
K8S_TEST_HOST_TYPE: ${{ matrix.instance-type }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Download Azure CLI
run: bash tests/integration/kubernetes/gha-run.sh install-azure-cli

@@ -0,0 +1,88 @@
name: CI | Run kubernetes tests on GARM
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-k8s-tests:
strategy:
fail-fast: false
matrix:
vmm:
- clh #cloud-hypervisor
- fc #firecracker
- qemu
snapshotter:
- devmapper
k8s:
- k3s
instance:
- garm-ubuntu-2004
- garm-ubuntu-2004-smaller
include:
- instance: garm-ubuntu-2004
instance-type: normal
- instance: garm-ubuntu-2004-smaller
instance-type: small
runs-on: ${{ matrix.instance }}
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: ${{ matrix.k8s }}
SNAPSHOTTER: ${{ matrix.snapshotter }}
USING_NFD: "false"
K8S_TEST_HOST_TYPE: ${{ matrix.instance-type }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Deploy ${{ matrix.k8s }}
run: bash tests/integration/kubernetes/gha-run.sh deploy-k8s
- name: Configure the ${{ matrix.snapshotter }} snapshotter
run: bash tests/integration/kubernetes/gha-run.sh configure-snapshotter
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-garm
- name: Install `bats`
run: bash tests/integration/kubernetes/gha-run.sh install-bats
- name: Run tests
timeout-minutes: 30
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete kata-deploy
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-garm

@@ -1,52 +0,0 @@
name: CI | Run kubernetes tests on SEV
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
jobs:
run-k8s-tests:
strategy:
fail-fast: false
matrix:
vmm:
- qemu-sev
runs-on: sev
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBECONFIG: /home/kata/.kube/config
USING_NFD: "false"
steps:
- uses: actions/checkout@v3
with:
ref: ${{ inputs.commit-hash }}
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-sev
- name: Run tests
timeout-minutes: 30
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete kata-deploy
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-sev

@@ -1,52 +0,0 @@
name: CI | Run kubernetes tests on SEV-SNP
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
jobs:
run-k8s-tests:
strategy:
fail-fast: false
matrix:
vmm:
- qemu-snp
runs-on: sev-snp
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBECONFIG: /home/kata/.kube/config
USING_NFD: "false"
steps:
- uses: actions/checkout@v3
with:
ref: ${{ inputs.commit-hash }}
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-snp
- name: Run tests
timeout-minutes: 30
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete kata-deploy
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-snp

@@ -1,51 +0,0 @@
name: CI | Run kubernetes tests on TDX
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
jobs:
run-k8s-tests:
strategy:
fail-fast: false
matrix:
vmm:
- qemu-tdx
runs-on: tdx
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
USING_NFD: "true"
steps:
- uses: actions/checkout@v3
with:
ref: ${{ inputs.commit-hash }}
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-tdx
- name: Run tests
timeout-minutes: 30
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete kata-deploy
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-tdx

@@ -0,0 +1,86 @@
name: CI | Run kubernetes tests, using CRI-O, on GARM
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-k8s-tests:
strategy:
fail-fast: false
matrix:
vmm:
- qemu
k8s:
- k0s
instance:
- garm-ubuntu-2004
- garm-ubuntu-2004-smaller
include:
- instance: garm-ubuntu-2004
instance-type: normal
- instance: garm-ubuntu-2004-smaller
instance-type: small
- k8s: k0s
k8s-extra-params: '--cri-socket remote:unix:///var/run/crio/crio.sock --kubelet-extra-args --cgroup-driver="systemd"'
runs-on: ${{ matrix.instance }}
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: ${{ matrix.k8s }}
KUBERNETES_EXTRA_PARAMS: ${{ matrix.k8s-extra-params }}
USING_NFD: "false"
K8S_TEST_HOST_TYPE: ${{ matrix.instance-type }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Configure CRI-O
run: bash tests/integration/kubernetes/gha-run.sh setup-crio
- name: Deploy ${{ matrix.k8s }}
run: bash tests/integration/kubernetes/gha-run.sh deploy-k8s
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-garm
- name: Install `bats`
run: bash tests/integration/kubernetes/gha-run.sh install-bats
- name: Run tests
timeout-minutes: 30
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete kata-deploy
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-garm

@@ -0,0 +1,176 @@
name: CI | Run kata coco tests
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-kata-deploy-tests-on-tdx:
strategy:
fail-fast: false
matrix:
vmm:
- qemu-tdx
runs-on: tdx
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: "k3s"
USING_NFD: "true"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Run tests
run: bash tests/functional/kata-deploy/gha-run.sh run-tests
run-k8s-tests-on-tdx:
strategy:
fail-fast: false
matrix:
vmm:
- qemu-tdx
runs-on: tdx
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: "k3s"
USING_NFD: "true"
K8S_TEST_HOST_TYPE: "baremetal"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-tdx
- name: Run tests
timeout-minutes: 30
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete kata-deploy
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-tdx
run-k8s-tests-on-sev:
strategy:
fail-fast: false
matrix:
vmm:
- qemu-sev
runs-on: sev
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBECONFIG: /home/kata/.kube/config
KUBERNETES: "vanilla"
USING_NFD: "false"
K8S_TEST_HOST_TYPE: "baremetal"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-sev
- name: Run tests
timeout-minutes: 30
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete kata-deploy
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-sev
run-k8s-tests-sev-snp:
strategy:
fail-fast: false
matrix:
vmm:
- qemu-snp
runs-on: sev-snp
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBECONFIG: /home/kata/.kube/config
KUBERNETES: "vanilla"
USING_NFD: "false"
K8S_TEST_HOST_TYPE: "baremetal"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-snp
- name: Run tests
timeout-minutes: 30
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete kata-deploy
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-snp

@@ -0,0 +1,89 @@
name: CI | Run kata-deploy tests on AKS
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-kata-deploy-tests:
strategy:
fail-fast: false
matrix:
host_os:
- ubuntu
vmm:
- clh
- dragonball
- qemu
include:
- host_os: cbl-mariner
vmm: clh
runs-on: ubuntu-latest
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
GH_PR_NUMBER: ${{ inputs.pr-number }}
KATA_HOST_OS: ${{ matrix.host_os }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: "vanilla"
USING_NFD: "false"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Download Azure CLI
run: bash tests/functional/kata-deploy/gha-run.sh install-azure-cli
- name: Log into the Azure account
run: bash tests/functional/kata-deploy/gha-run.sh login-azure
env:
AZ_APPID: ${{ secrets.AZ_APPID }}
AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
- name: Create AKS cluster
timeout-minutes: 10
run: bash tests/functional/kata-deploy/gha-run.sh create-cluster
- name: Install `bats`
run: bash tests/functional/kata-deploy/gha-run.sh install-bats
- name: Install `kubectl`
run: bash tests/functional/kata-deploy/gha-run.sh install-kubectl
- name: Download credentials for the Kubernetes CLI to use them
run: bash tests/functional/kata-deploy/gha-run.sh get-cluster-credentials
- name: Run tests
run: bash tests/functional/kata-deploy/gha-run.sh run-tests
- name: Delete AKS cluster
if: always()
run: bash tests/functional/kata-deploy/gha-run.sh delete-cluster

@@ -0,0 +1,65 @@
name: CI | Run kata-deploy tests on GARM
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-kata-deploy-tests:
strategy:
fail-fast: false
matrix:
vmm:
- clh
- qemu
k8s:
- k0s
- k3s
- rke2
runs-on: garm-ubuntu-2004-smaller
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: ${{ matrix.k8s }}
USING_NFD: "false"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Deploy ${{ matrix.k8s }}
run: bash tests/functional/kata-deploy/gha-run.sh deploy-k8s
- name: Install `bats`
run: bash tests/functional/kata-deploy/gha-run.sh install-bats
- name: Run tests
run: bash tests/functional/kata-deploy/gha-run.sh run-tests

@@ -0,0 +1,59 @@
name: CI | Run kata-monitor tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-monitor:
strategy:
fail-fast: false
matrix:
vmm:
- qemu
container_engine:
- crio
- containerd
include:
- container_engine: containerd
containerd_version: lts
runs-on: garm-ubuntu-2204-smaller
env:
CONTAINER_ENGINE: ${{ matrix.container_engine }}
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/functional/kata-monitor/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/functional/kata-monitor/gha-run.sh install-kata kata-artifacts
- name: Run kata-monitor tests
run: bash tests/functional/kata-monitor/gha-run.sh run

@@ -8,6 +8,10 @@ on:
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
setup-kata:
@@ -16,9 +20,16 @@ jobs:
env:
GOPATH: ${{ github.workspace }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-kata-tarball
uses: actions/download-artifact@v3
@@ -32,7 +43,10 @@ jobs:
run-metrics:
needs: setup-kata
strategy:
fail-fast: true
# We can set this to true whenever we're 100% sure that
# all the tests are not flaky, otherwise we'll fail
# all the tests due to a single flaky instance.
fail-fast: false
matrix:
vmm: ['clh', 'qemu']
max-parallel: 1
@@ -65,6 +79,9 @@ jobs:
- name: run iperf test
run: bash tests/metrics/gha-run.sh run-test-iperf
- name: run latency test
run: bash tests/metrics/gha-run.sh run-test-latency
- name: make metrics tarball ${{ matrix.vmm }}
run: bash tests/metrics/gha-run.sh make-tarball-results

@@ -0,0 +1,57 @@
name: CI | Run nerdctl integration tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-nerdctl-tests:
strategy:
# We can set this to true whenever we're 100% sure that
# all the tests are not flaky, otherwise we'll fail them
# all due to a single flaky instance.
fail-fast: false
matrix:
vmm:
- clh
- dragonball
- qemu
runs-on: garm-ubuntu-2304-smaller
env:
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/nerdctl/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/nerdctl/gha-run.sh install-kata kata-artifacts
- name: Run nerdctl smoke test
timeout-minutes: 5
run: bash tests/integration/nerdctl/gha-run.sh run

@@ -1,42 +0,0 @@
name: CI | Run nydus tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
jobs:
run-nydus:
strategy:
fail-fast: true
matrix:
containerd_version: ['lts', 'active']
vmm: ['clh', 'qemu', 'dragonball']
runs-on: garm-ubuntu-2204
env:
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v3
with:
ref: ${{ inputs.commit-hash }}
- name: Install dependencies
run: bash tests/integration/nydus/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/nydus/gha-run.sh install-kata kata-artifacts
- name: Run nydus tests
run: bash tests/integration/nydus/gha-run.sh run

.github/workflows/run-runk-tests.yaml

@@ -0,0 +1,46 @@
name: CI | Run runk tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-runk:
runs-on: garm-ubuntu-2204-smaller
env:
CONTAINERD_VERSION: lts
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/runk/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/runk/gha-run.sh install-kata kata-artifacts
- name: Run tracing tests
run: bash tests/integration/runk/gha-run.sh run

@@ -1,37 +0,0 @@
name: CI | Run vfio tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
jobs:
run-vfio:
strategy:
fail-fast: false
matrix:
vmm: ['clh', 'qemu']
runs-on: garm-ubuntu-2204
env:
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v3
with:
ref: ${{ inputs.commit-hash }}
- name: Install dependencies
run: bash tests/functional/vfio/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v3
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Run vfio tests
run: bash tests/functional/vfio/gha-run.sh run

@@ -1,37 +0,0 @@
on:
pull_request:
types:
- opened
- edited
- reopened
- synchronize
paths-ignore: [ '**.md', '**.png', '**.jpg', '**.jpeg', '**.svg', '/docs/**' ]
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
name: Static checks dragonball
jobs:
test-dragonball:
runs-on: dragonball
env:
RUST_BACKTRACE: "1"
steps:
- uses: actions/checkout@v3
- name: Set env
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
echo "GOPATH=${{ github.workspace }}" >> $GITHUB_ENV
- name: Install Rust
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
./ci/install_rust.sh
echo PATH="$HOME/.cargo/bin:$PATH" >> $GITHUB_ENV
- name: Run Unit Test
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd src/dragonball
cargo version
rustc --version
sudo -E env PATH=$PATH LIBC=gnu SUPPORT_VIRTUALIZATION=true make test

@@ -12,84 +12,183 @@ concurrency:
name: Static checks
jobs:
static-checks:
runs-on: garm-ubuntu-2004
check-kernel-config-version:
runs-on: ubuntu-latest
steps:
- name: Checkout the code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Ensure the kernel config version has been updated
run: |
kernel_dir="tools/packaging/kernel/"
kernel_version_file="${kernel_dir}kata_config_version"
modified_files=$(git diff --name-only origin/$GITHUB_BASE_REF..HEAD)
if git diff --name-only origin/$GITHUB_BASE_REF..HEAD "${kernel_dir}" | grep "${kernel_dir}"; then
echo "Kernel directory has changed, checking if $kernel_version_file has been updated"
if echo "$modified_files" | grep -v "README.md" | grep "${kernel_dir}" >>"/dev/null"; then
echo "$modified_files" | grep "$kernel_version_file" >>/dev/null || ( echo "Please bump version in $kernel_version_file" && exit 1)
else
echo "Readme file changed, no need for kernel config version update."
fi
echo "Check passed"
fi
build-checks:
runs-on: ubuntu-20.04
strategy:
fail-fast: false
matrix:
cmd:
component:
- agent
- dragonball
- runtime
- runtime-rs
- agent-ctl
- kata-ctl
- log-parser-rs
- runk
- trace-forwarder
command:
- "make vendor"
- "make static-checks"
- "make check"
- "make test"
- "sudo -E PATH=\"$PATH\" make test"
include:
- component: agent
component-path: src/agent
- component: dragonball
component-path: src/dragonball
- component: runtime
component-path: src/runtime
- component: runtime-rs
component-path: src/runtime-rs
- component: agent-ctl
component-path: src/tools/agent-ctl
- component: kata-ctl
component-path: src/tools/kata-ctl
- component: log-parser-rs
component-path: src/tools/log-parser-rs
- component: runk
component-path: src/tools/runk
- component: trace-forwarder
component-path: src/tools/trace-forwarder
- install-libseccomp: no
- component: agent
install-libseccomp: yes
- component: runk
install-libseccomp: yes
steps:
- name: Checkout the code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Install yq
run: |
./ci/install_yq.sh
env:
INSTALL_IN_GOPATH: false
- name: Install golang
if: ${{ matrix.component == 'runtime' }}
run: |
./tests/install_go.sh -f -p
echo "/usr/local/go/bin" >> $GITHUB_PATH
- name: Install rust
if: ${{ matrix.component != 'runtime' }}
run: |
./tests/install_rust.sh
echo "${HOME}/.cargo/bin" >> $GITHUB_PATH
- name: Install musl-tools
if: ${{ matrix.component != 'runtime' }}
run: sudo apt-get -y install musl-tools
- name: Install libseccomp
if: ${{ matrix.command != 'make vendor' && matrix.command != 'make check' && matrix.install-libseccomp == 'yes' }}
run: |
libseccomp_install_dir=$(mktemp -d -t libseccomp.XXXXXXXXXX)
gperf_install_dir=$(mktemp -d -t gperf.XXXXXXXXXX)
./ci/install_libseccomp.sh "${libseccomp_install_dir}" "${gperf_install_dir}"
echo "Set environment variables for the libseccomp crate to link the libseccomp library statically"
echo "LIBSECCOMP_LINK_TYPE=static" >> $GITHUB_ENV
echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> $GITHUB_ENV
- name: Setup XDG_RUNTIME_DIR for the `runtime` tests
if: ${{ matrix.command != 'make vendor' && matrix.command != 'make check' && matrix.component == 'runtime' }}
run: |
XDG_RUNTIME_DIR=$(mktemp -d /tmp/kata-tests-$USER.XXX | tee >(xargs chmod 0700))
echo "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR}" >> $GITHUB_ENV
- name: Running `${{ matrix.command }}` for ${{ matrix.component }}
run: |
cd ${{ matrix.component-path }}
${{ matrix.command }}
env:
RUST_BACKTRACE: "1"
build-checks-depending-on-kvm:
runs-on: garm-ubuntu-2004-smaller
strategy:
fail-fast: false
matrix:
component:
- runtime-rs
include:
- component: runtime-rs
command: "sudo -E env PATH=$PATH LIBC=gnu SUPPORT_VIRTUALIZATION=true make test"
- component: runtime-rs
component-path: src/dragonball
steps:
- name: Checkout the code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Install system deps
run: |
sudo apt-get install -y build-essential musl-tools
- name: Install yq
run: |
sudo -E ./ci/install_yq.sh
env:
INSTALL_IN_GOPATH: false
- name: Install rust
run: |
export PATH="$PATH:/usr/local/bin"
./tests/install_rust.sh
- name: Running `${{ matrix.command }}` for ${{ matrix.component }}
run: |
export PATH="$PATH:${HOME}/.cargo/bin"
cd ${{ matrix.component-path }}
${{ matrix.command }}
env:
RUST_BACKTRACE: "1"
static-checks:
runs-on: ubuntu-20.04
strategy:
fail-fast: false
matrix:
cmd:
- "make static-checks"
env:
RUST_BACKTRACE: "1"
target_branch: ${{ github.base_ref }}
GOPATH: ${{ github.workspace }}
steps:
- name: Free disk space
run: |
sudo rm -rf /usr/share/dotnet
sudo rm -rf "$AGENT_TOOLSDIRECTORY"
- name: Checkout code
uses: actions/checkout@v3
with:
fetch-depth: 0
path: ./src/github.com/${{ github.repository }}
- name: Install dependencies
run: |
sudo apt-get update
sudo apt-get install -y --no-install-recommends \
build-essential \
haveged \
libdevmapper-dev \
clang
- name: Install Go
uses: actions/setup-go@v3
with:
go-version: 1.19.3
- name: Check kernel config version
run: |
cd "${{ github.workspace }}/src/github.com/${{ github.repository }}"
kernel_dir="tools/packaging/kernel/"
kernel_version_file="${kernel_dir}kata_config_version"
modified_files=$(git diff --name-only origin/CCv0..HEAD)
if git diff --name-only origin/CCv0..HEAD "${kernel_dir}" | grep "${kernel_dir}"; then
echo "Kernel directory has changed, checking if $kernel_version_file has been updated"
if echo "$modified_files" | grep -v "README.md" | grep "${kernel_dir}" >>"/dev/null"; then
echo "$modified_files" | grep "$kernel_version_file" >>/dev/null || ( echo "Please bump version in $kernel_version_file" && exit 1)
else
echo "Readme file changed, no need for kernel config version update."
fi
echo "Check passed"
fi
- name: Set PATH
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
echo "${{ github.workspace }}/bin" >> $GITHUB_PATH
- name: Setup
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/setup.sh
- name: Installing rust
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/install_rust.sh
PATH=$PATH:"$HOME/.cargo/bin"
rustup target add x86_64-unknown-linux-musl
rustup component add rustfmt clippy
- name: Setup seccomp
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
libseccomp_install_dir=$(mktemp -d -t libseccomp.XXXXXXXXXX)
gperf_install_dir=$(mktemp -d -t gperf.XXXXXXXXXX)
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/install_libseccomp.sh "${libseccomp_install_dir}" "${gperf_install_dir}"
echo "Set environment variables for the libseccomp crate to link the libseccomp library statically"
echo "LIBSECCOMP_LINK_TYPE=static" >> $GITHUB_ENV
echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> $GITHUB_ENV
- name: Run check
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
export PATH=$PATH:"$HOME/.cargo/bin"
export XDG_RUNTIME_DIR=$(mktemp -d /tmp/kata-tests-$USER.XXX | tee >(xargs chmod 0700))
cd ${GOPATH}/src/github.com/${{ github.repository }} && ${{ matrix.cmd }}
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0
path: ./src/github.com/${{ github.repository }}
- name: Install yq
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }}
./ci/install_yq.sh
env:
INSTALL_IN_GOPATH: false
- name: Install golang
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }}
./tests/install_go.sh -f -p
echo "/usr/local/go/bin" >> $GITHUB_PATH
- name: Install system dependencies
run: |
sudo apt-get -y install moreutils hunspell pandoc
- name: Run check
run: |
export PATH=${PATH}:${GOPATH}/bin
cd ${GOPATH}/src/github.com/${{ github.repository }} && ${{ matrix.cmd }}
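The `Check kernel config version` step earlier in this workflow boils down to a rule over the list of modified files: any non-README change under the kernel directory requires `kata_config_version` to be touched as well. The following is a minimal sketch of that rule as a standalone function. The function name `check_kernel_config_version` and the sample file names are assumptions made for illustration and do not come from the workflow.

```shell
#!/usr/bin/env bash
# Sketch only: check_kernel_config_version is a hypothetical helper name.
kernel_dir="tools/packaging/kernel/"
kernel_version_file="${kernel_dir}kata_config_version"

# $1: newline-separated list of modified files (e.g. from `git diff --name-only`).
# Returns 0 when the check passes, 1 when a version bump is missing.
check_kernel_config_version() {
    local modified_files="$1"
    # Only non-README changes under the kernel directory need a bump.
    if echo "${modified_files}" | grep -v "README.md" | grep -q "${kernel_dir}"; then
        if ! echo "${modified_files}" | grep -q "${kernel_version_file}"; then
            echo "Please bump version in ${kernel_version_file}"
            return 1
        fi
    fi
    echo "Check passed"
}

# Hypothetical sample input: a kernel change without a version bump.
check_kernel_config_version "tools/packaging/kernel/example.conf" || true
```

Keeping the rule in a function that takes the file list as an argument makes it possible to exercise the logic without a git checkout.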


@@ -1 +1 @@
3.2.0-rc0
3.3.0-alpha0


@@ -7,12 +7,10 @@
set -o errexit
cidir=$(dirname "$0")
source "${cidir}/lib.sh"
script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
script_name="$(basename "${BASH_SOURCE[0]}")"
clone_tests_repo
source "${tests_repo_dir}/.ci/lib.sh"
source "${script_dir}/../tests/common.bash"
# The following variables if set on the environment will change the behavior
# of gperf and libseccomp configure scripts, that may lead this script to
@@ -25,11 +23,11 @@ workdir="$(mktemp -d --tmpdir build-libseccomp.XXXXX)"
# Variables for libseccomp
libseccomp_version="${LIBSECCOMP_VERSION:-""}"
if [ -z "${libseccomp_version}" ]; then
libseccomp_version=$(get_version "externals.libseccomp.version")
libseccomp_version=$(get_from_kata_deps "externals.libseccomp.version")
fi
libseccomp_url="${LIBSECCOMP_URL:-""}"
if [ -z "${libseccomp_url}" ]; then
libseccomp_url=$(get_version "externals.libseccomp.url")
libseccomp_url=$(get_from_kata_deps "externals.libseccomp.url")
fi
libseccomp_tarball="libseccomp-${libseccomp_version}.tar.gz"
libseccomp_tarball_url="${libseccomp_url}/releases/download/v${libseccomp_version}/${libseccomp_tarball}"
@@ -38,11 +36,11 @@ cflags="-O2"
# Variables for gperf
gperf_version="${GPERF_VERSION:-""}"
if [ -z "${gperf_version}" ]; then
gperf_version=$(get_version "externals.gperf.version")
gperf_version=$(get_from_kata_deps "externals.gperf.version")
fi
gperf_url="${GPERF_URL:-""}"
if [ -z "${gperf_url}" ]; then
gperf_url=$(get_version "externals.gperf.url")
gperf_url=$(get_from_kata_deps "externals.gperf.url")
fi
gperf_tarball="gperf-${gperf_version}.tar.gz"
gperf_tarball_url="${gperf_url}/${gperf_tarball}"
@@ -72,8 +70,7 @@ build_and_install_gperf() {
curl -sLO "${gperf_tarball_url}"
tar -xf "${gperf_tarball}"
pushd "gperf-${gperf_version}"
# gperf is a build time dependency of libseccomp and not to be used in the target.
# Unset $CC since that might point to a cross compiler.
# Unset $CC for configure, we will always use native for gperf
CC= ./configure --prefix="${gperf_install_dir}"
make
make install


@@ -64,86 +64,3 @@ run_get_pr_changed_file_details()
source "$tests_repo_dir/.ci/lib.sh"
get_pr_changed_file_details
}
# Check if the 1st argument version is greater than or equal to the 2nd one
# Version format: [0-9]+ separated by period (e.g. 2.4.6, 1.11.3 and etc.)
#
# Parameters:
# $1 - a version to be tested
# $2 - a target version
#
# Return:
# 0 if $1 is greater than or equal to $2
# 1 otherwise
version_greater_than_equal() {
local current_version=$1
local target_version=$2
smaller_version=$(echo -e "$current_version\n$target_version" | sort -V | head -1)
if [ "${smaller_version}" = "${target_version}" ]; then
return 0
else
return 1
fi
}
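The helper above relies on `sort -V` to order the two version strings and then checks which one comes first. As a minimal sketch of the same technique outside the diff, the snippet below applies it to a few sample versions. The name `version_ge` and the sample version numbers are assumptions for illustration, and `sort -V` is assumed to be available (it is provided by GNU coreutils).

```shell
#!/usr/bin/env bash
# Sketch only: version_ge is a hypothetical name, not part of the original script.
# Returns 0 when $1 >= $2, 1 otherwise.
version_ge() {
    local current="$1" target="$2" smallest
    # sort -V orders version strings; the first line is the smaller one.
    smallest="$(printf '%s\n%s\n' "${current}" "${target}" | sort -V | head -n 1)"
    [ "${smallest}" = "${target}" ]
}

# Sample versions chosen for illustration only.
for v in 2.16.0 2.17.0 2.25.1; do
    if version_ge "${v}" "2.17.0"; then
        echo "${v}: >= 2.17.0"
    else
        echo "${v}: < 2.17.0"
    fi
done
```

When both versions are equal, the smallest line equals the target, so the function returns 0, which matches the "greater than or equal" contract.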
# Build an IBM zSystem secure execution (SE) image
#
# Parameters:
# $1 - kernel_parameters
# $2 - a source directory where kernel and initrd are located
# $3 - a destination directory where a SE image is built
#
# Return:
# 0 if the image is successfully built
# 1 otherwise
build_secure_image() {
kernel_params="${1:-}"
install_src_dir="${2:-}"
install_dest_dir="${3:-}"
if [ ! -f "${install_src_dir}/vmlinuz.container" ] ||
[ ! -f "${install_src_dir}/kata-containers-initrd.img" ]; then
cat << EOF >&2
Either kernel or initrd does not exist or is mistakenly named
A file name for kernel must be vmlinuz.container (raw binary)
A file name for initrd must be kata-containers-initrd.img
EOF
return 1
fi
cmdline="${kernel_params} panic=1 scsi_mod.scan=none swiotlb=262144"
parmfile="$(mktemp --suffix=-cmdline)"
echo "${cmdline}" > "${parmfile}"
chmod 600 "${parmfile}"
[ -n "${HKD_PATH:-}" ] || (echo >&2 "No host key document specified." && return 1)
cert_list=($(ls -1 $HKD_PATH))
declare hkd_options
eval "for cert in ${cert_list[*]}; do
hkd_options+=\"--host-key-document=\\\"\$HKD_PATH/\$cert\\\" \"
done"
command -v genprotimg > /dev/null 2>&1 || { apt update; apt install -y s390-tools; }
extra_arguments=""
genprotimg_version=$(genprotimg --version | grep -Po '(?<=version )[^-]+')
if ! version_greater_than_equal "${genprotimg_version}" "2.17.0"; then
extra_arguments="--x-pcf '0xe0'"
fi
eval genprotimg \
"${extra_arguments}" \
"${hkd_options}" \
--output="${install_dest_dir}/kata-containers-secure.img" \
--image="${install_src_dir}/vmlinuz.container" \
--ramdisk="${install_src_dir}/kata-containers-initrd.img" \
--parmfile="${parmfile}" \
--no-verify # no verification for CI testing purposes
build_result=$?
rm -f "${parmfile}"
if [ $build_result -eq 0 ]; then
return 0
else
return 1
fi
}
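`build_secure_image` collects the host key documents with `ls` and assembles the `--host-key-document` options through `eval`. One alternative, shown here only as a sketch, is to build the options in a bash array, which keeps file names with spaces intact and avoids `eval`. The function name `build_hkd_options` and the demo file names are assumptions. Only the `--host-key-document` flag is taken from the script above, and `genprotimg` itself is not invoked.

```shell
#!/usr/bin/env bash
# Sketch only: build_hkd_options is a hypothetical name, not from the original script.
# Fills the global array hkd_options with one option per file found in $1.
build_hkd_options() {
    local hkd_dir="$1" cert
    hkd_options=()
    for cert in "${hkd_dir}"/*; do
        # Skip the unexpanded pattern when the directory is empty.
        [ -e "${cert}" ] || continue
        hkd_options+=("--host-key-document=${cert}")
    done
    if [ "${#hkd_options[@]}" -eq 0 ]; then
        echo >&2 "No host key document found in ${hkd_dir}"
        return 1
    fi
}

# Demo with throwaway files; the names are made up for illustration.
demo_dir="$(mktemp -d)"
touch "${demo_dir}/host key A.crt" "${demo_dir}/hostB.crt"
build_hkd_options "${demo_dir}"
printf '%s\n' "${hkd_options[@]}"
rm -rf "${demo_dir}"
```

The array could then be expanded as `"${hkd_options[@]}"` on the command line. Note also that the function uses a plain `return 1`: a `return` placed inside `( ... )` runs in a subshell and does not leave the enclosing function.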


@@ -43,7 +43,7 @@ and perform DMA transactions _anywhere_.
The second feature is ACS (Access Control Services), which controls which
devices are allowed to communicate with one another and thus avoids improper
routing of packets irrespectively of whether IOMMU is enabled or not.
routing of packets `irrespectively` of whether IOMMU is enabled or not.
When IOMMU is enabled, ACS is normally configured to force all PCI Express DMA
to go through the root complex so IOMMU can translate it, impacting performance
@@ -126,7 +126,7 @@ efficient P2P communication.
## PCI Express Virtual P2P Approval Capability
Most of the time, the PCI Express topology is flattened and obfuscated to ensure
easy migration of the VM image between different physical hardware topologies.
easy migration of the VM image between different physical hardware `topologies`.
In Kata, we can configure the hypervisor to use PCI Express root ports to
hotplug the VFIO  devices one is passing through. A user can select how many PCI
Express root ports to allocate depending on how many devices are passed through.
@@ -220,7 +220,7 @@ containers that he wants to run with Kata. The goal is to make such things as
transparent as possible, so we also introduced
[CDI](https://github.com/container-orchestrated-devices/container-device-interface)
(Container Device Interface) to Kata. CDI is a[
specification](https://github.com/container-orchestrated-devices/container-device-interface/blob/master/SPEC.md)
specification](https://github.com/container-orchestrated-devices/container-device-interface/blob/main/SPEC.md)
for container runtimes to support third-party devices.
As written before, we can provide a clique ID for the devices that belong
@@ -300,7 +300,7 @@ pcie_switch_port = 8
```
Each device that is passed through is attached to a PCI Express downstream port
as illustrated below. We can even replicate the hosts two DPUs topologies with
as illustrated below. We can even replicate the hosts two DPUs `topologies` with
added metadata through the CDI. Most of the time, a container only needs one
pair of GPU and NIC for GPUDirect RDMA. This is more of a showcase of what we
can do with the power of Kata and CDI. One could even think of adding groups of
@@ -328,7 +328,7 @@ $ lspci -tv
```
The configuration of using either the root port or switch port can be applied on
a per Container or Pod basis, meaning we can switch PCI Express topologies on
a per Container or Pod basis, meaning we can switch PCI Express `topologies` on
each run of an application.
## Hypervisor Resource Limits


@@ -45,8 +45,4 @@
- [How to run Kata Containers with `nydus`](how-to-use-virtio-fs-nydus-with-kata.md)
- [How to run Kata Containers with AMD SEV-SNP](how-to-run-kata-containers-with-SNP-VMs.md)
- [How to use EROFS to build rootfs in Kata Containers](how-to-use-erofs-build-rootfs.md)
- [How to run Kata Containers with kinds of Block Volumes](how-to-run-kata-containers-with-kinds-of-Block-Volumes.md)
## Confidential Containers
- [How to use build and test the Confidential Containers `CCv0` proof of concept](how-to-build-and-test-ccv0.md)
- [How to generate a Kata Containers payload for the Confidential Containers Operator](how-to-generate-a-kata-containers-payload-for-the-confidential-containers-operator.md)
- [How to run Kata Containers with kinds of Block Volumes](how-to-run-kata-containers-with-kinds-of-Block-Volumes.md)


@@ -1,635 +0,0 @@
#!/bin/bash -e
#
# Copyright (c) 2021, 2023 IBM Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
# Disclaimer: This script is work in progress for supporting the CCv0 prototype
# It shouldn't be considered supported by the Kata Containers community, or anyone else
# Based on https://github.com/kata-containers/kata-containers/blob/main/docs/Developer-Guide.md,
# but with elements of the tests/.ci scripts used
readonly script_name="$(basename "${BASH_SOURCE[0]}")"
# By default in Golang >= 1.16 GO111MODULE is set to "on", but not all modules support it, so overwrite to "auto"
export GO111MODULE="auto"
# Setup kata containers environments if not set - we default to use containerd
export CRI_CONTAINERD=${CRI_CONTAINERD:-"yes"}
export CRI_RUNTIME=${CRI_RUNTIME:-"containerd"}
export CRIO=${CRIO:-"no"}
export KATA_HYPERVISOR="${KATA_HYPERVISOR:-qemu}"
export KUBERNETES=${KUBERNETES:-"no"}
export AGENT_INIT="${AGENT_INIT:-${TEST_INITRD:-no}}"
export AA_KBC="${AA_KBC:-offline_fs_kbc}"
export KATA_BUILD_CC=${KATA_BUILD_CC:-"yes"}
export TEE_TYPE=${TEE_TYPE:-}
export PREFIX="${PREFIX:-/opt/confidential-containers}"
export RUNTIME_CONFIG_PATH="${RUNTIME_CONFIG_PATH:-${PREFIX}/share/defaults/kata-containers/configuration.toml}"
# Allow the user to overwrite the default repo and branch names if they want to build from a fork
export katacontainers_repo="${katacontainers_repo:-github.com/kata-containers/kata-containers}"
export katacontainers_branch="${katacontainers_branch:-CCv0}"
export kata_default_branch=${katacontainers_branch}
export tests_repo="${tests_repo:-github.com/kata-containers/tests}"
export tests_branch="${tests_branch:-CCv0}"
export target_branch=${tests_branch} # kata-containers/ci/lib.sh uses target branch var to check out tests repo
# if .bash_profile exists then use it, otherwise fall back to .profile
export PROFILE="${HOME}/.profile"
if [ -r "${HOME}/.bash_profile" ]; then
export PROFILE="${HOME}/.bash_profile"
fi
# Stop PS1: unbound variable error happening
export PS1=${PS1:-}
# Create a bunch of common, derived values up front so we don't need to create them in all the different functions
. ${PROFILE}
if [ -z ${GOPATH} ]; then
export GOPATH=${HOME}/go
fi
export tests_repo_dir="${GOPATH}/src/${tests_repo}"
export katacontainers_repo_dir="${GOPATH}/src/${katacontainers_repo}"
export ROOTFS_DIR="${katacontainers_repo_dir}/tools/osbuilder/rootfs-builder/rootfs"
export PULL_IMAGE="${PULL_IMAGE:-quay.io/kata-containers/confidential-containers:signed}" # Doesn't need authentication
export CONTAINER_ID="${CONTAINER_ID:-0123456789}"
source /etc/os-release || source /usr/lib/os-release
grep -Eq "\<fedora\>" /etc/os-release 2> /dev/null && export USE_PODMAN=true
# If we've already checked out the test repo then source the confidential scripts
if [ "${KUBERNETES}" == "yes" ]; then
export BATS_TEST_DIRNAME="${tests_repo_dir}/integration/kubernetes/confidential"
[ -d "${BATS_TEST_DIRNAME}" ] && source "${BATS_TEST_DIRNAME}/lib.sh"
else
export BATS_TEST_DIRNAME="${tests_repo_dir}/integration/containerd/confidential"
[ -d "${BATS_TEST_DIRNAME}" ] && source "${BATS_TEST_DIRNAME}/lib.sh"
fi
[ -d "${BATS_TEST_DIRNAME}" ] && source "${BATS_TEST_DIRNAME}/../../confidential/lib.sh"
usage() {
exit_code="$1"
cat <<EOF
Overview:
Build and test kata containers from source
Optionally set kata-containers and tests repo and branch as exported variables before running
e.g. export katacontainers_repo=github.com/stevenhorsman/kata-containers && export katacontainers_branch=kata-ci-from-fork && export tests_repo=github.com/stevenhorsman/tests && export tests_branch=kata-ci-from-fork && ~/${script_name} build_and_install_all
Usage:
${script_name} [options] <command>
Commands:
- agent_create_container: Run CreateContainer command against the agent with agent-ctl
- agent_pull_image: Run PullImage command against the agent with agent-ctl
- all: Build and install everything, test kata with containerd and capture the logs
- build_and_add_agent_to_rootfs: Builds the kata-agent and adds it to the rootfs
- build_and_install_all: Build and install everything
- build_and_install_rootfs: Builds and installs the rootfs image
- build_kata_runtime: Build and install the kata runtime
- build_cloud_hypervisor Checkout, patch, build and install Cloud Hypervisor
- build_qemu: Checkout, patch, build and install QEMU
- configure: Configure Kata to use rootfs and enable debug
- connect_to_ssh_demo_pod: Ssh into the ssh demo pod, showing that the decryption succeeded
- copy_signature_files_to_guest Copies signature verification files to guest
- create_rootfs: Create a local rootfs
- crictl_create_cc_container Use crictl to create a new busybox container in the kata cc pod
- crictl_create_cc_pod Use crictl to create a new kata cc pod
- crictl_delete_cc Use crictl to delete the kata cc pod sandbox and container in it
- help: Display this help
- init_kubernetes: initialize a Kubernetes cluster on this system
- initialize: Install dependencies and check out kata-containers source
- install_guest_kernel: Setup, build and install the guest kernel
- kubernetes_create_cc_pod: Create a Kata CC runtime busybox-based pod in Kubernetes
- kubernetes_create_ssh_demo_pod: Create a Kata CC runtime pod based on the ssh demo
- kubernetes_delete_cc_pod: Delete the Kata CC runtime busybox-based pod in Kubernetes
- kubernetes_delete_ssh_demo_pod: Delete the Kata CC runtime pod based on the ssh demo
- open_kata_shell: Open a shell into the kata runtime
- rebuild_and_install_kata: Rebuild the kata runtime and agent and build and install the image
- shim_pull_image: Run PullImage command against the shim with ctr
- test_capture_logs: Test using kata with containerd and capture the logs in the user's home directory
- test: Test using kata with containerd
Options:
-d: Enable debug
-h: Display this help
EOF
# if script sourced don't exit as this will exit the main shell, just return instead
[[ $_ != $0 ]] && return "$exit_code" || exit "$exit_code"
}
build_and_install_all() {
initialize
build_and_install_kata_runtime
configure
create_a_local_rootfs
build_and_install_rootfs
install_guest_kernel_image
case "$KATA_HYPERVISOR" in
"qemu")
build_qemu
;;
"cloud-hypervisor")
build_cloud_hypervisor
;;
*)
echo "Invalid option: $KATA_HYPERVISOR is not supported." >&2
;;
esac
check_kata_runtime
if [ "${KUBERNETES}" == "yes" ]; then
init_kubernetes
fi
}
rebuild_and_install_kata() {
checkout_tests_repo
checkout_kata_containers_repo
build_and_install_kata_runtime
build_and_add_agent_to_rootfs
build_and_install_rootfs
check_kata_runtime
}
# Based on the jenkins_job_build.sh script in kata-containers/tests/.ci - checks out source code and installs dependencies
initialize() {
# We need git to checkout and bootstrap the ci scripts and some other packages used in testing
sudo apt-get update && sudo apt-get install -y curl git qemu-utils
grep -qxF "export GOPATH=\${HOME}/go" "${PROFILE}" || echo "export GOPATH=\${HOME}/go" >> "${PROFILE}"
grep -qxF "export GOROOT=/usr/local/go" "${PROFILE}" || echo "export GOROOT=/usr/local/go" >> "${PROFILE}"
grep -qxF "export PATH=\${GOPATH}/bin:/usr/local/go/bin:\${PATH}" "${PROFILE}" || echo "export PATH=\${GOPATH}/bin:/usr/local/go/bin:\${PATH}" >> "${PROFILE}"
# Load the new go and PATH parameters from the profile
. ${PROFILE}
mkdir -p "${GOPATH}"
checkout_tests_repo
pushd "${tests_repo_dir}"
local ci_dir_name=".ci"
sudo -E PATH=$PATH -s "${ci_dir_name}/install_go.sh" -p -f
sudo -E PATH=$PATH -s "${ci_dir_name}/install_rust.sh"
# Need to change ownership of rustup so later process can create temp files there
sudo chown -R ${USER}:${USER} "${HOME}/.rustup"
checkout_kata_containers_repo
# Run setup, but don't install kata as we will build it ourselves in locations matching the developer guide
export INSTALL_KATA="no"
sudo -E PATH=$PATH -s ${ci_dir_name}/setup.sh
# Reload the profile to pick up installed dependencies
. ${PROFILE}
popd
}
checkout_tests_repo() {
echo "Creating repo: ${tests_repo} and branch ${tests_branch} into ${tests_repo_dir}..."
# Due to git https://github.blog/2022-04-12-git-security-vulnerability-announced/ the tests repo needs
# to be owned by root as it is re-checked out in rootfs.sh
mkdir -p $(dirname "${tests_repo_dir}")
[ -d "${tests_repo_dir}" ] || sudo -E git clone "https://${tests_repo}.git" "${tests_repo_dir}"
sudo -E chown -R root:root "${tests_repo_dir}"
pushd "${tests_repo_dir}"
sudo -E git fetch
if [ -n "${tests_branch}" ]; then
sudo -E git checkout ${tests_branch}
fi
sudo -E git reset --hard origin/${tests_branch}
popd
source "${BATS_TEST_DIRNAME}/lib.sh"
source "${BATS_TEST_DIRNAME}/../../confidential/lib.sh"
}
# Note: clone_katacontainers_repo uses go, so that needs to be installed first
checkout_kata_containers_repo() {
source "${tests_repo_dir}/.ci/lib.sh"
echo "Creating repo: ${katacontainers_repo} and branch ${kata_default_branch} into ${katacontainers_repo_dir}..."
clone_katacontainers_repo
sudo -E chown -R ${USER}:${USER} "${katacontainers_repo_dir}"
}
build_and_install_kata_runtime() {
export DEFAULT_HYPERVISOR=${KATA_HYPERVISOR}
${tests_repo_dir}/.ci/install_runtime.sh
}
configure() {
# configure kata to use rootfs, not initrd
sudo sed -i 's/^\(initrd =.*\)/# \1/g' ${RUNTIME_CONFIG_PATH}
enable_full_debug
enable_agent_console
# Switch image offload to true in kata config
switch_image_service_offload "on"
configure_cc_containerd
# From crictl v1.24.1 the default timeout leads to the pod creation failing, so update it
sudo crictl config --set timeout=10
# Verity checks aren't working locally, as we aren't re-genning the hash maybe? so remove it from the kernel parameters
remove_kernel_param "cc_rootfs_verity.scheme"
}
build_and_add_agent_to_rootfs() {
build_a_custom_kata_agent
add_custom_agent_to_rootfs
}
build_a_custom_kata_agent() {
# Install libseccomp for static linking
sudo -E PATH=$PATH GOPATH=$GOPATH ${katacontainers_repo_dir}/ci/install_libseccomp.sh /tmp/kata-libseccomp /tmp/kata-gperf
export LIBSECCOMP_LINK_TYPE=static
export LIBSECCOMP_LIB_PATH=/tmp/kata-libseccomp/lib
. "$HOME/.cargo/env"
pushd ${katacontainers_repo_dir}/src/agent
sudo -E PATH=$PATH make
ARCH=$(uname -m)
[ ${ARCH} == "ppc64le" ] || [ ${ARCH} == "s390x" ] && export LIBC=gnu || export LIBC=musl
[ ${ARCH} == "ppc64le" ] && export ARCH=powerpc64le
# Run a make install into the rootfs directory in order to create the kata-agent.service file which is required when we add to the rootfs
sudo -E PATH=$PATH make install DESTDIR="${ROOTFS_DIR}"
popd
}
create_a_local_rootfs() {
sudo rm -rf "${ROOTFS_DIR}"
pushd ${katacontainers_repo_dir}/tools/osbuilder/rootfs-builder
export distro="ubuntu"
[[ -z "${USE_PODMAN:-}" ]] && use_docker="${use_docker:-1}"
sudo -E OS_VERSION="${OS_VERSION:-}" GOPATH=$GOPATH EXTRA_PKGS="vim iputils-ping net-tools" DEBUG="${DEBUG:-}" USE_DOCKER="${use_docker:-}" SKOPEO=${SKOPEO:-} AA_KBC=${AA_KBC:-} UMOCI=yes SECCOMP=yes ./rootfs.sh -r ${ROOTFS_DIR} ${distro}
# Install_rust.sh during rootfs.sh switches us to the main branch of the tests repo, so switch back now
pushd "${tests_repo_dir}"
sudo -E git checkout ${tests_branch}
popd
# During the ./rootfs.sh call the kata agent is built as root, so we need to update the permissions, so we can rebuild it
sudo chown -R ${USER}:${USER} "${katacontainers_repo_dir}/src/agent/"
popd
}
add_custom_agent_to_rootfs() {
pushd ${katacontainers_repo_dir}/tools/osbuilder/rootfs-builder
ARCH=$(uname -m)
[ ${ARCH} == "ppc64le" ] || [ ${ARCH} == "s390x" ] && export LIBC=gnu || export LIBC=musl
[ ${ARCH} == "ppc64le" ] && export ARCH=powerpc64le
sudo install -o root -g root -m 0550 -t ${ROOTFS_DIR}/usr/bin ${katacontainers_repo_dir}/src/agent/target/${ARCH}-unknown-linux-${LIBC}/release/kata-agent
sudo install -o root -g root -m 0440 ../../../src/agent/kata-agent.service ${ROOTFS_DIR}/usr/lib/systemd/system/
sudo install -o root -g root -m 0440 ../../../src/agent/kata-containers.target ${ROOTFS_DIR}/usr/lib/systemd/system/
popd
}
build_and_install_rootfs() {
build_rootfs_image
install_rootfs_image
}
build_rootfs_image() {
pushd ${katacontainers_repo_dir}/tools/osbuilder/image-builder
# Logic from install_kata_image.sh - if we aren't using podman (ie on a fedora like), then use docker
[[ -z "${USE_PODMAN:-}" ]] && use_docker="${use_docker:-1}"
sudo -E USE_DOCKER="${use_docker:-}" ./image_builder.sh ${ROOTFS_DIR}
popd
}
install_rootfs_image() {
pushd ${katacontainers_repo_dir}/tools/osbuilder/image-builder
local commit=$(git log --format=%h -1 HEAD)
local date=$(date +%Y-%m-%d-%T.%N%z)
local image="kata-containers-${date}-${commit}"
sudo install -o root -g root -m 0640 -D kata-containers.img "${PREFIX}/share/kata-containers/${image}"
(cd ${PREFIX}/share/kata-containers && sudo ln -sf "$image" kata-containers.img)
echo "Built Rootfs from ${ROOTFS_DIR} to ${PREFIX}/share/kata-containers/${image}"
ls -al ${PREFIX}/share/kata-containers
popd
}
install_guest_kernel_image() {
${tests_repo_dir}/.ci/install_kata_kernel.sh
}
build_qemu() {
${tests_repo_dir}/.ci/install_virtiofsd.sh
${tests_repo_dir}/.ci/install_qemu.sh
}
build_cloud_hypervisor() {
${tests_repo_dir}/.ci/install_virtiofsd.sh
${tests_repo_dir}/.ci/install_cloud_hypervisor.sh
}
check_kata_runtime() {
sudo kata-runtime check
}
k8s_pod_file="${HOME}/busybox-cc.yaml"
init_kubernetes() {
# Check that kubeadm was installed and install it otherwise
if ! [ -x "$(command -v kubeadm)" ]; then
pushd "${tests_repo_dir}/.ci"
sudo -E PATH=$PATH -s install_kubernetes.sh
if [ "${CRI_CONTAINERD}" == "yes" ]; then
sudo -E PATH=$PATH -s "configure_containerd_for_kubernetes.sh"
fi
popd
fi
# If kubernetes init has previously run we need to clean it by removing the image and resetting k8s
local cid=$(sudo docker ps -a -q -f name=^/kata-registry$)
if [ -n "${cid}" ]; then
sudo docker stop ${cid} && sudo docker rm ${cid}
fi
local k8s_nodes=$(kubectl get nodes -o name 2>/dev/null || true)
if [ -n "${k8s_nodes}" ]; then
sudo kubeadm reset -f
fi
export CI="true" && sudo -E PATH=$PATH -s ${tests_repo_dir}/integration/kubernetes/init.sh
sudo chown ${USER}:$(id -g -n ${USER}) "$HOME/.kube/config"
cat << EOF > ${k8s_pod_file}
apiVersion: v1
kind: Pod
metadata:
name: busybox-cc
spec:
runtimeClassName: kata
containers:
- name: nginx
image: quay.io/kata-containers/confidential-containers:signed
imagePullPolicy: Always
EOF
}
call_kubernetes_create_cc_pod() {
kubernetes_create_cc_pod ${k8s_pod_file}
}
call_kubernetes_delete_cc_pod() {
pod_name=$(kubectl get pods -o jsonpath='{.items..metadata.name}')
kubernetes_delete_cc_pod $pod_name
}
call_kubernetes_create_ssh_demo_pod() {
setup_decryption_files_in_guest
kubernetes_create_ssh_demo_pod
}
call_connect_to_ssh_demo_pod() {
connect_to_ssh_demo_pod
}
call_kubernetes_delete_ssh_demo_pod() {
pod=$(kubectl get pods -o jsonpath='{.items..metadata.name}')
kubernetes_delete_ssh_demo_pod $pod
}
crictl_sandbox_name=kata-cc-busybox-sandbox
call_crictl_create_cc_pod() {
# Update iptables to allow forwarding to the cni0 bridge avoiding issues caused by the docker0 bridge
sudo iptables -P FORWARD ACCEPT
# get_pod_config in tests_common exports `pod_config` that points to the prepared pod config yaml
get_pod_config
crictl_delete_cc_pod_if_exists "${crictl_sandbox_name}"
crictl_create_cc_pod "${pod_config}"
sudo crictl pods
}
call_crictl_create_cc_container() {
# Create container configuration yaml based on our test copy of busybox
# get_pod_config in tests_common exports `pod_config` that points to the prepared pod config yaml
get_pod_config
local container_config="${FIXTURES_DIR}/${CONTAINER_CONFIG_FILE:-container-config.yaml}"
local pod_name=${crictl_sandbox_name}
crictl_create_cc_container ${pod_name} ${pod_config} ${container_config}
sudo crictl ps -a
}
crictl_delete_cc() {
crictl_delete_cc_pod ${crictl_sandbox_name}
}
test_kata_runtime() {
echo "Running ctr with the kata runtime..."
local test_image="quay.io/kata-containers/confidential-containers:signed"
if [ -z $(sudo ctr images ls -q name=="${test_image}") ]; then
sudo ctr image pull "${test_image}"
fi
sudo ctr run --runtime "io.containerd.kata.v2" --rm -t "${test_image}" test-kata uname -a
}
run_kata_and_capture_logs() {
echo "Clearing systemd journal..."
sudo systemctl stop systemd-journald
sudo rm -f /var/log/journal/*/* /run/log/journal/*/*
sudo systemctl start systemd-journald
test_kata_runtime
echo "Collecting logs..."
sudo journalctl -q -o cat -a -t kata-runtime > ${HOME}/kata-runtime.log
sudo journalctl -q -o cat -a -t kata > ${HOME}/shimv2.log
echo "Logs output to ${HOME}/kata-runtime.log and ${HOME}/shimv2.log"
}
get_ids() {
guest_cid=$(sudo ss -H --vsock | awk '{print $6}' | cut -d: -f1)
sandbox_id=$(ps -ef | grep containerd-shim-kata-v2 | egrep -o "id [^,][^,].* " | awk '{print $2}')
}
open_kata_shell() {
get_ids
sudo -E "PATH=$PATH" kata-runtime exec ${sandbox_id}
}
build_bundle_dir_if_necessary() {
bundle_dir="/tmp/bundle"
if [ ! -d "${bundle_dir}" ]; then
rootfs_dir="$bundle_dir/rootfs"
image="quay.io/kata-containers/confidential-containers:signed"
mkdir -p "$rootfs_dir" && (cd "$bundle_dir" && runc spec)
sudo docker export $(sudo docker create "$image") | tar -C "$rootfs_dir" -xvf -
fi
# There were errors in create container agent-ctl command due to /bin/ seemingly not being on the path, so hardcode it
sudo sed -i -e 's%^\(\t*\)"sh"$%\1"/bin/sh"%g' "${bundle_dir}/config.json"
}
build_agent_ctl() {
cd ${GOPATH}/src/${katacontainers_repo}/src/tools/agent-ctl/
if [ -e "${HOME}/.cargo/registry" ]; then
sudo chown -R ${USER}:${USER} "${HOME}/.cargo/registry"
fi
sudo -E PATH=$PATH -s make
ARCH=$(uname -m)
[ ${ARCH} == "ppc64le" ] || [ ${ARCH} == "s390x" ] && export LIBC=gnu || export LIBC=musl
[ ${ARCH} == "ppc64le" ] && export ARCH=powerpc64le
cd "./target/${ARCH}-unknown-linux-${LIBC}/release/"
}
run_agent_ctl_command() {
get_ids
build_bundle_dir_if_necessary
command=$1
# If kata-agent-ctl pre-built in this directory, use it directly, otherwise build it first and switch to release
if [ ! -x kata-agent-ctl ]; then
build_agent_ctl
fi
./kata-agent-ctl -l debug connect --bundle-dir "${bundle_dir}" --server-address "vsock://${guest_cid}:1024" -c "${command}"
}
agent_pull_image() {
run_agent_ctl_command "PullImage image=${PULL_IMAGE} cid=${CONTAINER_ID} source_creds=${SOURCE_CREDS}"
}
agent_create_container() {
run_agent_ctl_command "CreateContainer cid=${CONTAINER_ID}"
}
shim_pull_image() {
get_ids
local ctr_shim_command="sudo ctr --namespace k8s.io shim --id ${sandbox_id} pull-image ${PULL_IMAGE} ${CONTAINER_ID}"
echo "Issuing command '${ctr_shim_command}'"
${ctr_shim_command}
}
call_copy_signature_files_to_guest() {
# TODO #5173 - remove this once the kernel_params aren't ignored by the agent config
export DEBUG_CONSOLE="true"
if [ "${SKOPEO:-}" = "yes" ]; then
add_kernel_params "agent.container_policy_file=/etc/containers/quay_verification/quay_policy.json"
setup_skopeo_signature_files_in_guest
else
# TODO #4888 - set config to specifically enable signature verification to be on in ImageClient
setup_offline_fs_kbc_signature_files_in_guest
fi
}
main() {
while getopts "dh" opt; do
case "$opt" in
d)
export DEBUG="-d"
set -x
;;
h)
usage 0
;;
\?)
echo "Invalid option: -$OPTARG" >&2
usage 1
;;
esac
done
shift $((OPTIND - 1))
subcmd="${1:-}"
[ -z "${subcmd}" ] && usage 1
case "${subcmd}" in
all)
build_and_install_all
run_kata_and_capture_logs
;;
build_and_install_all)
build_and_install_all
;;
rebuild_and_install_kata)
rebuild_and_install_kata
;;
initialize)
initialize
;;
build_kata_runtime)
build_and_install_kata_runtime
;;
configure)
configure
;;
create_rootfs)
create_a_local_rootfs
;;
build_and_add_agent_to_rootfs)
build_and_add_agent_to_rootfs
;;
build_and_install_rootfs)
build_and_install_rootfs
;;
install_guest_kernel)
install_guest_kernel_image
;;
build_cloud_hypervisor)
build_cloud_hypervisor
;;
build_qemu)
build_qemu
;;
init_kubernetes)
init_kubernetes
;;
crictl_create_cc_pod)
call_crictl_create_cc_pod
;;
crictl_create_cc_container)
call_crictl_create_cc_container
;;
crictl_delete_cc)
crictl_delete_cc
;;
kubernetes_create_cc_pod)
call_kubernetes_create_cc_pod
;;
kubernetes_delete_cc_pod)
call_kubernetes_delete_cc_pod
;;
kubernetes_create_ssh_demo_pod)
call_kubernetes_create_ssh_demo_pod
;;
connect_to_ssh_demo_pod)
call_connect_to_ssh_demo_pod
;;
kubernetes_delete_ssh_demo_pod)
call_kubernetes_delete_ssh_demo_pod
;;
test)
test_kata_runtime
;;
test_capture_logs)
run_kata_and_capture_logs
;;
open_kata_console)
open_kata_console
;;
open_kata_shell)
open_kata_shell
;;
agent_pull_image)
agent_pull_image
;;
shim_pull_image)
shim_pull_image
;;
agent_create_container)
agent_create_container
;;
copy_signature_files_to_guest)
call_copy_signature_files_to_guest
;;
*)
usage 1
;;
esac
}
main $@
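The script ends by calling `main $@` without quotes, so any argument containing whitespace is split into several words before `main` sees it. The short sketch below shows the difference between the unquoted and quoted forms. The helper `count_args` and the sample arguments are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: count_args is a hypothetical helper that reports how many
# arguments it received.
count_args() { echo "$#"; }

# Two sample arguments, the second one containing a space.
set -- "build_qemu" "two words"

unquoted=$(count_args $@)     # word splitting applies to each argument
quoted=$(count_args "$@")     # each argument is passed through unchanged
echo "unquoted: ${unquoted}, quoted: ${quoted}"
```

With the default `IFS`, the unquoted call receives three arguments and the quoted call receives two, which is why `main "$@"` is generally the safer form.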


@@ -1,45 +0,0 @@
# Copyright (c) 2021 IBM Corp.
#
# SPDX-License-Identifier: Apache-2.0
#
aa_kbc_params = "$AA_KBC_PARAMS"
https_proxy = "$HTTPS_PROXY"
[endpoints]
allowed = [
"AddARPNeighborsRequest",
"AddSwapRequest",
"CloseStdinRequest",
"CopyFileRequest",
"CreateContainerRequest",
"CreateSandboxRequest",
"DestroySandboxRequest",
#"ExecProcessRequest",
"GetMetricsRequest",
"GetOOMEventRequest",
"GuestDetailsRequest",
"ListInterfacesRequest",
"ListRoutesRequest",
"MemHotplugByProbeRequest",
"OnlineCPUMemRequest",
"PauseContainerRequest",
"PullImageRequest",
"ReadStreamRequest",
"RemoveContainerRequest",
#"ReseedRandomDevRequest",
"ResizeVolumeRequest",
"ResumeContainerRequest",
"SetGuestDateTimeRequest",
"SignalProcessRequest",
"StartContainerRequest",
"StartTracingRequest",
"StatsContainerRequest",
"StopTracingRequest",
"TtyWinResizeRequest",
"UpdateContainerRequest",
"UpdateInterfaceRequest",
"UpdateRoutesRequest",
"VolumeStatsRequest",
"WaitProcessRequest",
"WriteStreamRequest"
]


@@ -1,475 +0,0 @@
# How to build, run and test Kata CCv0
## Introduction and Background
To make building (locally) and demoing the Kata Containers `CCv0` code base as simple as possible, I've
shared a script [`ccv0.sh`](./ccv0.sh). This script was originally my attempt to automate the steps of the
[Developer Guide](https://github.com/kata-containers/kata-containers/blob/main/docs/Developer-Guide.md) so that I could run
different sections of them repeatedly and reliably while making changes to different parts of the
Kata code base. I then tried to weave in some of the [`tests/.ci`](https://github.com/kata-containers/tests/tree/main/.ci)
scripts in order to have less duplicated code.
As we progress on the confidential containers journey I hope to add more features to demonstrate the functionality
we have working.
*Disclaimer: This script has mostly just been used and tested by me ([@stevenhorsman](https://github.com/stevenhorsman)),*
*so there might be issues with it. I'm happy to try and help solve these if possible, but this shouldn't be considered a*
*fully supported process by the Kata Containers community.*
### Basic script set-up and optional environment variables
In order to build, configure and demo the CCv0 functionality, these are the set-up steps I take:
- Provision a new VM
- *I chose an Ubuntu 20.04 8GB VM for this as I had one available. There are some dependencies on apt-get installed*
*packages, so these will need re-working to be compatible with other platforms.*
- Copy the script over to your VM *(I put it in the home directory)* and ensure it has execute permission by running
```bash
$ chmod u+x ccv0.sh
```
- Optionally set up some environment variables
- By default the script checks out the `CCv0` branches of the `kata-containers/kata-containers` and
`kata-containers/tests` repositories, but it is designed to be used to test personal forks and branches as well.
If you want to build and run these you can export the `katacontainers_repo`, `katacontainers_branch`, `tests_repo`
and `tests_branch` variables e.g.
```bash
$ export katacontainers_repo=github.com/stevenhorsman/kata-containers
$ export katacontainers_branch=stevenh/agent-pull-image-endpoint
$ export tests_repo=github.com/stevenhorsman/tests
$ export tests_branch=stevenh/add-ccv0-changes-to-build
```
before running the script.
- By default the build and configuration are using `QEMU` as the hypervisor. In order to use `Cloud Hypervisor` instead
set:
```
$ export KATA_HYPERVISOR="cloud-hypervisor"
```
before running the build.
- At this point you can provision a Kata confidential containers pod and container with either
[`crictl`](#using-crictl-for-end-to-end-provisioning-of-a-kata-confidential-containers-pod-with-an-unencrypted-image),
or [Kubernetes](#using-kubernetes-for-end-to-end-provisioning-of-a-kata-confidential-containers-pod-with-an-unencrypted-image)
and then test and use it.
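As a minimal sketch of how the optional variables above combine with the documented defaults, the following function prints the settings a build would use. The function name `show_ccv0_settings` is hypothetical, and the exact default repository strings are an assumption based on the `github.com/<org>/<repo>` form used in the exports above; `ccv0.sh` itself may resolve them differently.

```bash
#!/usr/bin/env bash
# Sketch only: show the effective settings, falling back to the documented
# defaults (CCv0 branches, QEMU) when a variable has not been exported.
# The default repository strings below are assumptions, not taken from ccv0.sh.
show_ccv0_settings() {
    echo "katacontainers_repo=${katacontainers_repo:-github.com/kata-containers/kata-containers}"
    echo "katacontainers_branch=${katacontainers_branch:-CCv0}"
    echo "tests_repo=${tests_repo:-github.com/kata-containers/tests}"
    echo "tests_branch=${tests_branch:-CCv0}"
    echo "KATA_HYPERVISOR=${KATA_HYPERVISOR:-qemu}"
}

show_ccv0_settings
```

Running this before a build is a quick way to confirm which fork, branch and hypervisor are about to be used.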
### Using crictl for end-to-end provisioning of a Kata confidential containers pod with an unencrypted image
- Run the full build process with Kubernetes turned off, so its configuration doesn't interfere with `crictl` using:
```bash
$ export KUBERNETES="no"
$ export KATA_HYPERVISOR="qemu"
$ ~/ccv0.sh -d build_and_install_all
```
> **Note**: Much of this script has to be run as `sudo`, so you are likely to get prompted for your password.
- *I run this script sourced just so that the required installed components are accessible on the `PATH` to the rest*
*of the process without having to reload the session.*
- The steps that `build_and_install_all` takes are:
- Checkout the git repos for the `tests` and `kata-containers` repos as specified by the environment variables
(default to `CCv0` branches if they are not supplied)
- Use the `tests/.ci` scripts to install the build dependencies
- Build and install the Kata runtime
- Configure Kata to use containerd and for debug and confidential containers features to be enabled (including
enabling console access to the Kata guest shell, which should only be done in development)
- Create, build and install a rootfs for the Kata hypervisor to use. For 'CCv0' this is currently based on Ubuntu
20.04.
- Build the Kata guest kernel
- Install the hypervisor (in order to select which hypervisor will be used, the `KATA_HYPERVISOR` environment
variable can be used to select between `qemu` or `cloud-hypervisor`)
> **Note**: Depending on where your VMs are hosted and how IPs are shared, you might get an error from Docker
matching `ERROR: toomanyrequests: Too Many Requests`. To get past
this, log in to Docker Hub and pull the images used with:
> ```bash
> $ sudo docker login
> $ sudo docker pull ubuntu
> ```
> then re-run the command.
- The first time this runs it may take a while, but subsequent runs will be quicker as more things are already
installed and they can be further cut down by not running all the above steps
[see "Additional script usage" below](#additional-script-usage)
- Create a new Kata sandbox pod using `crictl` with:
```bash
$ ~/ccv0.sh crictl_create_cc_pod
```
- This creates a pod configuration file, creates the pod from this using
`sudo crictl runp -r kata ~/pod-config.yaml` and runs `sudo crictl pods` to show the pod
- Create a new Kata confidential container with:
```bash
$ ~/ccv0.sh crictl_create_cc_container
```
- This creates a container (based on `busybox:1.33.1`) in the Kata cc sandbox and prints a list of containers.
This will have been created based on an image pulled in the Kata pod sandbox/guest, not on the host machine.
At this point you should have a `crictl` pod and container that is using the Kata confidential containers runtime.
You can [validate that the container image was pulled on the guest](#validate-that-the-container-image-was-pulled-on-the-guest)
or [use the Kata pod sandbox for testing with `agent-ctl` or `ctr shim`](#using-a-kata-pod-sandbox-for-testing-with-agent-ctl-or-ctr-shim).
#### Clean up the `crictl` pod sandbox and container
- When the testing is complete you can delete the container and pod by running:
```bash
$ ~/ccv0.sh crictl_delete_cc
```
### Using Kubernetes for end-to-end provisioning of a Kata confidential containers pod with an unencrypted image
- Run the full build process with the Kubernetes environment variable set to `"yes"`, so the Kubernetes cluster is
configured and created using the VM
as a single node cluster:
```bash
$ export KUBERNETES="yes"
$ ~/ccv0.sh build_and_install_all
```
> **Note**: Depending on where your VMs are hosted and how IPs are shared, you might get an error from Docker
matching `ERROR: toomanyrequests: Too Many Requests`. To get past
this, log in to Docker Hub and pull the images used with:
> ```bash
> $ sudo docker login
> $ sudo docker pull registry:2
> $ sudo docker pull ubuntu:20.04
> ```
> then re-run the command.
- Check that your Kubernetes cluster has been correctly set up by running:
```bash
$ kubectl get nodes
```
and checking that you see a single node e.g.
```text
NAME STATUS ROLES AGE VERSION
stevenh-ccv0-k8s1.fyre.ibm.com Ready control-plane,master 43s v1.22.0
```
- Create a Kata confidential containers pod by running:
```bash
$ ~/ccv0.sh kubernetes_create_cc_pod
```
- Wait a few seconds for the pod to start, then check that the pod's status is `Running` with
```bash
$ kubectl get pods
```
which should show something like:
```text
NAME READY STATUS RESTARTS AGE
busybox-cc 1/1 Running 0 54s
```
- At this point you should have a Kubernetes pod and container running, that is using the Kata
confidential containers runtime.
You can [validate that the container image was pulled on the guest](#validate-that-the-container-image-was-pulled-on-the-guest)
or [use the Kata pod sandbox for testing with `agent-ctl` or `ctr shim`](#using-a-kata-pod-sandbox-for-testing-with-agent-ctl-or-ctr-shim).
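The status check above can also be scripted. The following is a rough sketch, assuming the default `kubectl get pods` column layout shown earlier (`NAME READY STATUS RESTARTS AGE`); the helper name `pod_is_running` is hypothetical and is not part of `ccv0.sh`.

```bash
#!/usr/bin/env bash
# pod_is_running NAME: read `kubectl get pods` output on stdin and succeed
# only if a row for NAME reports the STATUS column (third field) as "Running".
pod_is_running() {
    awk -v name="$1" '$1 == name && $3 == "Running" { found = 1 } END { exit !found }'
}

# Example usage (requires a configured cluster):
#   kubectl get pods | pod_is_running busybox-cc && echo "pod is up"
```

Parsing the human-readable table keeps the sketch close to the manual step; a custom output format from `kubectl` would be more robust if the columns ever change.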
#### Clean up the Kubernetes pod sandbox and container
- When the testing is complete you can delete the container and pod by running:
```bash
$ ~/ccv0.sh kubernetes_delete_cc_pod
```
### Validate that the container image was pulled on the guest
There are a couple of ways we can check that the container pull image action was offloaded to the guest, by checking
the guest's file system for the unpacked bundle and checking the host's directories to ensure it wasn't also pulled
there.
- To check the guest's file system:
- Open a shell into the Kata guest with:
```bash
$ ~/ccv0.sh open_kata_shell
```
- List the files in the directory that the container image bundle should have been unpacked to with:
```bash
$ ls -ltr /run/kata-containers/confidential-containers_signed/
```
- This should give something like
```
total 72
-rw-r--r-- 1 root root 2977 Jan 20 10:03 config.json
drwxr-xr-x 12 root root 240 Jan 20 10:03 rootfs
```
which shows how the image has been pulled and then unbundled on the guest.
- Leave the Kata guest shell by running:
```bash
$ exit
```
- To verify that the image wasn't pulled on the host system we can look at the shared sandbox on the host and we
should only see a single bundle for the pause container as the `busybox` based container image should have been
pulled on the guest:
- Find all the `rootfs` directories under the pod's shared directory with:
```bash
$ pod_id=$(ps -ef | grep containerd-shim-kata-v2 | egrep -o "id [^,][^,].* " | awk '{print $2}')
$ sudo find /run/kata-containers/shared/sandboxes/${pod_id}/shared -name rootfs
```
which should only show a single `rootfs` directory if the container image was pulled on the guest, not the host
- Looking at that `rootfs` directory with
```bash
$ sudo ls -ltr $(sudo find /run/kata-containers/shared/sandboxes/${pod_id}/shared -name rootfs)
```
shows something similar to
```
total 668
-rwxr-xr-x 1 root root 682696 Aug 25 13:58 pause
drwxr-xr-x 2 root root 6 Jan 20 02:01 proc
drwxr-xr-x 2 root root 6 Jan 20 02:01 dev
drwxr-xr-x 2 root root 6 Jan 20 02:01 sys
drwxr-xr-x 2 root root 25 Jan 20 02:01 etc
```
which is clearly the pause container indicating that the `busybox` based container image is not exposed to the host.
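The host-side check above boils down to "exactly one `rootfs` directory under the shared sandbox path". A minimal sketch of that test follows; both function names are hypothetical helpers, not part of `ccv0.sh`.

```bash
#!/usr/bin/env bash
# count_rootfs_dirs DIR: print how many directories named "rootfs" exist under DIR.
count_rootfs_dirs() {
    find "$1" -type d -name rootfs | wc -l | tr -d ' '
}

# host_has_only_pause_bundle DIR: succeed when exactly one rootfs directory is
# found, which is what we expect if only the pause container is on the host.
host_has_only_pause_bundle() {
    [ "$(count_rootfs_dirs "$1")" -eq 1 ]
}

# Example usage, from a root shell since the sandbox directory needs elevated access:
#   host_has_only_pause_bundle "/run/kata-containers/shared/sandboxes/${pod_id}/shared"
```

The count alone does not prove that the single bundle is the pause container, so listing its contents as shown above is still worthwhile.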
### Using a Kata pod sandbox for testing with `agent-ctl` or `ctr shim`
Once you have a Kata pod sandbox created as described above, either using
[`crictl`](#using-crictl-for-end-to-end-provisioning-of-a-kata-confidential-containers-pod-with-an-unencrypted-image) or [Kubernetes](#using-kubernetes-for-end-to-end-provisioning-of-a-kata-confidential-containers-pod-with-an-unencrypted-image),
you can use this to test specific components of the Kata confidential
containers architecture. This can be useful for development and debugging to isolate and test features
that aren't broadly supported end-to-end. Here are some examples:
- In the first terminal run the pull image on guest command against the Kata agent, via the shim (`containerd-shim-kata-v2`).
This can be achieved using the [containerd](https://github.com/containerd/containerd) CLI tool, `ctr`, which can be used to
interact with the shim directly. The command takes the form
`ctr --namespace k8s.io shim --id <sandbox-id> pull-image <image> <new-container-id>` and can be run directly, or through
the `ccv0.sh` script to automatically fill in the variables:
- Optionally, set up some environment variables to set the image and credentials used:
- By default the shim pull image test in `ccv0.sh` will use the `busybox:1.33.1` based test image
`quay.io/kata-containers/confidential-containers:signed` which requires no authentication. To use a different
image, set the `PULL_IMAGE` environment variable e.g.
```bash
$ export PULL_IMAGE="docker.io/library/busybox:latest"
```
Currently the containerd shim pull image
code doesn't support using a container registry that requires authentication, so if this is required, see the
below steps to run the pull image command against the agent directly.
- Run the pull image agent endpoint with:
```bash
$ ~/ccv0.sh shim_pull_image
```
which will print the `ctr shim` command for reference
- Alternatively you can issue the command directly to the `kata-agent` pull image endpoint, which also supports
credentials in order to pull from an authenticated registry:
- Optionally set up some environment variables to set the image and credentials used:
- Set the `PULL_IMAGE` environment variable e.g. `export PULL_IMAGE="docker.io/library/busybox:latest"`
if a specific container image is required.
- If the container registry for the image requires authentication then this can be set with an environment
variable `SOURCE_CREDS`. For example to use Docker Hub (`docker.io`) as an authenticated user first run
`export SOURCE_CREDS="<dockerhub username>:<dockerhub api key>"`
> **Note**: the credentials support on the agent request is a tactical solution for the short-term
proof of concept to allow more images to be pulled and tested. Once we have support for getting
keys into the Kata guest image using the attestation-agent and/or KBS I'd expect container registry
credentials to be looked up using that mechanism.
- Run the pull image agent endpoint with
```bash
$ ~/ccv0.sh agent_pull_image
```
and you should see output which includes `Command PullImage (1 of 1) returned (Ok(()), false)` to indicate
that the `PullImage` request was successful e.g.
```
Finished release [optimized] target(s) in 0.21s
{"msg":"announce","level":"INFO","ts":"2021-09-15T08:40:14.189360410-07:00","subsystem":"rpc","name":"kata-agent-ctl","pid":"830920","version":"0.1.0","source":"kata-agent-ctl","config":"Config { server_address: \"vsock://1970354082:1024\", bundle_dir: \"/tmp/bundle\", timeout_nano: 0, interactive: false, ignore_errors: false }"}
{"msg":"client setup complete","level":"INFO","ts":"2021-09-15T08:40:14.193639057-07:00","pid":"830920","source":"kata-agent-ctl","name":"kata-agent-ctl","subsystem":"rpc","version":"0.1.0","server-address":"vsock://1970354082:1024"}
{"msg":"Run command PullImage (1 of 1)","level":"INFO","ts":"2021-09-15T08:40:14.196643765-07:00","pid":"830920","source":"kata-agent-ctl","subsystem":"rpc","name":"kata-agent-ctl","version":"0.1.0"}
{"msg":"response received","level":"INFO","ts":"2021-09-15T08:40:43.828200633-07:00","source":"kata-agent-ctl","name":"kata-agent-ctl","subsystem":"rpc","version":"0.1.0","pid":"830920","response":""}
{"msg":"Command PullImage (1 of 1) returned (Ok(()), false)","level":"INFO","ts":"2021-09-15T08:40:43.828261708-07:00","subsystem":"rpc","pid":"830920","source":"kata-agent-ctl","version":"0.1.0","name":"kata-agent-ctl"}
```
> **Note**: The first time that `~/ccv0.sh agent_pull_image` is run, the `agent-ctl` tool will be built
which may take a few minutes.
- To validate that the image pull was successful, you can open a shell into the Kata guest with:
```bash
$ ~/ccv0.sh open_kata_shell
```
- Check the `/run/kata-containers/` directory to verify that the container image bundle has been created in a directory
named either `01234556789` (for the container id), or the container image name, e.g.
```bash
$ ls -ltr /run/kata-containers/confidential-containers_signed/
```
which should show something like
```
total 72
drwxr-xr-x 10 root root 200 Jan 1 1970 rootfs
-rw-r--r-- 1 root root 2977 Jan 20 16:45 config.json
```
- Leave the Kata shell by running:
```bash
$ exit
```
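If you capture the `agent_pull_image` output to a file, the success message quoted above can be checked automatically. This is only a sketch: the helper name `pull_image_succeeded` and the log path in the usage comment are assumptions, and whether the tool logs to stdout or stderr is not stated here, so the usage example captures both.

```bash
#!/usr/bin/env bash
# pull_image_succeeded FILE: succeed if FILE contains the PullImage success
# message shown in the sample output above (matched as a fixed string).
pull_image_succeeded() {
    grep -qF 'Command PullImage (1 of 1) returned (Ok(()), false)' "$1"
}

# Example usage (the log path is arbitrary):
#   ~/ccv0.sh agent_pull_image 2>&1 | tee /tmp/agent_pull_image.log
#   pull_image_succeeded /tmp/agent_pull_image.log && echo "PullImage OK"
```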
## Verifying signed images
For this sample demo, we use local attestation to pass through the required
configuration to do container image signature verification. Due to this, the ability to verify images is limited
to a pre-created selection of test images in our test
repository [`quay.io/kata-containers/confidential-containers`](https://quay.io/repository/kata-containers/confidential-containers?tab=tags).
For pulling images not in this test repository (called an *unprotected* registry below), we fall back to the behaviour
of not enforcing signatures. More documentation on how to customise this to match your own containers through local,
or remote attestation will be available in future.
In our test repository there are three tagged images:
| Test Image | Base Image used | Signature status | GPG key status |
| --- | --- | --- | --- |
| `quay.io/kata-containers/confidential-containers:signed` | `busybox:1.33.1` | [signature](https://github.com/kata-containers/tests/tree/CCv0/integration/confidential/fixtures/quay_verification/x86_64/signatures.tar) embedded in kata rootfs | [public key](https://github.com/kata-containers/tests/tree/CCv0/integration/confidential/fixtures/quay_verification/x86_64/public.gpg) embedded in kata rootfs |
| `quay.io/kata-containers/confidential-containers:unsigned` | `busybox:1.33.1` | not signed | not signed |
| `quay.io/kata-containers/confidential-containers:other_signed` | `nginx:1.21.3` | [signature](https://github.com/kata-containers/tests/tree/CCv0/integration/confidential/fixtures/quay_verification/x86_64/signatures.tar) embedded in kata rootfs | GPG key not kept |
Using a standard unsigned `busybox` image that can be pulled from another, *unprotected*, `quay.io` repository we can
test a few scenarios.
In this sample, with local attestation, we pass in the public GPG key and signature files, and the [`offline_fs_kbc`
configuration](https://github.com/confidential-containers/attestation-agent/blob/main/src/kbc_modules/offline_fs_kbc/README.md)
into the guest image which specifies that any container image from `quay.io/kata-containers`
must be signed with the embedded GPG key and the agent configuration needs updating to enable this.
With this policy set a few tests of image verification can be done to test different scenarios by attempting
to create containers from these images using `crictl`:
- If you don't already have the Kata Containers CC code built and configured for `crictl`, then follow the
[instructions above](#using-crictl-for-end-to-end-provisioning-of-a-kata-confidential-containers-pod-with-an-unencrypted-image)
up to the `~/ccv0.sh crictl_create_cc_pod` command.
- To enable verification in the guest image, you will need to set up the required configuration, policy and
signature files by running
`~/ccv0.sh copy_signature_files_to_guest` and then run `~/ccv0.sh crictl_create_cc_pod` which will delete and recreate
your pod - adding in the new files.
- To test the fallback behaviour works using an unsigned image from an *unprotected* registry we can pull the `busybox`
image by running:
```bash
$ export CONTAINER_CONFIG_FILE=container-config_unsigned-unprotected.yaml
$ ~/ccv0.sh crictl_create_cc_container
```
- This finishes by showing the running container e.g.
```text
CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID
98c70fefe997a quay.io/prometheus/busybox:latest Less than a second ago Running prometheus-busybox-signed 0 70119e0539238
```
- To test that an unsigned image from our *protected* test container registry is rejected we can run:
```bash
$ export CONTAINER_CONFIG_FILE=container-config_unsigned-protected.yaml
$ ~/ccv0.sh crictl_create_cc_container
```
- This correctly results in an error message from `crictl`:
`"PullImage from image service failed" err="rpc error: code = Internal desc = Security validate failed: Validate image failed: The signatures do not satisfied! Reject reason: [Match reference failed.]" image="quay.io/kata-containers/confidential-containers:unsigned"`
- To test that the signed image from our *protected* test container registry is accepted we can run:
```bash
$ export CONTAINER_CONFIG_FILE=container-config.yaml
$ ~/ccv0.sh crictl_create_cc_container
```
- This finishes by showing a new `kata-cc-busybox-signed` running container e.g.
```text
CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID
b4d85c2132ed9 quay.io/kata-containers/confidential-containers:signed Less than a second ago Running kata-cc-busybox-signed 0 70119e0539238
...
```
- Finally, to check that the image with a valid signature but an invalid GPG key (the trusted piece of information we
really want to protect with the attestation agent in future) fails, we can run:
```bash
$ export CONTAINER_CONFIG_FILE=container-config_signed-protected-other.yaml
$ ~/ccv0.sh crictl_create_cc_container
```
- Again this results in an error message from `crictl`:
`"PullImage from image service failed" err="rpc error: code = Internal desc = Security validate failed: Validate image failed: The signatures do not satisfied! Reject reason: [signature verify failed! There is no pubkey can verify the signature!]" image="quay.io/kata-containers/confidential-containers:other_signed"`
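The four scenarios above can be summarised as a lookup from container configuration file to expected outcome. The sketch below only encodes the expectations described in this section; the function name `expected_result` is hypothetical and the snippet does not call `crictl` or `ccv0.sh`.

```bash
#!/usr/bin/env bash
# expected_result FILE: print the outcome this section describes for each
# CONTAINER_CONFIG_FILE value ("accepted" or "rejected"), or "unknown".
expected_result() {
    case "$1" in
        container-config_unsigned-unprotected.yaml)    echo "accepted" ;;  # unprotected registry, no enforcement
        container-config_unsigned-protected.yaml)      echo "rejected" ;;  # unsigned image in protected registry
        container-config.yaml)                         echo "accepted" ;;  # signed with the embedded GPG key
        container-config_signed-protected-other.yaml)  echo "rejected" ;;  # signed, but key not available
        *)                                             echo "unknown" ;;
    esac
}

# Print a small summary table of the scenarios.
for f in container-config_unsigned-unprotected.yaml \
         container-config_unsigned-protected.yaml \
         container-config.yaml \
         container-config_signed-protected-other.yaml; do
    printf '%-46s %s\n' "$f" "$(expected_result "$f")"
done
```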
### Using Kubernetes to create a Kata confidential containers pod from the encrypted ssh demo sample image
The [ssh-demo](https://github.com/confidential-containers/documentation/tree/main/demos/ssh-demo) explains how to
demonstrate creating a Kata confidential containers pod from an encrypted image with the runtime created by the
[confidential-containers operator](https://github.com/confidential-containers/documentation/blob/main/demos/operator-demo).
To be fully confidential, this should be run on a Trusted Execution Environment, but it can be tested on generic
hardware as well.
If you wish to build the Kata confidential containers runtime to do this yourself, then you can use the following
steps:
- Run the full build process with the Kubernetes environment variable set to `"yes"`, so the Kubernetes cluster is
configured and created using the VM as a single node cluster and with `AA_KBC` set to `offline_fs_kbc`.
```bash
$ export KUBERNETES="yes"
$ export AA_KBC=offline_fs_kbc
$ ~/ccv0.sh build_and_install_all
```
- The `AA_KBC=offline_fs_kbc` mode will ensure that, when creating the rootfs of the Kata guest, the
[attestation-agent](https://github.com/confidential-containers/attestation-agent) will be added along with the
[sample offline KBC](https://github.com/confidential-containers/documentation/blob/main/demos/ssh-demo/aa-offline_fs_kbc-keys.json)
and an agent configuration file
> **Note**: Depending on where your VMs are hosted and how IPs are shared, you might get an error from Docker
matching `ERROR: toomanyrequests: Too Many Requests`. To get past
this, log in to Docker Hub and pull the images used with:
> ```bash
> $ sudo docker login
> $ sudo docker pull registry:2
> $ sudo docker pull ubuntu:20.04
> ```
> then re-run the command.
- Check that your Kubernetes cluster has been correctly set up by running:
```bash
$ kubectl get nodes
```
and checking that you see a single node e.g.
```text
NAME STATUS ROLES AGE VERSION
stevenh-ccv0-k8s1.fyre.ibm.com Ready control-plane,master 43s v1.22.0
```
- Create a sample Kata confidential containers ssh pod by running:
```bash
$ ~/ccv0.sh kubernetes_create_ssh_demo_pod
```
- At this point you should have a Kubernetes pod running the Kata confidential containers runtime that has pulled
the [sample image](https://hub.docker.com/r/katadocker/ccv0-ssh) which was encrypted by the key file that we included
in the rootfs.
During the pod deployment the image was pulled and then decrypted using the key file, on the Kata guest image, without
it ever being available to the host.
- To validate that the container is working, you can connect to the container via SSH by running:
```bash
$ ~/ccv0.sh connect_to_ssh_demo_pod
```
- During this connection the host key fingerprint is shown and should match:
`ED25519 key fingerprint is SHA256:wK7uOpqpYQczcgV00fGCh+X97sJL3f6G1Ku4rvlwtR0.`
- After you are finished connecting then run:
```bash
$ exit
```
- To delete the sample SSH demo pod run:
```bash
$ ~/ccv0.sh kubernetes_delete_ssh_demo_pod
```
## Additional script usage
As well as being able to use the script as above to build all of `kata-containers` from scratch it can be used to just
re-build bits of it by running the script with different parameters. For example after the first build you will often
not need to re-install the dependencies, the hypervisor or the Guest kernel, but just test code changes made to the
runtime and agent. This can be done by running `~/ccv0.sh rebuild_and_install_kata`. (*Note this does a hard checkout*
*from git, so if your changes are only made locally it is better to do the individual steps e.g.*
`~/ccv0.sh build_kata_runtime && ~/ccv0.sh build_and_add_agent_to_rootfs && ~/ccv0.sh build_and_install_rootfs`).
There are commands for a lot of steps in building, setting up and testing and the full list can be seen by running
`~/ccv0.sh help`:
```
$ ~/ccv0.sh help
Overview:
Build and test kata containers from source
Optionally set kata-containers and tests repo and branch as exported variables before running
e.g. export katacontainers_repo=github.com/stevenhorsman/kata-containers && export katacontainers_branch=kata-ci-from-fork && export tests_repo=github.com/stevenhorsman/tests && export tests_branch=kata-ci-from-fork && ~/ccv0.sh build_and_install_all
Usage:
ccv0.sh [options] <command>
Commands:
- help: Display this help
- all: Build and install everything, test kata with containerd and capture the logs
- build_and_install_all: Build and install everything
- initialize: Install dependencies and check out kata-containers source
- rebuild_and_install_kata: Rebuild the kata runtime and agent and build and install the image
- build_kata_runtime: Build and install the kata runtime
- configure: Configure Kata to use rootfs and enable debug
- create_rootfs: Create a local rootfs
- build_and_add_agent_to_rootfs:Builds the kata-agent and adds it to the rootfs
- build_and_install_rootfs: Builds and installs the rootfs image
- install_guest_kernel: Setup, build and install the guest kernel
- build_cloud_hypervisor Checkout, patch, build and install Cloud Hypervisor
- build_qemu: Checkout, patch, build and install QEMU
- init_kubernetes: initialize a Kubernetes cluster on this system
- crictl_create_cc_pod Use crictl to create a new kata cc pod
- crictl_create_cc_container Use crictl to create a new busybox container in the kata cc pod
- crictl_delete_cc Use crictl to delete the kata cc pod sandbox and container in it
- kubernetes_create_cc_pod: Create a Kata CC runtime busybox-based pod in Kubernetes
- kubernetes_delete_cc_pod: Delete the Kata CC runtime busybox-based pod in Kubernetes
- open_kata_shell: Open a shell into the kata runtime
- agent_pull_image: Run PullImage command against the agent with agent-ctl
- shim_pull_image: Run PullImage command against the shim with ctr
- agent_create_container: Run CreateContainer command against the agent with agent-ctl
- test: Test using kata with containerd
- test_capture_logs: Test using kata with containerd and capture the logs in the user's home directory
Options:
-d: Enable debug
-h: Display this help
```


@@ -1,44 +0,0 @@
# Generating a Kata Containers payload for the Confidential Containers Operator
[Confidential Containers
Operator](https://github.com/confidential-containers/operator) consumes a Kata
Containers payload, generated from the `CCv0` branch, and here one can find all
the necessary info on how to build such a payload.
## Requirements
* `make` installed in the machine
* Docker installed in the machine
* `sudo` access to the machine
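
A quick pre-flight check for the tools listed above might look like the following sketch. The helper name `missing_tools` is hypothetical, and finding a `sudo` binary on the `PATH` does not by itself prove that your user has `sudo` access.

```sh
#!/bin/sh
# missing_tools TOOL...: print the name of each tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example usage: report anything missing before starting the build.
#   missing="$(missing_tools make docker sudo)"
#   [ -z "$missing" ] || echo "Please install: $missing"
```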
## Process
* Clone [Kata Containers](https://github.com/kata-containers/kata-containers)
```sh
git clone --branch CCv0 https://github.com/kata-containers/kata-containers
```
* In case you've already cloned the repo, make sure to switch to the `CCv0` branch
```sh
git checkout CCv0
```
* Ensure your tree is clean and in sync with upstream `CCv0`
```sh
git clean -xfd
git reset --hard <upstream>/CCv0
```
* Make sure you're authenticated to `quay.io`
```sh
sudo docker login quay.io
```
* From the top repo directory, run:
```sh
sudo make cc-payload
```
* Make sure the image was uploaded to the [Confidential Containers
runtime-payload
registry](https://quay.io/repository/confidential-containers/runtime-payload?tab=tags)
## Notes
Make sure to run it on a machine that's not the one you're hacking on, prepare a
cup of tea, and get back to it an hour later (at least).


@@ -28,10 +28,10 @@ __Steps from the Developer Guide:__
__SNP-specific steps:__
- Build the SNP-specific kernel as shown below (see this [guide](../../tools/packaging/kernel/README.md#build-kata-containers-kernel) for more information)
```bash
$ pushd kata-containers/tools/packaging/kernel/
$ ./build-kernel.sh -a x86_64 -x snp setup
$ ./build-kernel.sh -a x86_64 -x snp build
$ sudo -E PATH="${PATH}" ./build-kernel.sh -x snp install
$ pushd kata-containers/tools/packaging/
$ ./kernel/build-kernel.sh -a x86_64 -x snp setup
$ ./kernel/build-kernel.sh -a x86_64 -x snp build
$ sudo -E PATH="${PATH}" ./kernel/build-kernel.sh -x snp install
$ popd
```
- Build a current OVMF capable of SEV-SNP:


@@ -27,8 +27,6 @@ There are several kinds of Kata configurations and they are listed below.
| `io.katacontainers.config.runtime.internetworking_model` | string| determines how the VM should be connected to the container network interface. Valid values are `macvtap`, `tcfilter` and `none` |
| `io.katacontainers.config.runtime.sandbox_cgroup_only`| `boolean` | determines if Kata processes are managed only in sandbox cgroup |
| `io.katacontainers.config.runtime.enable_pprof` | `boolean` | enables Golang `pprof` for `containerd-shim-kata-v2` process |
| `io.katacontainers.config.runtime.image_request_timeout` | `uint64` | the timeout for pulling an image within the guest in `seconds`, default is `60` |
| `io.katacontainers.config.runtime.sealed_secret_enabled` | `boolean` | enables the sealed secret feature, default is `false` |
## Agent Options
| Key | Value Type | Comments |
@@ -96,16 +94,6 @@ There are several kinds of Kata configurations and they are listed below.
| `io.katacontainers.config.hypervisor.enable_guest_swap` | `boolean` | enable swap in the guest |
| `io.katacontainers.config.hypervisor.use_legacy_serial` | `boolean` | uses legacy serial device for guest's console (QEMU) |
## Confidential Computing Options
| Key | Value Type | Comments |
|-------| ----- | ----- |
| `io.katacontainers.config.pre_attestation.enabled` | `bool` | determines if SEV/-ES attestation is enabled |
| `io.katacontainers.config.pre_attestation.uri` | `string` | specify the location of the attestation server |
| `io.katacontainers.config.sev.policy` | `uint32` | specify the SEV guest policy |
## Container Options
| Key | Value Type | Comments |
|-------| ----- | ----- |


@@ -29,7 +29,7 @@ Then you can build and install the guest kernel image as shown [here](../../tool
## Run a Kata Container utilizing `virtio-mem`
Use following command to enable memory overcommitment of a Linux kernel. Because QEMU `virtio-mem` device need to allocate a lot of memory.
Use following command to enable memory over-commitment of a Linux kernel. Because QEMU `virtio-mem` device need to allocate a lot of memory.
```
$ echo 1 | sudo tee /proc/sys/vm/overcommit_memory
```

3425
src/agent/Cargo.lock generated

File diff suppressed because it is too large

@@ -8,7 +8,7 @@ license = "Apache-2.0"
[dependencies]
oci = { path = "../libs/oci" }
rustjail = { path = "rustjail" }
protocols = { path = "../libs/protocols", features = ["async"] }
protocols = { path = "../libs/protocols", features = ["async", "with-serde"] }
lazy_static = "1.3.0"
ttrpc = { version = "0.7.1", features = ["async"], default-features = false }
protobuf = "3.2.0"
@@ -23,20 +23,17 @@ regex = "1.5.6"
serial_test = "0.5.1"
kata-sys-util = { path = "../libs/kata-sys-util" }
kata-types = { path = "../libs/kata-types" }
const_format = "0.2.30"
url = "2.2.2"
derivative = "2.2.0"
# Async helpers
async-trait = "0.1.42"
async-recursion = "0.3.2"
futures = "0.3.28"
futures = "0.3.17"
# Async runtime
tokio = { version = "1.28.1", features = ["full"] }
tokio-vsock = "0.3.1"
netlink-sys = { version = "0.7.0", features = ["tokio_socket"] }
netlink-sys = { version = "0.7.0", features = ["tokio_socket",]}
rtnetlink = "0.8.0"
netlink-packet-utils = "0.4.1"
ipnetwork = "0.17.0"
@@ -62,7 +59,7 @@ cgroups = { package = "cgroups-rs", version = "0.3.2" }
tracing = "0.1.26"
tracing-subscriber = "0.2.18"
tracing-opentelemetry = "0.13.0"
opentelemetry = { version = "0.14.0", features = ["rt-tokio-current-thread"] }
opentelemetry = { version = "0.14.0", features = ["rt-tokio-current-thread"]}
vsock-exporter = { path = "vsock-exporter" }
# Configuration
@@ -70,18 +67,11 @@ serde = { version = "1.0.129", features = ["derive"] }
toml = "0.5.8"
clap = { version = "3.0.1", features = ["derive"] }
# "vendored" feature for openssl is required by musl build
openssl = { version = "0.10.38", features = ["vendored"] }
# Image pull/decrypt
image-rs = { git = "https://github.com/confidential-containers/guest-components", tag = "v0.8.0", default-features = false, features = [
"kata-cc-native-tls",
"verity",
"signature-simple-xrss",
] }
[patch.crates-io]
oci-distribution = { git = "https://github.com/krustlet/oci-distribution.git", rev = "f44124c" }
# Communication with the OPA service
http = { version = "0.2.8", optional = true }
reqwest = { version = "0.11.14", optional = true }
# The "vendored" feature for openssl is required for musl build
openssl = { version = "0.10.54", features = ["vendored"], optional = true }
[dev-dependencies]
tempfile = "3.1.0"
@@ -89,17 +79,17 @@ test-utils = { path = "../libs/test-utils" }
which = "4.3.0"
[workspace]
resolver = "2"
members = ["rustjail"]
members = [
"rustjail",
]
[profile.release]
lto = true
[features]
confidential-data-hub = []
seccomp = ["rustjail/seccomp"]
sealed-secret = ["protocols/sealed-secret", "confidential-data-hub"]
standard-oci-runtime = ["rustjail/standard-oci-runtime"]
agent-policy = ["http", "openssl", "reqwest"]
[[bin]]
name = "kata-agent"


@@ -33,11 +33,12 @@ ifeq ($(SECCOMP),yes)
override EXTRA_RUSTFEATURES += seccomp
endif
SEALED_SECRET ?= no
##VAR AGENT_POLICY=yes|no define if agent enables the policy feature
AGENT_POLICY ?= no
# Enable sealed-secret feature of rust build
ifeq ($(SEALED_SECRET),yes)
override EXTRA_RUSTFEATURES += sealed-secret
# Enable the policy feature of rust build
ifeq ($(AGENT_POLICY),yes)
override EXTRA_RUSTFEATURES += agent-policy
endif
include ../../utils.mk
@@ -61,7 +62,7 @@ endif
TARGET_PATH = target/$(TRIPLE)/$(BUILD_TYPE)/$(TARGET)
##VAR DESTDIR=<path> is a directory prepended to each installed target file
DESTDIR :=
DESTDIR ?=
##VAR BINDIR=<path> is a directory for installing executable programs
BINDIR := /usr/bin
@@ -147,7 +148,7 @@ vendor:
#TARGET test: run cargo tests
test:
test: $(GENERATED_FILES)
@cargo test --all --target $(TRIPLE) $(EXTRA_RUSTFEATURES) -- --nocapture
##TARGET check: run test


@@ -541,11 +541,8 @@ fn linux_device_to_cgroup_device(d: &LinuxDevice) -> Option<DeviceResource> {
}
fn linux_device_group_to_cgroup_device(d: &LinuxDeviceCgroup) -> Option<DeviceResource> {
let dev_type = match &d.r#type {
Some(t_s) => match DeviceType::from_char(t_s.chars().next()) {
Some(t_c) => t_c,
None => return None,
},
let dev_type = match DeviceType::from_char(d.r#type.chars().next()) {
Some(t) => t,
None => return None,
};
@@ -602,7 +599,7 @@ lazy_static! {
// all mknod to all char devices
LinuxDeviceCgroup {
allow: true,
r#type: Some("c".to_string()),
r#type: "c".to_string(),
major: Some(WILDCARD),
minor: Some(WILDCARD),
access: "m".to_string(),
@@ -611,7 +608,7 @@ lazy_static! {
// all mknod to all block devices
LinuxDeviceCgroup {
allow: true,
r#type: Some("b".to_string()),
r#type: "b".to_string(),
major: Some(WILDCARD),
minor: Some(WILDCARD),
access: "m".to_string(),
@@ -620,7 +617,7 @@ lazy_static! {
// all read/write/mknod to char device /dev/console
LinuxDeviceCgroup {
allow: true,
r#type: Some("c".to_string()),
r#type: "c".to_string(),
major: Some(5),
minor: Some(1),
access: "rwm".to_string(),
@@ -629,7 +626,7 @@ lazy_static! {
// all read/write/mknod to char device /dev/pts/<N>
LinuxDeviceCgroup {
allow: true,
r#type: Some("c".to_string()),
r#type: "c".to_string(),
major: Some(136),
minor: Some(WILDCARD),
access: "rwm".to_string(),
@@ -638,7 +635,7 @@ lazy_static! {
// all read/write/mknod to char device /dev/ptmx
LinuxDeviceCgroup {
allow: true,
r#type: Some("c".to_string()),
r#type: "c".to_string(),
major: Some(5),
minor: Some(2),
access: "rwm".to_string(),
@@ -647,7 +644,7 @@ lazy_static! {
// all read/write/mknod to char device /dev/net/tun
LinuxDeviceCgroup {
allow: true,
r#type: Some("c".to_string()),
r#type: "c".to_string(),
major: Some(10),
minor: Some(200),
access: "rwm".to_string(),


@@ -80,6 +80,7 @@ const CLOG_FD: &str = "CLOG_FD";
const FIFO_FD: &str = "FIFO_FD";
const HOME_ENV_KEY: &str = "HOME";
const PIDNS_FD: &str = "PIDNS_FD";
const PIDNS_ENABLED: &str = "PIDNS_ENABLED";
const CONSOLE_SOCKET_FD: &str = "CONSOLE_SOCKET_FD";
#[derive(Debug)]
@@ -280,6 +281,17 @@ pub struct SyncPc {
pid: pid_t,
}
#[derive(Debug, Clone)]
pub struct PidNs {
enabled: bool,
fd: Option<i32>,
}
impl PidNs {
pub fn new(enabled: bool, fd: Option<i32>) -> Self {
Self { enabled, fd }
}
}
pub trait Container: BaseContainer {
fn pause(&mut self) -> Result<()>;
fn resume(&mut self) -> Result<()>;
@@ -339,16 +351,20 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
let crfd = std::env::var(CRFD_FD)?.parse::<i32>().unwrap();
let cfd_log = std::env::var(CLOG_FD)?.parse::<i32>().unwrap();
// get the pidns fd from parent, if parent had passed the pidns fd,
// then get it and join in this pidns; otherwise, create a new pidns
// by unshare from the parent pidns.
match std::env::var(PIDNS_FD) {
Ok(fd) => {
let pidns_fd = fd.parse::<i32>().context("get parent pidns fd")?;
sched::setns(pidns_fd, CloneFlags::CLONE_NEWPID).context("failed to join pidns")?;
let _ = unistd::close(pidns_fd);
if std::env::var(PIDNS_ENABLED)?.eq(format!("{}", true).as_str()) {
// get the pidns fd from parent, if parent had passed the pidns fd,
// then get it and join in this pidns; otherwise, create a new pidns
// by unshare from the parent pidns.
match std::env::var(PIDNS_FD) {
Ok(fd) => {
let pidns_fd = fd.parse::<i32>().context("get parent pidns fd")?;
sched::setns(pidns_fd, CloneFlags::CLONE_NEWPID).context("failed to join pidns")?;
let _ = unistd::close(pidns_fd);
}
Err(_e) => {
sched::unshare(CloneFlags::CLONE_NEWPID)?;
}
}
Err(_e) => sched::unshare(CloneFlags::CLONE_NEWPID)?,
}
match unsafe { fork() } {
@@ -983,9 +999,13 @@ impl BaseContainer for LinuxContainer {
}
let pidns = get_pid_namespace(&self.logger, linux)?;
#[cfg(not(feature = "standard-oci-runtime"))]
if !pidns.enabled {
return Err(anyhow!("cannot find the pid ns"));
}
defer!(if let Some(pid) = pidns {
let _ = unistd::close(pid);
defer!(if let Some(fd) = pidns.fd {
let _ = unistd::close(fd);
});
let exec_path = std::env::current_exe()?;
@@ -1008,14 +1028,15 @@ impl BaseContainer for LinuxContainer {
.env(CRFD_FD, format!("{}", crfd))
.env(CWFD_FD, format!("{}", cwfd))
.env(CLOG_FD, format!("{}", cfd_log))
.env(CONSOLE_SOCKET_FD, console_name);
.env(CONSOLE_SOCKET_FD, console_name)
.env(PIDNS_ENABLED, format!("{}", pidns.enabled));
if p.init {
child = child.env(FIFO_FD, format!("{}", fifofd));
}
if pidns.is_some() {
child = child.env(PIDNS_FD, format!("{}", pidns.unwrap()));
if pidns.fd.is_some() {
child = child.env(PIDNS_FD, format!("{}", pidns.fd.unwrap()));
}
child.spawn()?;
@@ -1249,11 +1270,11 @@ pub fn update_namespaces(logger: &Logger, spec: &mut Spec, init_pid: RawFd) -> R
Ok(())
}
fn get_pid_namespace(logger: &Logger, linux: &Linux) -> Result<Option<RawFd>> {
fn get_pid_namespace(logger: &Logger, linux: &Linux) -> Result<PidNs> {
for ns in &linux.namespaces {
if ns.r#type == "pid" {
if ns.path.is_empty() {
return Ok(None);
return Ok(PidNs::new(true, None));
}
let fd =
@@ -1269,11 +1290,11 @@ fn get_pid_namespace(logger: &Logger, linux: &Linux) -> Result<Option<RawFd>> {
e
})?;
return Ok(Some(fd));
return Ok(PidNs::new(true, Some(fd)));
}
}
Err(anyhow!("cannot find the pid ns"))
Ok(PidNs::new(false, None))
}
fn is_userns_enabled(linux: &Linux) -> bool {
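
Note on the `get_pid_namespace` hunks above: the `PidNs` variant distinguishes three cases, where the `Option<RawFd>` variant only had two plus an error. The following is a minimal sketch of that three-way decision, assuming a simplified `Namespace` struct and a `PidNsChoice` enum that are both hypothetical stand-ins (they are not types from the agent), and returning the path instead of opening a file descriptor.

```rust
// Hypothetical stand-in for an OCI namespace entry; only the two fields
// that the lookup inspects are modeled here.
#[derive(Debug, Clone)]
pub struct Namespace {
    pub r#type: String,
    pub path: String,
}

// Hypothetical summary of the three outcomes visible in the diff.
#[derive(Debug, Clone, PartialEq)]
pub enum PidNsChoice {
    /// No "pid" entry in the spec: corresponds to PidNs::new(false, None).
    Disabled,
    /// "pid" entry with an empty path: corresponds to PidNs::new(true, None),
    /// where the child later creates a new namespace with unshare.
    CreateNew,
    /// "pid" entry with a path: corresponds to PidNs::new(true, Some(fd)),
    /// where the child later joins the namespace with setns.
    Join(String),
}

/// Walk the namespace list and report what should happen for the PID namespace.
pub fn classify_pid_namespace(namespaces: &[Namespace]) -> PidNsChoice {
    for ns in namespaces {
        if ns.r#type == "pid" {
            if ns.path.is_empty() {
                return PidNsChoice::CreateNew;
            }
            return PidNsChoice::Join(ns.path.clone());
        }
    }
    PidNsChoice::Disabled
}
```

In the diff itself, the `Disabled` case is what `PIDNS_ENABLED` communicates to `do_init_child`, which then skips both `setns` and `unshare`.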


@@ -241,12 +241,6 @@ pub fn resources_grpc_to_oci(res: &grpc::LinuxResources) -> oci::LinuxResources
let devices = {
let mut d = Vec::new();
for dev in res.Devices.iter() {
let dev_type = if dev.Type.is_empty() {
None
} else {
Some(dev.Type.clone())
};
let major = if dev.Major == -1 {
None
} else {
@@ -260,7 +254,7 @@ pub fn resources_grpc_to_oci(res: &grpc::LinuxResources) -> oci::LinuxResources
};
d.push(oci::LinuxDeviceCgroup {
allow: dev.Allow,
r#type: dev_type,
r#type: dev.Type.clone(),
major,
minor,
access: dev.Access.clone(),
@@ -429,12 +423,18 @@ fn linux_grpc_to_oci(l: &grpc::Linux) -> oci::Linux {
let mut r = Vec::new();
for d in l.Devices.iter() {
// if the filemode for the device is 0 (unset), use a default value as runc does
let filemode = if d.FileMode != 0 {
Some(d.FileMode)
} else {
Some(0o666)
};
r.push(oci::LinuxDevice {
path: d.Path.clone(),
r#type: d.Type.clone(),
major: d.Major,
minor: d.Minor,
file_mode: Some(d.FileMode),
file_mode: filemode,
uid: Some(d.UID),
gid: Some(d.GID),
});
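
Note on the `file_mode` hunk above: a request that leaves `FileMode` at 0 is treated as "unset" and receives a default, as the in-code comment says runc does. A minimal sketch of that rule in isolation, assuming a hypothetical helper name and constant (neither exists in the agent):

```rust
/// Default permission bits applied when the request leaves the mode unset.
/// The constant name is an assumption for this sketch; the value 0o666 is
/// the one used in the diff.
const DEFAULT_DEVICE_FILE_MODE: u32 = 0o666;

/// Return the mode to place into the OCI device entry.
/// A value of 0 means "unset" and is replaced by the default.
pub fn effective_file_mode(requested: u32) -> Option<u32> {
    if requested != 0 {
        Some(requested)
    } else {
        Some(DEFAULT_DEVICE_FILE_MODE)
    }
}
```

One consequence of this rule is that a caller cannot request a device node with mode 0 explicitly, since 0 is indistinguishable from "not set" in the gRPC message.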


@@ -1,289 +0,0 @@
// Copyright (c) 2023 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
// Confidential Data Hub client wrapper.
// Confidential Data Hub is a service running inside guest to provide resource related APIs.
// https://github.com/confidential-containers/guest-components/tree/main/confidential-data-hub
use anyhow::{anyhow, Result};
use oci::{Mount, Spec};
use protocols::{
sealed_secret, sealed_secret_ttrpc_async, sealed_secret_ttrpc_async::SealedSecretServiceClient,
};
use std::fs;
use std::os::unix::fs::symlink;
use std::path::Path;
const CDH_ADDR: &str = "unix:///run/confidential-containers/cdh.sock";
const SECRETS_DIR: &str = "/run/secrets/";
const SEALED_SECRET_TIMEOUT: i64 = 50 * 1000 * 1000 * 1000;
// Convenience function to obtain the scope logger.
fn sl() -> slog::Logger {
slog_scope::logger()
}
#[derive(Clone)]
pub struct CDHClient {
sealed_secret_client: Option<SealedSecretServiceClient>,
}
impl CDHClient {
pub fn new() -> Result<Self> {
let c = ttrpc::asynchronous::Client::connect(CDH_ADDR);
match c {
Ok(v) => {
let ssclient = sealed_secret_ttrpc_async::SealedSecretServiceClient::new(v);
Ok(CDHClient {
sealed_secret_client: Some(ssclient),
})
}
Err(_) => Ok(CDHClient {
sealed_secret_client: None,
}),
}
}
pub async fn unseal_secret_async(
&self,
sealed: &str,
) -> Result<sealed_secret::UnsealSecretOutput> {
let secret = sealed
.strip_prefix("sealed.")
.ok_or(anyhow!("strip_prefix \"sealed.\" failed"))?;
let mut input = sealed_secret::UnsealSecretInput::new();
input.set_secret(secret.into());
let unseal = self
.sealed_secret_client
.as_ref()
.ok_or(anyhow!("unwrap sealed_secret_client failed"))?
.unseal_secret(ttrpc::context::with_timeout(SEALED_SECRET_TIMEOUT), &input)
.await?;
Ok(unseal)
}
pub async fn unseal_env(&self, env: &str) -> Result<String> {
let (key, value) = env.split_once('=').unwrap_or(("", ""));
if value.starts_with("sealed.") {
let unsealed_value = self.unseal_secret_async(value).await;
match unsealed_value {
Ok(v) => {
let plain_env = format!("{}={}", key, std::str::from_utf8(&v.plaintext)?);
return Ok(plain_env);
}
Err(e) => {
return Err(e);
}
};
}
Ok((*env.to_owned()).to_string())
}
pub async fn unseal_file(&self, sealed_source_path: &String) -> Result<()> {
if !Path::new(sealed_source_path).exists() {
info!(
sl(),
"sealed source path {:?} does not exist", sealed_source_path
);
return Ok(());
}
for entry in fs::read_dir(sealed_source_path)? {
let entry = entry?;
if !entry.file_type()?.is_symlink()
&& !fs::metadata(entry.path())?.file_type().is_file()
{
info!(
sl(),
"skipping sealed source entry {:?} because its file type is {:?}",
entry,
entry.file_type()?
);
continue;
}
let target_path = fs::canonicalize(&entry.path())?;
info!(sl(), "sealed source entry target path: {:?}", target_path);
if !target_path.is_file() {
info!(sl(), "sealed source is not a file: {:?}", target_path);
continue;
}
let secret_name = entry.file_name();
let contents = fs::read_to_string(&target_path)?;
if contents.starts_with("sealed.") {
info!(sl(), "sealed source entry found: {:?}", target_path);
let unsealed_filename = SECRETS_DIR.to_string()
+ secret_name
.as_os_str()
.to_str()
.ok_or(anyhow!("create unsealed_filename failed"))?;
let unsealed_value = self.unseal_secret_async(&contents).await?;
fs::write(&unsealed_filename, unsealed_value.plaintext)?;
fs::remove_file(&entry.path())?;
symlink(unsealed_filename, &entry.path())?;
}
}
Ok(())
}
pub fn create_sealed_secret_mounts(&self, spec: &mut Spec) -> Result<Vec<String>> {
let mut sealed_source_path: Vec<String> = vec![];
for m in spec.mounts.iter_mut() {
if let Some(unsealed_mount_point) = m.destination.strip_prefix("/sealed") {
info!(
sl(),
"sealed mount destination: {:?} source: {:?}", m.destination, m.source
);
sealed_source_path.push(m.source.clone());
m.destination = unsealed_mount_point.to_string();
}
}
if !sealed_source_path.is_empty() {
let sealed_mounts = Mount {
destination: SECRETS_DIR.to_string(),
r#type: "bind".to_string(),
source: SECRETS_DIR.to_string(),
options: vec!["bind".to_string()],
};
spec.mounts.push(sealed_mounts);
}
fs::create_dir_all(SECRETS_DIR)?;
Ok(sealed_source_path)
}
} /* end of impl CDHClient */
#[cfg(test)]
#[cfg(feature = "sealed-secret")]
mod tests {
use crate::cdh::CDHClient;
use crate::cdh::CDH_ADDR;
use crate::cdh::SECRETS_DIR;
use anyhow::anyhow;
use async_trait::async_trait;
use protocols::{sealed_secret, sealed_secret_ttrpc_async};
use std::fs;
use std::fs::File;
use std::io::{Read, Write};
use std::path::Path;
use std::sync::Arc;
use tokio::signal::unix::{signal, SignalKind};
struct TestService;
#[async_trait]
impl sealed_secret_ttrpc_async::SealedSecretService for TestService {
async fn unseal_secret(
&self,
_ctx: &::ttrpc::asynchronous::TtrpcContext,
_req: sealed_secret::UnsealSecretInput,
) -> ttrpc::error::Result<sealed_secret::UnsealSecretOutput> {
let mut output = sealed_secret::UnsealSecretOutput::new();
output.set_plaintext("unsealed".into());
Ok(output)
}
}
fn remove_if_sock_exist(sock_addr: &str) -> std::io::Result<()> {
let path = sock_addr
.strip_prefix("unix://")
.expect("socket address does not have the expected format.");
if std::path::Path::new(path).exists() {
std::fs::remove_file(path)?;
}
Ok(())
}
fn start_ttrpc_server() {
tokio::spawn(async move {
let ss = Box::new(TestService {})
as Box<dyn sealed_secret_ttrpc_async::SealedSecretService + Send + Sync>;
let ss = Arc::new(ss);
let ss_service = sealed_secret_ttrpc_async::create_sealed_secret_service(ss);
remove_if_sock_exist(CDH_ADDR).unwrap();
let mut server = ttrpc::asynchronous::Server::new()
.bind(CDH_ADDR)
.unwrap()
.register_service(ss_service);
server.start().await.unwrap();
let mut interrupt = signal(SignalKind::interrupt()).unwrap();
tokio::select! {
_ = interrupt.recv() => {
server.shutdown().await.unwrap();
}
};
});
}
#[tokio::test]
async fn test_unseal_env() {
let rt = tokio::runtime::Runtime::new().unwrap();
let _guard = rt.enter();
start_ttrpc_server();
std::thread::sleep(std::time::Duration::from_secs(2));
let cc = Some(CDHClient::new().unwrap());
let cdh_client = cc
.as_ref()
.ok_or(anyhow!("get confidential-data-hub client failed"))
.unwrap();
let sealed_env = String::from("key=sealed.testdata");
let unsealed_env = cdh_client.unseal_env(&sealed_env).await.unwrap();
assert_eq!(unsealed_env, String::from("key=unsealed"));
let normal_env = String::from("key=testdata");
let unchanged_env = cdh_client.unseal_env(&normal_env).await.unwrap();
assert_eq!(unchanged_env, String::from("key=testdata"));
rt.shutdown_background();
std::thread::sleep(std::time::Duration::from_secs(2));
}
#[tokio::test]
async fn test_unseal_file() {
let rt = tokio::runtime::Runtime::new().unwrap();
let _guard = rt.enter();
start_ttrpc_server();
std::thread::sleep(std::time::Duration::from_secs(2));
let cc = Some(CDHClient::new().unwrap());
let cdh_client = cc
.as_ref()
.ok_or(anyhow!("get confidential-data-hub client failed"))
.unwrap();
fs::create_dir_all(SECRETS_DIR).unwrap();
let sealed_filename = "passwd";
let mut sealed_file = File::create(sealed_filename).unwrap();
let dir = String::from(".");
sealed_file.write_all(b"sealed.passwd").unwrap();
cdh_client.unseal_file(&dir).await.unwrap();
let unsealed_filename = SECRETS_DIR.to_string() + "/passwd";
let mut unsealed_file = fs::File::open(unsealed_filename.clone()).unwrap();
let mut contents = String::new();
unsealed_file.read_to_string(&mut contents).unwrap();
assert_eq!(contents, String::from("unsealed"));
fs::remove_file(sealed_filename).unwrap();
fs::remove_file(unsealed_filename).unwrap();
let normal_filename = "passwd";
let mut normal_file = File::create(normal_filename).unwrap();
normal_file.write_all(b"passwd").unwrap();
cdh_client.unseal_file(&dir).await.unwrap();
let filename = SECRETS_DIR.to_string() + "/passwd";
assert!(!Path::new(&filename).exists());
fs::remove_file(normal_filename).unwrap();
rt.shutdown_background();
std::thread::sleep(std::time::Duration::from_secs(2));
}
}
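
Note on `unseal_env` in the file above: the function splits an environment entry at the first `=`, and only values starting with `sealed.` are sent for unsealing; everything else is returned unchanged. The sketch below isolates that string handling. It assumes a caller-supplied closure in place of the ttrpc call to the Confidential Data Hub, and the function name `unseal_env_with` is hypothetical.

```rust
const SEALED_PREFIX: &str = "sealed.";

/// Rewrite `KEY=sealed.<secret>` into `KEY=<plaintext>` using `unseal`.
/// Entries without `=` or without the sealed prefix are returned as-is.
/// `unseal` is a stand-in for the remote unseal request and receives the
/// value with the "sealed." prefix already stripped.
pub fn unseal_env_with<F>(env: &str, unseal: F) -> Result<String, String>
where
    F: Fn(&str) -> Result<String, String>,
{
    if let Some((key, value)) = env.split_once('=') {
        if let Some(secret) = value.strip_prefix(SEALED_PREFIX) {
            let plaintext = unseal(secret)?;
            return Ok(format!("{}={}", key, plaintext));
        }
    }
    Ok(env.to_string())
}
```

Passing the unseal step as a closure keeps the parsing testable without a running ttrpc server, which is the part the tests in the file above spend most of their setup on.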


@@ -5,13 +5,11 @@
use crate::rpc;
use anyhow::{bail, ensure, Context, Result};
use serde::Deserialize;
use std::collections::HashSet;
use std::env;
use std::fs;
use std::str::FromStr;
use std::time;
use tracing::instrument;
use url::Url;
use kata_types::config::default::DEFAULT_AGENT_VSOCK_PORT;
@@ -26,15 +24,6 @@ const LOG_VPORT_OPTION: &str = "agent.log_vport";
const CONTAINER_PIPE_SIZE_OPTION: &str = "agent.container_pipe_size";
const UNIFIED_CGROUP_HIERARCHY_OPTION: &str = "agent.unified_cgroup_hierarchy";
const CONFIG_FILE: &str = "agent.config_file";
const AA_KBC_PARAMS: &str = "agent.aa_kbc_params";
const REST_API_OPTION: &str = "agent.rest_api";
const HTTPS_PROXY: &str = "agent.https_proxy";
const NO_PROXY: &str = "agent.no_proxy";
const ENABLE_DATA_INTEGRITY: &str = "agent.data_integrity";
const ENABLE_SIGNATURE_VERIFICATION: &str = "agent.enable_signature_verification";
const IMAGE_POLICY_FILE: &str = "agent.image_policy";
const IMAGE_REGISTRY_AUTH_FILE: &str = "agent.image_registry_auth";
const SIMPLE_SIGNING_SIGSTORE_CONFIG: &str = "agent.simple_signing_sigstore_config";
const DEFAULT_LOG_LEVEL: slog::Level = slog::Level::Info;
const DEFAULT_HOTPLUG_TIMEOUT: time::Duration = time::Duration::from_secs(3);
@@ -62,17 +51,6 @@ const ERR_INVALID_CONTAINER_PIPE_SIZE_PARAM: &str = "unable to parse container p
const ERR_INVALID_CONTAINER_PIPE_SIZE_KEY: &str = "invalid container pipe size key name";
const ERR_INVALID_CONTAINER_PIPE_NEGATIVE: &str = "container pipe size should not be negative";
#[derive(Debug, Default, Deserialize)]
pub struct EndpointsConfig {
pub allowed: Vec<String>,
}
#[derive(Debug, Default)]
pub struct AgentEndpoints {
pub allowed: HashSet<String>,
pub all_allowed: bool,
}
#[derive(Debug)]
pub struct AgentConfig {
pub debug_console: bool,
@@ -85,18 +63,7 @@ pub struct AgentConfig {
pub server_addr: String,
pub unified_cgroup_hierarchy: bool,
pub tracing: bool,
pub endpoints: AgentEndpoints,
pub supports_seccomp: bool,
pub container_policy_path: String,
pub aa_kbc_params: String,
pub rest_api: String,
pub https_proxy: String,
pub no_proxy: String,
pub data_integrity: bool,
pub enable_signature_verification: bool,
pub image_policy_file: String,
pub image_registry_auth_file: String,
pub simple_signing_sigstore_config: String,
}
#[derive(Debug, Deserialize)]
@@ -111,17 +78,6 @@ pub struct AgentConfigBuilder {
pub server_addr: Option<String>,
pub unified_cgroup_hierarchy: Option<bool>,
pub tracing: Option<bool>,
pub endpoints: Option<EndpointsConfig>,
pub container_policy_path: Option<String>,
pub aa_kbc_params: Option<String>,
pub rest_api: Option<String>,
pub https_proxy: Option<String>,
pub no_proxy: Option<String>,
pub data_integrity: Option<bool>,
pub enable_signature_verification: Option<bool>,
pub image_policy_file: Option<String>,
pub image_registry_auth_file: Option<String>,
pub simple_signing_sigstore_config: Option<String>,
}
macro_rules! config_override {
@@ -181,18 +137,7 @@ impl Default for AgentConfig {
server_addr: format!("{}:{}", VSOCK_ADDR, DEFAULT_AGENT_VSOCK_PORT),
unified_cgroup_hierarchy: false,
tracing: false,
endpoints: Default::default(),
supports_seccomp: rpc::have_seccomp(),
container_policy_path: String::from(""),
aa_kbc_params: String::from(""),
rest_api: String::from(""),
https_proxy: String::from(""),
no_proxy: String::from(""),
data_integrity: false,
enable_signature_verification: true,
image_policy_file: String::from(""),
image_registry_auth_file: String::from(""),
simple_signing_sigstore_config: String::from(""),
}
}
}
@@ -221,31 +166,6 @@ impl FromStr for AgentConfig {
config_override!(agent_config_builder, agent_config, server_addr);
config_override!(agent_config_builder, agent_config, unified_cgroup_hierarchy);
config_override!(agent_config_builder, agent_config, tracing);
config_override!(agent_config_builder, agent_config, container_policy_path);
config_override!(agent_config_builder, agent_config, aa_kbc_params);
config_override!(agent_config_builder, agent_config, rest_api);
config_override!(agent_config_builder, agent_config, https_proxy);
config_override!(agent_config_builder, agent_config, no_proxy);
config_override!(agent_config_builder, agent_config, data_integrity);
config_override!(
agent_config_builder,
agent_config,
enable_signature_verification
);
config_override!(agent_config_builder, agent_config, image_policy_file);
config_override!(agent_config_builder, agent_config, image_registry_auth_file);
config_override!(
agent_config_builder,
agent_config,
simple_signing_sigstore_config
);
// Populate the allowed endpoints hash set, if we got any from the config file.
if let Some(endpoints) = agent_config_builder.endpoints {
for ep in endpoints.allowed {
agent_config.endpoints.allowed.insert(ep);
}
}
Ok(agent_config)
}
@@ -268,10 +188,6 @@ impl AgentConfig {
let mut config: AgentConfig = Default::default();
let cmdline = fs::read_to_string(file)?;
let params: Vec<&str> = cmdline.split_ascii_whitespace().collect();
let mut using_config_file = false;
// Check if there is config file before parsing params that might
// override values from the config file.
for param in params.iter() {
// If we get a configuration file path from the command line, we
// generate our config from it.
@@ -279,15 +195,10 @@ impl AgentConfig {
// or if it can't be parsed properly.
if param.starts_with(format!("{}=", CONFIG_FILE).as_str()) {
let config_file = get_string_value(param)?;
config = AgentConfig::from_config_file(&config_file)
.context("AgentConfig from kernel cmdline")
.unwrap();
using_config_file = true;
break;
return AgentConfig::from_config_file(&config_file)
.context("AgentConfig from kernel cmdline");
}
}
for param in params.iter() {
// parse cmdline flags
parse_cmdline_param!(param, DEBUG_CONSOLE_FLAG, config.debug_console);
parse_cmdline_param!(param, DEV_MODE_FLAG, config.dev_mode);
@@ -347,49 +258,6 @@ impl AgentConfig {
config.unified_cgroup_hierarchy,
get_bool_value
);
parse_cmdline_param!(param, AA_KBC_PARAMS, config.aa_kbc_params, get_string_value);
parse_cmdline_param!(param, REST_API_OPTION, config.rest_api, get_string_value);
parse_cmdline_param!(param, HTTPS_PROXY, config.https_proxy, get_url_value);
parse_cmdline_param!(param, NO_PROXY, config.no_proxy, get_string_value);
parse_cmdline_param!(
param,
ENABLE_DATA_INTEGRITY,
config.data_integrity,
get_bool_value
);
parse_cmdline_param!(
param,
ENABLE_SIGNATURE_VERIFICATION,
config.enable_signature_verification,
get_bool_value
);
// URI of the image security file
parse_cmdline_param!(
param,
IMAGE_POLICY_FILE,
config.image_policy_file,
get_string_value
);
// URI of the registry auth file
parse_cmdline_param!(
param,
IMAGE_REGISTRY_AUTH_FILE,
config.image_registry_auth_file,
get_string_value
);
// URI of the simple signing sigstore file
// used when simple signing verification is used
parse_cmdline_param!(
param,
SIMPLE_SIGNING_SIGSTORE_CONFIG,
config.simple_signing_sigstore_config,
get_string_value
);
}
if let Ok(addr) = env::var(SERVER_ADDR_ENV_VAR) {
@@ -408,11 +276,6 @@ impl AgentConfig {
config.tracing = get_bool_value(&name_value)?;
}
// We did not get a configuration file: allow all endpoints.
if !using_config_file {
config.endpoints.all_allowed = true;
}
Ok(config)
}
@@ -422,10 +285,6 @@ impl AgentConfig {
.with_context(|| format!("Failed to read config file {}", file))?;
AgentConfig::from_str(&config)
}
pub fn is_allowed_endpoint(&self, ep: &str) -> bool {
self.endpoints.all_allowed || self.endpoints.allowed.contains(ep)
}
}
#[instrument]
@@ -546,12 +405,6 @@ fn get_container_pipe_size(param: &str) -> Result<i32> {
Ok(value)
}
#[instrument]
fn get_url_value(param: &str) -> Result<String> {
let value = get_string_value(param)?;
Ok(Url::parse(&value)?.to_string())
}
#[cfg(test)]
mod tests {
use test_utils::assert_result;
@@ -570,11 +423,6 @@ mod tests {
assert!(!config.dev_mode);
assert_eq!(config.log_level, DEFAULT_LOG_LEVEL);
assert_eq!(config.hotplug_timeout, DEFAULT_HOTPLUG_TIMEOUT);
assert_eq!(config.container_policy_path, "");
assert!(config.enable_signature_verification);
assert_eq!(config.image_policy_file, "");
assert_eq!(config.image_registry_auth_file, "");
assert_eq!(config.simple_signing_sigstore_config, "");
}
#[test]
@@ -593,16 +441,6 @@ mod tests {
server_addr: &'a str,
unified_cgroup_hierarchy: bool,
tracing: bool,
container_policy_path: &'a str,
aa_kbc_params: &'a str,
rest_api: &'a str,
https_proxy: &'a str,
no_proxy: &'a str,
data_integrity: bool,
enable_signature_verification: bool,
image_policy_file: &'a str,
image_registry_auth_file: &'a str,
simple_signing_sigstore_config: &'a str,
}
impl Default for TestData<'_> {
@@ -618,16 +456,6 @@ mod tests {
server_addr: TEST_SERVER_ADDR,
unified_cgroup_hierarchy: false,
tracing: false,
container_policy_path: "",
aa_kbc_params: "",
rest_api: "",
https_proxy: "",
no_proxy: "",
data_integrity: false,
enable_signature_verification: true,
image_policy_file: "",
image_registry_auth_file: "",
simple_signing_sigstore_config: "",
}
}
}
@@ -997,141 +825,6 @@ mod tests {
tracing: true,
..Default::default()
},
TestData {
contents: "agent.aa_kbc_params=offline_fs_kbc::null",
aa_kbc_params: "offline_fs_kbc::null",
..Default::default()
},
TestData {
contents: "agent.aa_kbc_params=eaa_kbc::127.0.0.1:50000",
aa_kbc_params: "eaa_kbc::127.0.0.1:50000",
..Default::default()
},
TestData {
contents: "agent.rest_api=attestation",
rest_api: "attestation",
..Default::default()
},
TestData {
contents: "agent.rest_api=resource",
rest_api: "resource",
..Default::default()
},
TestData {
contents: "agent.rest_api=all",
rest_api: "all",
..Default::default()
},
TestData {
contents: "agent.https_proxy=http://proxy.url.com:81/",
https_proxy: "http://proxy.url.com:81/",
..Default::default()
},
TestData {
contents: "agent.https_proxy=http://192.168.1.100:81/",
https_proxy: "http://192.168.1.100:81/",
..Default::default()
},
TestData {
contents: "agent.no_proxy=*.internal.url.com",
no_proxy: "*.internal.url.com",
..Default::default()
},
TestData {
contents: "agent.no_proxy=192.168.1.0/24,172.16.0.0/12",
no_proxy: "192.168.1.0/24,172.16.0.0/12",
..Default::default()
},
TestData {
contents: "",
data_integrity: false,
..Default::default()
},
TestData {
contents: "agent.data_integrity=true",
data_integrity: true,
..Default::default()
},
TestData {
contents: "agent.data_integrity=false",
data_integrity: false,
..Default::default()
},
TestData {
contents: "agent.data_integrity=1",
data_integrity: true,
..Default::default()
},
TestData {
contents: "agent.data_integrity=0",
data_integrity: false,
..Default::default()
},
TestData {
contents: "agent.enable_signature_verification=false",
enable_signature_verification: false,
..Default::default()
},
TestData {
contents: "agent.enable_signature_verification=0",
enable_signature_verification: false,
..Default::default()
},
TestData {
contents: "agent.enable_signature_verification=1",
enable_signature_verification: true,
..Default::default()
},
TestData {
contents: "agent.enable_signature_verification=foo",
enable_signature_verification: false,
..Default::default()
},
TestData {
contents: "agent.image_policy=file:///etc/policy.json",
image_policy_file: "file:///etc/policy.json",
..Default::default()
},
TestData {
contents: "agent.image_policy=kbs:///default/security-policy/test",
image_policy_file: "kbs:///default/security-policy/test",
..Default::default()
},
TestData {
contents: "agent.image_policy=kbs://example.kbs.org/default/security-policy/test",
image_policy_file: "kbs://example.kbs.org/default/security-policy/test",
..Default::default()
},
TestData {
contents: "agent.image_registry_auth=file:///etc/auth.json",
image_registry_auth_file: "file:///etc/auth.json",
..Default::default()
},
TestData {
contents: "agent.image_registry_auth=kbs:///default/credential/test",
image_registry_auth_file: "kbs:///default/credential/test",
..Default::default()
},
TestData {
contents: "agent.image_registry_auth=kbs://example.kbs.org/default/credential/test",
image_registry_auth_file: "kbs://example.kbs.org/default/credential/test",
..Default::default()
},
TestData {
contents: "agent.simple_signing_sigstore_config=file:///etc/containers/signature/default.yml",
simple_signing_sigstore_config: "file:///etc/containers/signature/default.yml",
..Default::default()
},
TestData {
contents: "agent.simple_signing_sigstore_config=kbs:///default/sigstore-config/test",
simple_signing_sigstore_config: "kbs:///default/sigstore-config/test",
..Default::default()
},
TestData {
contents: "agent.simple_signing_sigstore_config=kbs://example.kbs.org/default/sigstore-config/test",
simple_signing_sigstore_config: "kbs://example.kbs.org/default/sigstore-config/test",
..Default::default()
},
];
let dir = tempdir().expect("failed to create tmpdir");
@@ -1179,32 +872,6 @@ mod tests {
assert_eq!(d.container_pipe_size, config.container_pipe_size, "{}", msg);
assert_eq!(d.server_addr, config.server_addr, "{}", msg);
assert_eq!(d.tracing, config.tracing, "{}", msg);
assert_eq!(
d.container_policy_path, config.container_policy_path,
"{}",
msg
);
assert_eq!(d.aa_kbc_params, config.aa_kbc_params, "{}", msg);
assert_eq!(d.rest_api, config.rest_api, "{}", msg);
assert_eq!(d.https_proxy, config.https_proxy, "{}", msg);
assert_eq!(d.no_proxy, config.no_proxy, "{}", msg);
assert_eq!(d.data_integrity, config.data_integrity, "{}", msg);
assert_eq!(
d.enable_signature_verification, config.enable_signature_verification,
"{}",
msg
);
assert_eq!(d.image_policy_file, config.image_policy_file, "{}", msg);
assert_eq!(
d.image_registry_auth_file, config.image_registry_auth_file,
"{}",
msg
);
assert_eq!(
d.simple_signing_sigstore_config, config.simple_signing_sigstore_config,
"{}",
msg
);
for v in vars_to_unset {
env::remove_var(v);
@@ -1682,74 +1349,15 @@ Caused by:
r#"
dev_mode = true
server_addr = 'vsock://8:2048'
[endpoints]
allowed = ["CreateContainer", "StartContainer"]
"#,
)
.unwrap();
// Verify that the all_allowed flag is false
assert!(!config.endpoints.all_allowed);
// Verify that the override worked
assert!(config.dev_mode);
assert_eq!(config.server_addr, "vsock://8:2048");
assert_eq!(
config.endpoints.allowed,
["CreateContainer".to_string(), "StartContainer".to_string()]
.iter()
.cloned()
.collect()
);
// Verify that the default values are valid
assert_eq!(config.hotplug_timeout, DEFAULT_HOTPLUG_TIMEOUT);
}
#[test]
fn test_config_from_cmdline_and_config_file() {
let dir = tempdir().expect("failed to create tmpdir");
let agent_config = r#"
dev_mode = false
server_addr = 'vsock://8:2048'
[endpoints]
allowed = ["CreateContainer", "StartContainer"]
"#;
let config_path = dir.path().join("agent-config.toml");
let config_filename = config_path.to_str().expect("failed to get config filename");
fs::write(config_filename, agent_config).expect("failed to write agent config");
let cmdline = format!("agent.devmode agent.config_file={}", config_filename);
let cmdline_path = dir.path().join("cmdline");
let cmdline_filename = cmdline_path
.to_str()
.expect("failed to get cmdline filename");
fs::write(cmdline_filename, cmdline).expect("failed to write cmdline");
let config = AgentConfig::from_cmdline(cmdline_filename, vec![])
.expect("failed to parse command line");
// Should be overwritten by cmdline
assert!(config.dev_mode);
// Should be from agent config
assert_eq!(config.server_addr, "vsock://8:2048");
// Should be from agent config
assert_eq!(
config.endpoints.allowed,
["CreateContainer".to_string(), "StartContainer".to_string()]
.iter()
.cloned()
.collect()
);
assert!(!config.endpoints.all_allowed);
}
}
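
The `test_config_from_cmdline_and_config_file` test above checks a precedence rule: values come from the config file first, and kernel command-line parameters then override them. A minimal sketch of that rule follows. `MiniConfig` and `apply_cmdline` are hypothetical names invented for illustration, and treating `agent.server_addr=` as an override key is an assumption; only `agent.devmode` appears in the test itself. This is not the agent's actual parser.

```rust
/// Hypothetical, trimmed-down stand-in for the agent configuration.
#[derive(Debug, Default, PartialEq)]
pub struct MiniConfig {
    pub dev_mode: bool,
    pub server_addr: String,
}

/// Start from values loaded from a config file, then let kernel
/// command-line parameters override them. Unknown parameters are ignored.
pub fn apply_cmdline(mut from_file: MiniConfig, cmdline: &str) -> MiniConfig {
    for param in cmdline.split_ascii_whitespace() {
        if param == "agent.devmode" {
            // A bare flag on the command line wins over `dev_mode = false`.
            from_file.dev_mode = true;
        } else if let Some(addr) = param.strip_prefix("agent.server_addr=") {
            // Assumed key=value override; illustrative only.
            from_file.server_addr = addr.to_string();
        }
    }
    from_file
}
```

With a file value of `dev_mode = false` and a command line containing `agent.devmode`, the merged result has `dev_mode` set, while `server_addr` keeps the value from the file, which mirrors the assertions in the test.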


@@ -651,15 +651,13 @@ fn update_spec_devices(spec: &mut Spec, mut updates: HashMap<&str, DevUpdate>) -
if let Some(resources) = linux.resources.as_mut() {
for r in &mut resources.devices {
if let (Some(host_type), Some(host_major), Some(host_minor)) =
(r.r#type.as_ref(), r.major, r.minor)
{
if let Some(update) = res_updates.get(&(host_type.as_str(), host_major, host_minor))
if let (Some(host_major), Some(host_minor)) = (r.major, r.minor) {
if let Some(update) = res_updates.get(&(r.r#type.as_str(), host_major, host_minor))
{
info!(
sl(),
"update_spec_devices() updating resource";
"type" => &host_type,
"type" => &r.r#type,
"host_major" => host_major,
"host_minor" => host_minor,
"guest_major" => update.guest_major,
@@ -971,7 +969,7 @@ pub fn update_device_cgroup(spec: &mut Spec) -> Result<()> {
allow: false,
major: Some(major),
minor: Some(minor),
r#type: Some(String::from("b")),
r#type: String::from("b"),
access: String::from("rw"),
});
@@ -1134,13 +1132,13 @@ mod tests {
resources: Some(LinuxResources {
devices: vec![
oci::LinuxDeviceCgroup {
r#type: Some("c".to_string()),
r#type: "c".to_string(),
major: Some(host_major_a),
minor: Some(host_minor_a),
..oci::LinuxDeviceCgroup::default()
},
oci::LinuxDeviceCgroup {
r#type: Some("c".to_string()),
r#type: "c".to_string(),
major: Some(host_major_b),
minor: Some(host_minor_b),
..oci::LinuxDeviceCgroup::default()
@@ -1233,13 +1231,13 @@ mod tests {
resources: Some(LinuxResources {
devices: vec![
LinuxDeviceCgroup {
r#type: Some("c".to_string()),
r#type: "c".to_string(),
major: Some(host_major),
minor: Some(host_minor),
..LinuxDeviceCgroup::default()
},
LinuxDeviceCgroup {
r#type: Some("b".to_string()),
r#type: "b".to_string(),
major: Some(host_major),
minor: Some(host_minor),
..LinuxDeviceCgroup::default()


@@ -1,586 +0,0 @@
// Copyright (c) 2021 Alibaba Cloud
// Copyright (c) 2021, 2023 IBM Corporation
// Copyright (c) 2022 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use std::collections::HashMap;
use std::env;
use std::fs;
use std::path::Path;
use std::sync::atomic::{AtomicU16, Ordering};
use std::sync::Arc;
use anyhow::{anyhow, Context, Result};
use async_trait::async_trait;
use image_rs::image::ImageClient;
use protocols::image;
use tokio::sync::Mutex;
use ttrpc::{self, error::get_rpc_status as ttrpc_error};
use crate::rpc::{verify_cid, CONTAINER_BASE};
use crate::AGENT_CONFIG;
// A marker to merge container spec for images pulled inside guest.
const ANNO_K8S_IMAGE_NAME: &str = "io.kubernetes.cri.image-name";
// kata rootfs is readonly, use tmpfs before CC storage is implemented.
const KATA_CC_IMAGE_WORK_DIR: &str = "/run/image/";
const KATA_CC_PAUSE_BUNDLE: &str = "/pause_bundle";
const CONFIG_JSON: &str = "config.json";
#[rustfmt::skip]
lazy_static! {
pub static ref IMAGE_SERVICE: Mutex<Option<ImageService>> = Mutex::new(None);
}
// Convenience function to obtain the scope logger.
fn sl() -> slog::Logger {
slog_scope::logger().new(o!("subsystem" => "cgroups"))
}
#[derive(Clone)]
pub struct ImageService {
image_client: Arc<Mutex<ImageClient>>,
images: Arc<Mutex<HashMap<String, String>>>,
container_count: Arc<AtomicU16>,
}
impl ImageService {
pub fn new() -> Self {
env::set_var("CC_IMAGE_WORK_DIR", KATA_CC_IMAGE_WORK_DIR);
let mut image_client = ImageClient::default();
if !AGENT_CONFIG.image_policy_file.is_empty() {
image_client.config.file_paths.policy_path = AGENT_CONFIG.image_policy_file.clone();
}
if !AGENT_CONFIG.simple_signing_sigstore_config.is_empty() {
image_client.config.file_paths.sigstore_config =
AGENT_CONFIG.simple_signing_sigstore_config.clone();
}
if !AGENT_CONFIG.image_registry_auth_file.is_empty() {
image_client.config.file_paths.auth_file =
AGENT_CONFIG.image_registry_auth_file.clone();
}
Self {
image_client: Arc::new(Mutex::new(image_client)),
images: Arc::new(Mutex::new(HashMap::new())),
container_count: Arc::new(AtomicU16::new(0)),
}
}
/// Get the singleton instance of image service.
pub async fn singleton() -> Result<ImageService> {
IMAGE_SERVICE
.lock()
.await
.clone()
.ok_or_else(|| anyhow!("image service is uninitialized"))
}
// pause image is packaged in rootfs for CC
fn unpack_pause_image(cid: &str, target_subpath: &str) -> Result<String> {
let cc_pause_bundle = Path::new(KATA_CC_PAUSE_BUNDLE);
if !cc_pause_bundle.exists() {
return Err(anyhow!("Pause image not present in rootfs"));
}
info!(sl(), "use guest pause image cid {:?}", cid);
let pause_bundle = Path::new(CONTAINER_BASE).join(cid).join(target_subpath);
let pause_rootfs = pause_bundle.join("rootfs");
let pause_config = pause_bundle.join(CONFIG_JSON);
let pause_binary = pause_rootfs.join("pause");
fs::create_dir_all(&pause_rootfs)?;
if !pause_config.exists() {
fs::copy(
cc_pause_bundle.join(CONFIG_JSON),
pause_bundle.join(CONFIG_JSON),
)?;
}
if !pause_binary.exists() {
fs::copy(cc_pause_bundle.join("rootfs").join("pause"), pause_binary)?;
}
Ok(pause_rootfs.display().to_string())
}
/// Determines the container id (cid) to use for a given request.
///
/// If the request specifies a non-empty id, use it; otherwise derive it from the image path.
/// In either case, verify that the chosen id is valid.
fn cid_from_request(&self, req: &image::PullImageRequest) -> Result<String> {
let req_cid = req.container_id();
let cid = if !req_cid.is_empty() {
req_cid.to_string()
} else if let Some(last) = req.image().rsplit('/').next() {
// Support multiple containers with same image
let index = self.container_count.fetch_add(1, Ordering::Relaxed);
// ':' not valid for container id
format!("{}_{}", last.replace(':', "_"), index)
} else {
return Err(anyhow!("Invalid image name. {}", req.image()));
};
verify_cid(&cid)?;
Ok(cid)
}
/// Set proxy environment from AGENT_CONFIG
fn set_proxy_env_vars() {
let https_proxy = &AGENT_CONFIG.https_proxy;
if !https_proxy.is_empty() {
env::set_var("HTTPS_PROXY", https_proxy);
}
let no_proxy = &AGENT_CONFIG.no_proxy;
if !no_proxy.is_empty() {
env::set_var("NO_PROXY", no_proxy);
}
}
/// Init attestation agent and read config from AGENT_CONFIG
async fn get_security_config(&self) -> Result<String> {
let aa_kbc_params = &AGENT_CONFIG.aa_kbc_params;
// If the attestation-agent is being used, then enable the authenticated credentials support
info!(
sl(),
"image_client.config.auth set to: {}",
!aa_kbc_params.is_empty()
);
self.image_client.lock().await.config.auth = !aa_kbc_params.is_empty();
let decrypt_config = format!("provider:attestation-agent:{}", aa_kbc_params);
// Read enable signature verification from the agent config and set it in the image_client
let enable_signature_verification = &AGENT_CONFIG.enable_signature_verification;
info!(
sl(),
"enable_signature_verification set to: {}", enable_signature_verification
);
self.image_client.lock().await.config.security_validate = *enable_signature_verification;
Ok(decrypt_config)
}
/// Call image-rs to pull and unpack image.
async fn common_image_pull(
&self,
image: &str,
bundle_path: &Path,
decrypt_config: &str,
source_creds: Option<&str>,
cid: &str,
) -> Result<()> {
let res = self
.image_client
.lock()
.await
.pull_image(image, bundle_path, &source_creds, &Some(decrypt_config))
.await;
match res {
Ok(image) => {
info!(
sl(),
"pull and unpack image {:?}, cid: {:?}, with image-rs succeed. ", image, cid
);
}
Err(e) => {
error!(
sl(),
"pull and unpack image {:?}, cid: {:?}, with image-rs failed with {:?}. ",
image,
cid,
e.to_string()
);
return Err(e);
}
};
self.add_image(String::from(image), String::from(cid)).await;
Ok(())
}
/// Pull image when creating container and return the bundle path with rootfs.
pub async fn pull_image_for_container(
&self,
image: &str,
cid: &str,
image_metadata: &HashMap<String, String>,
) -> Result<String> {
info!(sl(), "image metadata: {:?}", image_metadata);
Self::set_proxy_env_vars();
let is_sandbox = if let Some(value) = image_metadata.get("io.kubernetes.cri.container-type")
{
value == "sandbox"
} else if let Some(value) = image_metadata.get("io.kubernetes.cri-o.ContainerType") {
value == "sandbox"
} else {
false
};
if is_sandbox {
let mount_path = Self::unpack_pause_image(cid, "pause")?;
self.add_image(String::from(image), String::from(cid)).await;
return Ok(mount_path);
}
let bundle_path = Path::new(CONTAINER_BASE).join(cid).join("images");
fs::create_dir_all(&bundle_path)?;
info!(sl(), "pull image {:?}, bundle path {:?}", cid, bundle_path);
let decrypt_config = self.get_security_config().await?;
let source_creds = None; // You need to determine how to obtain this.
self.common_image_pull(image, &bundle_path, &decrypt_config, source_creds, cid)
.await?;
Ok(format! {"{}/rootfs",bundle_path.display()})
}
/// Pull image when receiving the PullImageRequest and return the image digest.
async fn pull_image(&self, req: &image::PullImageRequest) -> Result<String> {
Self::set_proxy_env_vars();
let cid = self.cid_from_request(req)?;
let image = req.image();
if cid.starts_with("pause") {
Self::unpack_pause_image(&cid, "")?;
self.add_image(String::from(image), cid).await;
return Ok(image.to_owned());
}
// Image layers will store at KATA_CC_IMAGE_WORK_DIR, generated bundles
// with rootfs and config.json will store under CONTAINER_BASE/cid.
let bundle_path = Path::new(CONTAINER_BASE).join(&cid);
fs::create_dir_all(&bundle_path)?;
let decrypt_config = self.get_security_config().await?;
let source_creds = (!req.source_creds().is_empty()).then(|| req.source_creds());
self.common_image_pull(
image,
&bundle_path,
&decrypt_config,
source_creds,
cid.clone().as_str(),
)
.await?;
Ok(image.to_owned())
}
async fn add_image(&self, image: String, cid: String) {
self.images.lock().await.insert(image, cid);
}
// When being passed an image name through a container annotation, merge its
// corresponding bundle OCI specification into the passed container creation one.
pub async fn merge_bundle_oci(&self, container_oci: &mut oci::Spec) -> Result<()> {
if let Some(image_name) = container_oci
.annotations
.get(&ANNO_K8S_IMAGE_NAME.to_string())
{
let images = self.images.lock().await;
if let Some(container_id) = images.get(image_name) {
let image_oci_config_path = Path::new(CONTAINER_BASE)
.join(container_id)
.join(CONFIG_JSON);
debug!(
sl(),
"Image bundle config path: {:?}", image_oci_config_path
);
let image_oci =
oci::Spec::load(image_oci_config_path.to_str().ok_or_else(|| {
anyhow!(
"Invalid container image OCI config path {:?}",
image_oci_config_path
)
})?)
.context("load image bundle")?;
if let Some(container_root) = container_oci.root.as_mut() {
if let Some(image_root) = image_oci.root.as_ref() {
let root_path = Path::new(CONTAINER_BASE)
.join(container_id)
.join(image_root.path.clone());
container_root.path =
String::from(root_path.to_str().ok_or_else(|| {
anyhow!("Invalid container image root path {:?}", root_path)
})?);
}
}
if let Some(container_process) = container_oci.process.as_mut() {
if let Some(image_process) = image_oci.process.as_ref() {
self.merge_oci_process(container_process, image_process);
}
}
}
}
Ok(())
}
// Partially merge an OCI process specification into another one.
fn merge_oci_process(&self, target: &mut oci::Process, source: &oci::Process) {
if target.args.is_empty() && !source.args.is_empty() {
target.args.append(&mut source.args.clone());
}
if target.cwd == "/" && source.cwd != "/" {
target.cwd = String::from(&source.cwd);
}
for source_env in &source.env {
let variable_name: Vec<&str> = source_env.split('=').collect();
if !target.env.iter().any(|i| i.contains(variable_name[0])) {
target.env.push(source_env.to_string());
}
}
}
}
#[async_trait]
impl protocols::image_ttrpc_async::Image for ImageService {
async fn pull_image(
&self,
_ctx: &ttrpc::r#async::TtrpcContext,
req: image::PullImageRequest,
) -> ttrpc::Result<image::PullImageResponse> {
match self.pull_image(&req).await {
Ok(r) => {
let mut resp = image::PullImageResponse::new();
resp.image_ref = r;
return Ok(resp);
}
Err(e) => {
return Err(ttrpc_error(ttrpc::Code::INTERNAL, e.to_string()));
}
}
}
}
#[cfg(test)]
mod tests {
use super::ImageService;
use protocols::image;
#[tokio::test]
async fn test_cid_from_request() {
struct Case {
cid: &'static str,
image: &'static str,
result: Option<&'static str>,
}
let cases = [
Case {
cid: "",
image: "",
result: None,
},
Case {
cid: "..",
image: "",
result: None,
},
Case {
cid: "",
image: "..",
result: None,
},
Case {
cid: "",
image: "abc/..",
result: None,
},
Case {
cid: "",
image: "abc/",
result: None,
},
Case {
cid: "",
image: "../abc",
result: Some("abc_4"),
},
Case {
cid: "",
image: "../9abc",
result: Some("9abc_5"),
},
Case {
cid: "some-string.1_2",
image: "",
result: Some("some-string.1_2"),
},
Case {
cid: "0some-string.1_2",
image: "",
result: Some("0some-string.1_2"),
},
Case {
cid: "a:b",
image: "",
result: None,
},
Case {
cid: "",
image: "prefix/a:b",
result: Some("a_b_6"),
},
Case {
cid: "",
image: "/a/b/c/d:e",
result: Some("d_e_7"),
},
];
let image_service = ImageService::new();
for case in &cases {
let mut req = image::PullImageRequest::new();
req.set_image(case.image.to_string());
req.set_container_id(case.cid.to_string());
let ret = image_service.cid_from_request(&req);
match (case.result, ret) {
(Some(expected), Ok(actual)) => assert_eq!(expected, actual),
(None, Err(_)) => (),
(None, Ok(r)) => panic!("Expected an error, got {}", r),
(Some(expected), Err(e)) => {
panic!("Expected {} but got an error ({})", expected, e)
}
}
}
}
#[tokio::test]
async fn test_merge_cwd() {
#[derive(Debug)]
struct TestData<'a> {
container_process_cwd: &'a str,
image_process_cwd: &'a str,
expected: &'a str,
}
let tests = &[
// Image cwd should override blank container cwd
// TODO - how can we tell the user didn't specifically set it to `/` vs not setting at all? Is that scenario valid?
TestData {
container_process_cwd: "/",
image_process_cwd: "/imageDir",
expected: "/imageDir",
},
// Container cwd should override image cwd
TestData {
container_process_cwd: "/containerDir",
image_process_cwd: "/imageDir",
expected: "/containerDir",
},
// Container cwd should override blank image cwd
TestData {
container_process_cwd: "/containerDir",
image_process_cwd: "/",
expected: "/containerDir",
},
];
let image_service = ImageService::new();
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let mut container_process = oci::Process {
cwd: d.container_process_cwd.to_string(),
..Default::default()
};
let image_process = oci::Process {
cwd: d.image_process_cwd.to_string(),
..Default::default()
};
image_service.merge_oci_process(&mut container_process, &image_process);
assert_eq!(d.expected, container_process.cwd, "{}", msg);
}
}
#[tokio::test]
async fn test_merge_env() {
#[derive(Debug)]
struct TestData {
container_process_env: Vec<String>,
image_process_env: Vec<String>,
expected: Vec<String>,
}
let tests = &[
// Test that the pod's environment overrides the image's
TestData {
container_process_env: vec!["ISPRODUCTION=true".to_string()],
image_process_env: vec!["ISPRODUCTION=false".to_string()],
expected: vec!["ISPRODUCTION=true".to_string()],
},
// Test that multiple environment variables can be overridden
TestData {
container_process_env: vec![
"ISPRODUCTION=true".to_string(),
"ISDEVELOPMENT=false".to_string(),
],
image_process_env: vec![
"ISPRODUCTION=false".to_string(),
"ISDEVELOPMENT=true".to_string(),
],
expected: vec![
"ISPRODUCTION=true".to_string(),
"ISDEVELOPMENT=false".to_string(),
],
},
// Test that variables are not overridden when none of them match
TestData {
container_process_env: vec!["ANOTHERENV=TEST".to_string()],
image_process_env: vec![
"ISPRODUCTION=false".to_string(),
"ISDEVELOPMENT=true".to_string(),
],
expected: vec![
"ANOTHERENV=TEST".to_string(),
"ISPRODUCTION=false".to_string(),
"ISDEVELOPMENT=true".to_string(),
],
},
// Test a mix of both overriding and not
TestData {
container_process_env: vec![
"ANOTHERENV=TEST".to_string(),
"ISPRODUCTION=true".to_string(),
],
image_process_env: vec![
"ISPRODUCTION=false".to_string(),
"ISDEVELOPMENT=true".to_string(),
],
expected: vec![
"ANOTHERENV=TEST".to_string(),
"ISPRODUCTION=true".to_string(),
"ISDEVELOPMENT=true".to_string(),
],
},
];
let image_service = ImageService::new();
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let mut container_process = oci::Process {
env: d.container_process_env.clone(),
..Default::default()
};
let image_process = oci::Process {
env: d.image_process_env.clone(),
..Default::default()
};
image_service.merge_oci_process(&mut container_process, &image_process);
assert_eq!(d.expected, container_process.env, "{}", msg);
}
}
}
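
The environment merge in `merge_oci_process` above keeps every container variable and appends only those image variables that the container does not already define. It decides this with `i.contains(variable_name[0])`, which is a substring test on the whole `NAME=value` entry. A minimal sketch of the same merge with an exact comparison on the variable name follows. `env_key` and `merge_env` are hypothetical helpers written for illustration; they are not part of the agent.

```rust
/// Return the variable name of a `NAME=value` entry (hypothetical helper).
fn env_key(entry: &str) -> &str {
    entry.split('=').next().unwrap_or(entry)
}

/// Append each image (source) variable to the container (target) list
/// unless the container already defines a variable with the same name.
pub fn merge_env(target: &mut Vec<String>, source: &[String]) {
    for source_env in source {
        let key = env_key(source_env);
        // Compare names exactly instead of searching for a substring.
        if !target.iter().any(|t| env_key(t) == key) {
            target.push(source_env.clone());
        }
    }
}
```

For the inputs used in `test_merge_env` this sketch gives the same results as the original. The two differ when one name is contained in another entry: with a container entry `ISPRODUCTION=true` and an image entry `PROD=1`, the substring test treats `PROD` as already present, while the exact comparison appends it.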


@@ -22,7 +22,6 @@ extern crate slog;
use anyhow::{anyhow, Context, Result};
use cfg_if::cfg_if;
use clap::{AppSettings, Parser};
use const_format::concatcp;
use nix::fcntl::OFlag;
use nix::sys::socket::{self, AddressFamily, SockFlag, SockType, VsockAddr};
use nix::unistd::{self, dup, Pid};
@@ -33,12 +32,9 @@ use std::os::unix::fs as unixfs;
use std::os::unix::io::AsRawFd;
use std::path::Path;
use std::process::exit;
use std::process::Command;
use std::sync::Arc;
use tracing::{instrument, span};
#[cfg(feature = "confidential-data-hub")]
mod cdh;
mod config;
mod console;
mod device;
@@ -75,10 +71,12 @@ use tokio::{
task::JoinHandle,
};
mod image_rpc;
mod rpc;
mod tracer;
#[cfg(feature = "agent-policy")]
mod policy;
cfg_if! {
if #[cfg(target_arch = "s390x")] {
mod ap;
@@ -88,27 +86,6 @@ cfg_if! {
const NAME: &str = "kata-agent";
const OCICRYPT_CONFIG_PATH: &str = "/tmp/ocicrypt_config.json";
const AA_PATH: &str = "/usr/local/bin/attestation-agent";
const AA_UNIX_SOCKET_DIR: &str = "/run/confidential-containers/attestation-agent/";
const UNIX_SOCKET_PREFIX: &str = "unix://";
const AA_KEYPROVIDER_URI: &str =
concatcp!(UNIX_SOCKET_PREFIX, AA_UNIX_SOCKET_DIR, "keyprovider.sock");
const AA_GETRESOURCE_URI: &str =
concatcp!(UNIX_SOCKET_PREFIX, AA_UNIX_SOCKET_DIR, "getresource.sock");
const AA_ATTESTATION_SOCKET: &str = concatcp!(AA_UNIX_SOCKET_DIR, "attestation-agent.sock");
const AA_ATTESTATION_URI: &str = concatcp!(UNIX_SOCKET_PREFIX, AA_ATTESTATION_SOCKET);
const DEFAULT_LAUNCH_PROCESS_TIMEOUT: i32 = 6;
cfg_if! {
if #[cfg(feature = "confidential-data-hub")] {
const CDH_PATH: &str = "/usr/local/bin/confidential-data-hub";
const CDH_SOCKET: &str = "/run/confidential-containers/cdh.sock";
const API_SERVER_PATH: &str = "/usr/local/bin/api-server-rest";
}
}
lazy_static! {
static ref AGENT_CONFIG: AgentConfig =
// Note: We can't do AgentOpts.parse() here to send through the processed arguments to AgentConfig
@@ -117,6 +94,11 @@ lazy_static! {
AgentConfig::from_cmdline("/proc/cmdline", env::args().collect()).unwrap();
}
#[cfg(feature = "agent-policy")]
lazy_static! {
static ref AGENT_POLICY: Mutex<policy::AgentPolicy> = Mutex::new(AgentPolicy::new());
}
#[derive(Parser)]
// The default clap version info doesn't match our form, so we need to override it
#[clap(global_setting(AppSettings::DisableVersionFlag))]
@@ -353,6 +335,19 @@ async fn start_sandbox(
s.rtnl.handle_localhost().await?;
}
// - When init_mode is true, enabling the localhost link during the
// handle_localhost call above is required before starting OPA with the
// initialize_policy call below.
// - When init_mode is false, the Policy could be initialized earlier,
// because initialize_policy doesn't start OPA. OPA is started by
// systemd after localhost has been enabled.
#[cfg(feature = "agent-policy")]
if let Err(e) = initialize_policy(init_mode).await {
error!(logger, "Failed to initialize agent policy: {:?}", e);
// Continuing execution without a security policy could be dangerous.
std::process::abort();
}
let sandbox = Arc::new(Mutex::new(s));
let signal_handler_task = tokio::spawn(setup_signal_handler(
@@ -370,12 +365,8 @@ async fn start_sandbox(
let (tx, rx) = tokio::sync::oneshot::channel();
sandbox.lock().await.sender = Some(tx);
if !config.aa_kbc_params.is_empty() {
init_attestation_agent(logger, config)?;
}
// vsock:///dev/vsock, port
let mut server = rpc::start(sandbox.clone(), config.server_addr.as_str(), init_mode).await?;
let mut server = rpc::start(sandbox.clone(), config.server_addr.as_str(), init_mode)?;
server.start().await?;
rx.await?;
@@ -384,110 +375,6 @@ async fn start_sandbox(
Ok(())
}
// If we fail to start the AA, ocicrypt won't be able to unwrap keys
// and container decryption will fail.
fn init_attestation_agent(logger: &Logger, _config: &AgentConfig) -> Result<()> {
let config_path = OCICRYPT_CONFIG_PATH;
// The image will need to be encrypted using a keyprovider
// that has the same name (at least according to the config).
let ocicrypt_config = serde_json::json!({
"key-providers": {
"attestation-agent":{
"ttrpc":AA_KEYPROVIDER_URI
}
}
});
fs::write(config_path, ocicrypt_config.to_string().as_bytes())?;
env::set_var("OCICRYPT_KEYPROVIDER_CONFIG", config_path);
// The Attestation Agent will run for the duration of the guest.
launch_process(
logger,
AA_PATH,
&vec![
"--keyprovider_sock",
AA_KEYPROVIDER_URI,
"--getresource_sock",
AA_GETRESOURCE_URI,
"--attestation_sock",
AA_ATTESTATION_URI,
],
AA_ATTESTATION_SOCKET,
DEFAULT_LAUNCH_PROCESS_TIMEOUT,
)
.map_err(|e| anyhow!("launch_process {} failed: {:?}", AA_PATH, e))?;
#[cfg(feature = "confidential-data-hub")]
{
if let Err(e) = launch_process(
logger,
CDH_PATH,
&vec![],
CDH_SOCKET,
DEFAULT_LAUNCH_PROCESS_TIMEOUT,
) {
error!(logger, "launch_process {} failed: {:?}", CDH_PATH, e);
} else if !_config.rest_api.is_empty() {
if let Err(e) = launch_process(
logger,
API_SERVER_PATH,
&vec!["--features", &_config.rest_api],
"",
0,
) {
error!(logger, "launch_process {} failed: {:?}", API_SERVER_PATH, e);
}
}
}
Ok(())
}
fn wait_for_path_to_exist(logger: &Logger, path: &str, timeout_secs: i32) -> Result<()> {
let p = Path::new(path);
let mut attempts = 0;
loop {
std::thread::sleep(std::time::Duration::from_secs(1));
if p.exists() {
return Ok(());
}
if attempts >= timeout_secs {
break;
}
attempts += 1;
info!(
logger,
"waiting for {} to exist (attempts={})", path, attempts
);
}
Err(anyhow!("wait for {} to exist timeout.", path))
}
fn launch_process(
logger: &Logger,
path: &str,
args: &Vec<&str>,
unix_socket_path: &str,
timeout_secs: i32,
) -> Result<()> {
if !Path::new(path).exists() {
return Err(anyhow!("path {} does not exist.", path));
}
if !unix_socket_path.is_empty() && Path::new(unix_socket_path).exists() {
fs::remove_file(unix_socket_path)?;
}
Command::new(path).args(args).spawn()?;
if !unix_socket_path.is_empty() && timeout_secs > 0 {
wait_for_path_to_exist(logger, unix_socket_path, timeout_secs)?;
}
Ok(())
}
// init_agent_as_init will do the initializations such as setting up the rootfs
// when this agent has been run as the init process.
fn init_agent_as_init(logger: &Logger, unified_cgroup_hierarchy: bool) -> Result<()> {
@@ -522,6 +409,18 @@ fn init_agent_as_init(logger: &Logger, unified_cgroup_hierarchy: bool) -> Result
Ok(())
}
#[cfg(feature = "agent-policy")]
async fn initialize_policy(init_mode: bool) -> Result<()> {
let opa_addr = "localhost:8181";
let agent_policy_path = "/agent_policy";
let default_agent_policy = "/etc/kata-opa/default-policy.rego";
AGENT_POLICY
.lock()
.await
.initialize(init_mode, opa_addr, agent_policy_path, default_agent_policy)
.await
}
// The Rust standard library had suppressed the default SIGPIPE behavior,
// see https://github.com/rust-lang/rust/pull/13158.
// Since the parent's signal handler would be inherited by its child process,
@@ -536,6 +435,9 @@ fn reset_sigpipe() {
use crate::config::AgentConfig;
use std::os::unix::io::{FromRawFd, RawFd};
#[cfg(feature = "agent-policy")]
use crate::policy::AgentPolicy;
#[cfg(test)]
mod tests {
use super::*;
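
The `wait_for_path_to_exist` helper removed in the hunk above polls for a socket path once per second and gives up after a fixed number of attempts. A minimal sketch of the same idea using a deadline instead of an attempt counter follows. `wait_for_path`, its `poll` parameter, and the `String` error type are assumptions made to keep the example dependency-free; the original returns an `anyhow::Result` and logs each attempt.

```rust
use std::path::Path;
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Poll until `path` exists or `timeout` has elapsed.
/// The path is checked before the first sleep, so an existing path
/// returns immediately.
pub fn wait_for_path(path: &Path, timeout: Duration, poll: Duration) -> Result<(), String> {
    let deadline = Instant::now() + timeout;
    loop {
        if path.exists() {
            return Ok(());
        }
        if Instant::now() >= deadline {
            return Err(format!(
                "timed out waiting for {} to exist",
                path.display()
            ));
        }
        sleep(poll);
    }
}
```

Unlike the removed helper, this variant does not sleep before the first check, and it blocks the calling thread in the same way, so it is only suitable where blocking is acceptable.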

267
src/agent/src/policy.rs Normal file

@@ -0,0 +1,267 @@
// Copyright (c) 2023 Microsoft Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::{bail, Result};
use serde::{Deserialize, Serialize};
use slog::Drain;
use tokio::io::AsyncWriteExt;
use tokio::time::{sleep, Duration};
static EMPTY_JSON_INPUT: &str = "{\"input\":{}}";
static OPA_DATA_PATH: &str = "/data";
static OPA_POLICIES_PATH: &str = "/policies";
static POLICY_LOG_FILE: &str = "/tmp/policy.txt";
/// Convenience macro to obtain the scope logger
macro_rules! sl {
() => {
slog_scope::logger()
};
}
/// Example of HTTP response from OPA: {"result":true}
#[derive(Debug, Serialize, Deserialize)]
struct AllowResponse {
result: bool,
}
/// Singleton policy object.
#[derive(Debug, Default)]
pub struct AgentPolicy {
/// When true policy errors are ignored, for debug purposes.
allow_failures: bool,
/// OPA path used to query if an Agent gRPC request should be allowed.
/// The request name (e.g., CreateContainerRequest) must be added to
/// this path.
query_path: String,
/// OPA path used to add or delete a rego format Policy.
policy_path: String,
/// Client used to connect a single time to the OPA service and reused
/// for all the future communication with OPA.
opa_client: Option<reqwest::Client>,
/// "/tmp/policy.txt" log file for policy activity.
log_file: Option<tokio::fs::File>,
}
impl AgentPolicy {
/// Create AgentPolicy object.
pub fn new() -> Self {
Self {
allow_failures: false,
..Default::default()
}
}
/// Wait for OPA to start and connect to it.
pub async fn initialize(
&mut self,
launch_opa: bool,
opa_addr: &str,
policy_name: &str,
default_policy: &str,
) -> Result<()> {
if sl!().is_enabled(slog::Level::Debug) {
self.log_file = Some(
tokio::fs::OpenOptions::new()
.write(true)
.truncate(true)
.create(true)
.open(POLICY_LOG_FILE)
.await?,
);
debug!(sl!(), "policy: log file: {}", POLICY_LOG_FILE);
}
if launch_opa {
start_opa(opa_addr)?;
}
let opa_uri = format!("http://{opa_addr}/v1");
self.query_path = format!("{opa_uri}{OPA_DATA_PATH}{policy_name}/");
self.policy_path = format!("{opa_uri}{OPA_POLICIES_PATH}{policy_name}");
let opa_client = reqwest::Client::builder().http1_only().build()?;
let policy = tokio::fs::read_to_string(default_policy).await?;
// This loop is necessary to get the opa_client connected to the
// OPA service while that service is starting. Future requests to
// OPA are expected to work without retrying, after connecting
// successfully for the first time.
for i in 0..50 {
if i > 0 {
sleep(Duration::from_millis(100)).await;
debug!(sl!(), "policy initialize: PUT failed, retrying");
}
// Set-up the default policy.
if opa_client
.put(&self.policy_path)
.body(policy.clone())
.send()
.await
.is_ok()
{
self.opa_client = Some(opa_client);
// Check if requests causing policy errors should actually
// be allowed. That is an insecure configuration but is
// useful for allowing insecure pods to start, then connect to
// them and inspect Guest logs for the root cause of a failure.
//
// Note that post_query returns Ok(false) in case
// AllowRequestsFailingPolicy was not defined in the policy.
self.allow_failures = self
.post_query("AllowRequestsFailingPolicy", EMPTY_JSON_INPUT)
.await?;
return Ok(());
}
}
bail!("Failed to connect to OPA")
}
/// Ask OPA to check if an API call should be allowed or not.
pub async fn is_allowed_endpoint(&mut self, ep: &str, request: &str) -> bool {
let post_input = format!("{{\"input\":{request}}}");
self.log_opa_input(ep, &post_input).await;
match self.post_query(ep, &post_input).await {
Err(e) => {
debug!(
sl!(),
"policy: failed to query endpoint {}: {:?}. Returning false.", ep, e
);
false
}
Ok(allowed) => allowed,
}
}
/// Replace the Policy in OPA.
pub async fn set_policy(&mut self, policy: &str) -> Result<()> {
if let Some(opa_client) = &mut self.opa_client {
// Delete the old rules.
opa_client.delete(&self.policy_path).send().await?;
// Put the new rules.
opa_client
.put(&self.policy_path)
.body(policy.to_string())
.send()
.await?;
// Check if requests causing policy errors should actually be allowed.
// That is an insecure configuration but is useful for allowing insecure
// pods to start, then connect to them and inspect Guest logs for the
// root cause of a failure.
//
// Note that post_query returns Ok(false) in case
// AllowRequestsFailingPolicy was not defined in the policy.
self.allow_failures = self
.post_query("AllowRequestsFailingPolicy", EMPTY_JSON_INPUT)
.await?;
Ok(())
} else {
bail!("Agent Policy is not initialized")
}
}
// Post query to OPA.
async fn post_query(&mut self, ep: &str, post_input: &str) -> Result<bool> {
debug!(sl!(), "policy check: {ep}");
if let Some(opa_client) = &mut self.opa_client {
let uri = format!("{}{ep}", &self.query_path);
let response = opa_client
.post(uri)
.body(post_input.to_string())
.send()
.await?;
if response.status() != http::StatusCode::OK {
bail!("policy: POST {} response status {}", ep, response.status());
}
let http_response = response.text().await?;
let opa_response: serde_json::Result<AllowResponse> =
serde_json::from_str(&http_response);
match opa_response {
Ok(resp) => {
if !resp.result {
if self.allow_failures {
warn!(
sl!(),
"policy: POST {} response <{}>. Ignoring error!", ep, http_response
);
return Ok(true);
} else {
error!(sl!(), "policy: POST {} response <{}>", ep, http_response);
}
}
Ok(resp.result)
}
Err(_) => {
warn!(
sl!(),
"policy: endpoint {} not found in policy. Returning false.", ep,
);
Ok(false)
}
}
} else {
bail!("Agent Policy is not initialized")
}
}
async fn log_opa_input(&mut self, ep: &str, input: &str) {
if let Some(log_file) = &mut self.log_file {
match ep {
"StatsContainerRequest" | "ReadStreamRequest" | "SetPolicyRequest" => {
// - StatsContainerRequest and ReadStreamRequest are called
// relatively often, so we're not logging them, to avoid
// growing this log file too much.
// - Confidential Containers Policy documents are relatively
// large, so we're not logging them here, for SetPolicyRequest.
// The Policy text can be obtained directly from the pod YAML.
}
_ => {
let log_entry = format!("[\"ep\":\"{ep}\",{input}],\n\n");
if let Err(e) = log_file.write_all(log_entry.as_bytes()).await {
warn!(sl!(), "policy: log_opa_input: write_all failed: {}", e);
} else if let Err(e) = log_file.flush().await {
warn!(sl!(), "policy: log_opa_input: flush failed: {}", e);
}
}
}
}
}
}
fn start_opa(opa_addr: &str) -> Result<()> {
let bin_dirs = vec!["/bin", "/usr/bin", "/usr/local/bin"];
for bin_dir in &bin_dirs {
let opa_path = bin_dir.to_string() + "/opa";
if std::fs::metadata(&opa_path).is_ok() {
// args copied from kata-opa.service.in.
std::process::Command::new(&opa_path)
.arg("run")
.arg("--server")
.arg("--disable-telemetry")
.arg("--addr")
.arg(opa_addr)
.arg("--log-level")
.arg("info")
.spawn()?;
return Ok(());
}
}
bail!("OPA binary not found in {:?}", &bin_dirs);
}
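
`AgentPolicy::initialize` above derives two URLs from the OPA address and the policy name, and `is_allowed_endpoint` wraps each request in an `input` object before posting it. A minimal sketch of just that string handling follows, with no HTTP client involved. `opa_paths` and `wrap_input` are hypothetical function names; the constants and format strings are copied from the file above.

```rust
const OPA_DATA_PATH: &str = "/data";
const OPA_POLICIES_PATH: &str = "/policies";

/// Build the (query_path, policy_path) pair the same way `initialize` does.
/// `policy_name` is expected to start with '/', e.g. "/agent_policy".
pub fn opa_paths(opa_addr: &str, policy_name: &str) -> (String, String) {
    let opa_uri = format!("http://{opa_addr}/v1");
    let query_path = format!("{opa_uri}{OPA_DATA_PATH}{policy_name}/");
    let policy_path = format!("{opa_uri}{OPA_POLICIES_PATH}{policy_name}");
    (query_path, policy_path)
}

/// Wrap a serialized request in the `{"input": ...}` envelope that is
/// posted to OPA.
pub fn wrap_input(request_json: &str) -> String {
    format!("{{\"input\":{request_json}}}")
}
```

With the values passed by `initialize_policy` in `main.rs` (`localhost:8181` and `/agent_policy`), the query path ends in `/v1/data/agent_policy/`, and the endpoint name such as `CreateContainerRequest` is appended to it for each query. Calling `wrap_input("{}")` reproduces the `EMPTY_JSON_INPUT` constant.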

File diff suppressed because it is too large


@@ -27,29 +27,19 @@ use crate::{ccw, device::get_virtio_blk_ccw_device_name};
#[derive(Debug)]
pub struct VirtioBlkMmioHandler {}
impl VirtioBlkMmioHandler {
pub async fn update_device_path(
storage: &mut Storage,
ctx: &mut StorageContext<'_>,
) -> Result<()> {
if !Path::new(&storage.source).exists() {
get_virtio_mmio_device_name(ctx.sandbox, &storage.source)
.await
.context("failed to get mmio device name")?;
}
Ok(())
}
}
#[async_trait::async_trait]
impl StorageHandler for VirtioBlkMmioHandler {
#[instrument]
async fn create_device(
&self,
mut storage: Storage,
storage: Storage,
ctx: &mut StorageContext,
) -> Result<Arc<dyn StorageDevice>> {
Self::update_device_path(&mut storage, ctx).await?;
if !Path::new(&storage.source).exists() {
get_virtio_mmio_device_name(ctx.sandbox, &storage.source)
.await
.context("failed to get mmio device name")?;
}
let path = common_storage_handler(ctx.logger, &storage)?;
new_device(path)
}
@@ -58,11 +48,14 @@ impl StorageHandler for VirtioBlkMmioHandler {
#[derive(Debug)]
pub struct VirtioBlkPciHandler {}
impl VirtioBlkPciHandler {
pub async fn update_device_path(
storage: &mut Storage,
ctx: &mut StorageContext<'_>,
) -> Result<()> {
#[async_trait::async_trait]
impl StorageHandler for VirtioBlkPciHandler {
#[instrument]
async fn create_device(
&self,
mut storage: Storage,
ctx: &mut StorageContext,
) -> Result<Arc<dyn StorageDevice>> {
// If hot-plugged, get the device node path based on the PCI path
// otherwise use the virt path provided in Storage Source
if storage.source.starts_with("/dev") {
@@ -78,19 +71,6 @@ impl VirtioBlkPciHandler {
storage.source = dev_path;
}
Ok(())
}
}
#[async_trait::async_trait]
impl StorageHandler for VirtioBlkPciHandler {
#[instrument]
async fn create_device(
&self,
mut storage: Storage,
ctx: &mut StorageContext,
) -> Result<Arc<dyn StorageDevice>> {
Self::update_device_path(&mut storage, ctx).await?;
let path = common_storage_handler(ctx.logger, &storage)?;
new_device(path)
}
@@ -99,21 +79,6 @@ impl StorageHandler for VirtioBlkPciHandler {
#[derive(Debug)]
pub struct VirtioBlkCcwHandler {}
impl VirtioBlkCcwHandler {
pub async fn update_device_path(
_storage: &mut Storage,
_ctx: &mut StorageContext<'_>,
) -> Result<()> {
#[cfg(target_arch = "s390x")]
{
let ccw_device = ccw::Device::from_str(&_storage.source)?;
let dev_path = get_virtio_blk_ccw_device_name(_ctx.sandbox, &ccw_device).await?;
_storage.source = dev_path;
}
Ok(())
}
}
#[async_trait::async_trait]
impl StorageHandler for VirtioBlkCcwHandler {
#[cfg(target_arch = "s390x")]
@@ -123,7 +88,9 @@ impl StorageHandler for VirtioBlkCcwHandler {
mut storage: Storage,
ctx: &mut StorageContext,
) -> Result<Arc<dyn StorageDevice>> {
Self::update_device_path(&mut storage, ctx).await?;
let ccw_device = ccw::Device::from_str(&storage.source)?;
let dev_path = get_virtio_blk_ccw_device_name(ctx.sandbox, &ccw_device).await?;
storage.source = dev_path;
let path = common_storage_handler(ctx.logger, &storage)?;
new_device(path)
}
@@ -142,18 +109,6 @@ impl StorageHandler for VirtioBlkCcwHandler {
#[derive(Debug)]
pub struct ScsiHandler {}
impl ScsiHandler {
pub async fn update_device_path(
storage: &mut Storage,
ctx: &mut StorageContext<'_>,
) -> Result<()> {
// Retrieve the device path from SCSI address.
let dev_path = get_scsi_device_name(ctx.sandbox, &storage.source).await?;
storage.source = dev_path;
Ok(())
}
}
#[async_trait::async_trait]
impl StorageHandler for ScsiHandler {
#[instrument]
@@ -162,7 +117,10 @@ impl StorageHandler for ScsiHandler {
mut storage: Storage,
ctx: &mut StorageContext,
) -> Result<Arc<dyn StorageDevice>> {
Self::update_device_path(&mut storage, ctx).await?;
// Retrieve the device path from SCSI address.
let dev_path = get_scsi_device_name(ctx.sandbox, &storage.source).await?;
storage.source = dev_path;
let path = common_storage_handler(ctx.logger, &storage)?;
new_device(path)
}
@@ -171,26 +129,17 @@ impl StorageHandler for ScsiHandler {
#[derive(Debug)]
pub struct PmemHandler {}
impl PmemHandler {
pub async fn update_device_path(
storage: &mut Storage,
ctx: &mut StorageContext<'_>,
) -> Result<()> {
// Retrieve the device for pmem storage
wait_for_pmem_device(ctx.sandbox, &storage.source).await?;
Ok(())
}
}
#[async_trait::async_trait]
impl StorageHandler for PmemHandler {
#[instrument]
async fn create_device(
&self,
mut storage: Storage,
storage: Storage,
ctx: &mut StorageContext,
) -> Result<Arc<dyn StorageDevice>> {
Self::update_device_path(&mut storage, ctx).await?;
// Retrieve the device for pmem storage
wait_for_pmem_device(ctx.sandbox, &storage.source).await?;
let path = common_storage_handler(ctx.logger, &storage)?;
new_device(path)
}

View File

@@ -1,165 +0,0 @@
// Copyright (c) 2023 Alibaba Cloud
//
// SPDX-License-Identifier: Apache-2.0
//
use std::path::Path;
use std::sync::Arc;
use anyhow::{anyhow, Context, Result};
use image_rs::verity::{create_dmverity_device, destroy_dmverity_device};
use kata_sys_util::mount::create_mount_destination;
use kata_types::mount::{DmVerityInfo, StorageDevice};
use kata_types::volume::{
KATA_VOLUME_DMVERITY_OPTION_SOURCE_TYPE, KATA_VOLUME_DMVERITY_OPTION_VERITY_INFO,
KATA_VOLUME_DMVERITY_SOURCE_TYPE_PMEM, KATA_VOLUME_DMVERITY_SOURCE_TYPE_SCSI,
KATA_VOLUME_DMVERITY_SOURCE_TYPE_VIRTIO_CCW, KATA_VOLUME_DMVERITY_SOURCE_TYPE_VIRTIO_MMIO,
KATA_VOLUME_DMVERITY_SOURCE_TYPE_VIRTIO_PCI,
};
use protocols::agent::Storage;
use slog::Logger;
use tracing::instrument;
use crate::storage::block_handler::{
PmemHandler, ScsiHandler, VirtioBlkCcwHandler, VirtioBlkMmioHandler, VirtioBlkPciHandler,
};
use crate::storage::{common_storage_handler, StorageContext, StorageHandler};
use super::StorageDeviceGeneric;
#[derive(Debug)]
pub struct DmVerityHandler {}
impl DmVerityHandler {
fn get_dm_verity_info(storage: &Storage) -> Result<DmVerityInfo> {
for option in storage.driver_options.iter() {
if let Some((key, value)) = option.split_once('=') {
if key == KATA_VOLUME_DMVERITY_OPTION_VERITY_INFO {
let verity_info: DmVerityInfo = serde_json::from_str(value)?;
return Ok(verity_info);
}
}
}
Err(anyhow!("missing DmVerity information for DmVerity volume"))
}
async fn update_source_device(
storage: &mut Storage,
ctx: &mut StorageContext<'_>,
) -> Result<()> {
for option in storage.driver_options.clone() {
if let Some((key, value)) = option.split_once('=') {
if key == KATA_VOLUME_DMVERITY_OPTION_SOURCE_TYPE {
match value {
KATA_VOLUME_DMVERITY_SOURCE_TYPE_VIRTIO_PCI => {
VirtioBlkPciHandler::update_device_path(storage, ctx).await?;
}
KATA_VOLUME_DMVERITY_SOURCE_TYPE_VIRTIO_MMIO => {
VirtioBlkMmioHandler::update_device_path(storage, ctx).await?;
}
KATA_VOLUME_DMVERITY_SOURCE_TYPE_VIRTIO_CCW => {
VirtioBlkCcwHandler::update_device_path(storage, ctx).await?;
}
KATA_VOLUME_DMVERITY_SOURCE_TYPE_SCSI => {
ScsiHandler::update_device_path(storage, ctx).await?;
}
KATA_VOLUME_DMVERITY_SOURCE_TYPE_PMEM => {
PmemHandler::update_device_path(storage, ctx).await?;
}
_ => {}
}
}
}
}
Ok(())
}
}
#[async_trait::async_trait]
impl StorageHandler for DmVerityHandler {
#[instrument]
async fn create_device(
&self,
mut storage: Storage,
ctx: &mut StorageContext,
) -> Result<Arc<dyn StorageDevice>> {
Self::update_source_device(&mut storage, ctx).await?;
create_mount_destination(&storage.source, &storage.mount_point, "", &storage.fstype)
.context("Could not create mountpoint")?;
let verity_info = Self::get_dm_verity_info(&storage)?;
let verity_info = serde_json::to_string(&verity_info)
.map_err(|e| anyhow!("failed to serialize dm_verity info, {}", e))?;
let verity_device_path = create_dmverity_device(&verity_info, Path::new(storage.source()))
.context("create device with dm-verity enabled")?;
storage.source = verity_device_path;
common_storage_handler(ctx.logger, &storage)?;
Ok(Arc::new(DmVerityDevice {
common: StorageDeviceGeneric::new(storage.mount_point),
verity_device_path: storage.source,
logger: ctx.logger.clone(),
}))
}
}
struct DmVerityDevice {
common: StorageDeviceGeneric,
verity_device_path: String,
logger: Logger,
}
impl StorageDevice for DmVerityDevice {
fn path(&self) -> Option<&str> {
self.common.path()
}
fn cleanup(&self) -> Result<()> {
self.common.cleanup().context("clean up dm-verity volume")?;
let device_path = &self.verity_device_path;
debug!(
self.logger,
"destroy verity device path = {:?}", device_path
);
destroy_dmverity_device(device_path)?;
Ok(())
}
}
#[cfg(test)]
mod tests {
use kata_types::{mount::DmVerityInfo, volume::KATA_VOLUME_DMVERITY_OPTION_VERITY_INFO};
use protocols::agent::Storage;
use crate::storage::dm_verity::DmVerityHandler;
#[test]
fn test_get_dm_verity_info() {
let verity_info = DmVerityInfo {
hashtype: "sha256".to_string(),
hash: "d86104eee715a1b59b62148641f4ca73edf1be3006c4d481f03f55ac05640570".to_string(),
blocknum: 2361,
blocksize: 512,
hashsize: 4096,
offset: 1212416,
};
let verity_info_str = serde_json::to_string(&verity_info);
assert!(verity_info_str.is_ok());
let storage = Storage {
driver: KATA_VOLUME_DMVERITY_OPTION_VERITY_INFO.to_string(),
driver_options: vec![format!("verity_info={}", verity_info_str.ok().unwrap())],
..Default::default()
};
match DmVerityHandler::get_dm_verity_info(&storage) {
Ok(result) => {
assert_eq!(verity_info, result);
}
Err(e) => panic!("err = {}", e),
}
}
}
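
Note on the option parsing in the removed handler: `get_dm_verity_info` and `update_source_device` both scan `driver_options` for `key=value` entries using `split_once('=')`, which splits only at the first `=` and so leaves a JSON value containing `=` intact. A minimal std-only sketch of that lookup follows; `find_option` is a hypothetical helper, and the `source_type` key used in the example below is illustrative only, since the real constant values are not shown in this diff.

```rust
/// Find the value of the first `key=value` entry whose key equals `key`.
/// Entries without an `=` are skipped.
fn find_option<'a>(options: &'a [String], key: &str) -> Option<&'a str> {
    options
        .iter()
        // Split at the first '=' only; the value may itself contain '='.
        .filter_map(|opt| opt.split_once('='))
        .find(|(k, _)| *k == key)
        .map(|(_, v)| v)
}
```

For example, given the options `source_type=scsi` and `verity_info={"hash":"a=b"}`, looking up `verity_info` yields the whole JSON text, including the inner `=`.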

View File

@@ -10,7 +10,6 @@ use std::sync::Arc;
use anyhow::{anyhow, Context, Result};
use kata_types::mount::StorageDevice;
use kata_types::volume::KATA_VOLUME_OVERLAYFS_CREATE_DIR;
use protocols::agent::Storage;
use tracing::instrument;
@@ -51,15 +50,6 @@ impl StorageHandler for OverlayfsHandler {
.options
.push(format!("workdir={}", work.to_string_lossy()));
}
let overlay_create_dir_prefix = &(KATA_VOLUME_OVERLAYFS_CREATE_DIR.to_string() + "=");
for driver_option in &storage.driver_options {
if let Some(dir) = driver_option
.as_str()
.strip_prefix(overlay_create_dir_prefix)
{
fs::create_dir_all(dir).context("Failed to create directory")?;
}
}
let path = common_storage_handler(ctx.logger, &storage)?;
new_device(path)

View File

@@ -1,102 +0,0 @@
// Copyright (c) 2023 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::{anyhow, Result};
use kata_types::mount::KATA_VIRTUAL_VOLUME_IMAGE_GUEST_PULL;
use kata_types::mount::{ImagePullVolume, StorageDevice};
use protocols::agent::Storage;
use std::sync::Arc;
use tracing::instrument;
use crate::image_rpc;
use crate::storage::{StorageContext, StorageHandler};
use super::{common_storage_handler, new_device};
#[derive(Debug)]
pub struct ImagePullHandler {}
impl ImagePullHandler {
fn get_image_info(storage: &Storage) -> Result<ImagePullVolume> {
for option in storage.driver_options.iter() {
if let Some((key, value)) = option.split_once('=') {
if key == KATA_VIRTUAL_VOLUME_IMAGE_GUEST_PULL {
let imagepull_volume: ImagePullVolume = serde_json::from_str(value)?;
return Ok(imagepull_volume);
}
}
}
Err(anyhow!("missing Image information for ImagePull volume"))
}
}
#[async_trait::async_trait]
impl StorageHandler for ImagePullHandler {
#[instrument]
async fn create_device(
&self,
mut storage: Storage,
ctx: &mut StorageContext,
) -> Result<Arc<dyn StorageDevice>> {
//Currently the image metadata is not used to pulling image in the guest.
let image_pull_volume = Self::get_image_info(&storage)?;
debug!(ctx.logger, "image_pull_volume = {:?}", image_pull_volume);
let image_name = storage.source();
debug!(ctx.logger, "image_name = {:?}", image_name);
let cid = ctx
.cid
.clone()
.ok_or_else(|| anyhow!("failed to get container id"))?;
let image_service = image_rpc::ImageService::singleton().await?;
let bundle_path = image_service
.pull_image_for_container(image_name, &cid, &image_pull_volume.metadata)
.await?;
storage.source = bundle_path;
storage.options = vec!["bind".to_string(), "ro".to_string()];
common_storage_handler(ctx.logger, &storage)?;
new_device(storage.mount_point)
}
}
#[cfg(test)]
mod tests {
use std::collections::HashMap;
use kata_types::mount::{ImagePullVolume, KATA_VIRTUAL_VOLUME_IMAGE_GUEST_PULL};
use protocols::agent::Storage;
use crate::storage::image_pull_handler::ImagePullHandler;
#[test]
fn test_get_image_info() {
let mut res = HashMap::new();
res.insert("key1".to_string(), "value1".to_string());
res.insert("key2".to_string(), "value2".to_string());
let image_pull = ImagePullVolume {
metadata: res.clone(),
};
let image_pull_str = serde_json::to_string(&image_pull);
assert!(image_pull_str.is_ok());
let storage = Storage {
driver: KATA_VIRTUAL_VOLUME_IMAGE_GUEST_PULL.to_string(),
driver_options: vec![format!("image_guest_pull={}", image_pull_str.ok().unwrap())],
..Default::default()
};
match ImagePullHandler::get_image_info(&storage) {
Ok(image_info) => {
assert_eq!(image_info.metadata, res);
}
Err(e) => panic!("err = {}", e),
}
}
}

View File

@@ -12,11 +12,7 @@ use std::sync::Arc;
use anyhow::{anyhow, Context, Result};
use kata_sys_util::mount::{create_mount_destination, parse_mount_options};
use kata_types::mount::{
StorageDevice, StorageHandlerManager, KATA_SHAREDFS_GUEST_PREMOUNT_TAG,
KATA_VIRTUAL_VOLUME_IMAGE_GUEST_PULL,
};
use kata_types::volume::KATA_VOLUME_TYPE_DMVERITY;
use kata_types::mount::{StorageDevice, StorageHandlerManager, KATA_SHAREDFS_GUEST_PREMOUNT_TAG};
use nix::unistd::{Gid, Uid};
use protocols::agent::Storage;
use protocols::types::FSGroupChangePolicy;
@@ -26,12 +22,9 @@ use tracing::instrument;
use self::bind_watcher_handler::BindWatcherHandler;
use self::block_handler::{PmemHandler, ScsiHandler, VirtioBlkMmioHandler, VirtioBlkPciHandler};
use self::dm_verity::DmVerityHandler;
use self::ephemeral_handler::EphemeralHandler;
use self::fs_handler::{OverlayfsHandler, Virtio9pHandler, VirtioFsHandler};
use self::image_pull_handler::ImagePullHandler;
use self::local_handler::LocalHandler;
use crate::device::{
DRIVER_9P_TYPE, DRIVER_BLK_MMIO_TYPE, DRIVER_BLK_PCI_TYPE, DRIVER_EPHEMERAL_TYPE,
DRIVER_LOCAL_TYPE, DRIVER_NVDIMM_TYPE, DRIVER_OVERLAYFS_TYPE, DRIVER_SCSI_TYPE,
@@ -44,10 +37,8 @@ pub use self::ephemeral_handler::update_ephemeral_mounts;
mod bind_watcher_handler;
mod block_handler;
mod dm_verity;
mod ephemeral_handler;
mod fs_handler;
mod image_pull_handler;
mod local_handler;
const RW_MASK: u32 = 0o660;
@@ -154,8 +145,6 @@ lazy_static! {
manager.add_handler(DRIVER_SCSI_TYPE, Arc::new(ScsiHandler{})).unwrap();
manager.add_handler(DRIVER_VIRTIOFS_TYPE, Arc::new(VirtioFsHandler{})).unwrap();
manager.add_handler(DRIVER_WATCHABLE_BIND_TYPE, Arc::new(BindWatcherHandler{})).unwrap();
manager.add_handler(KATA_VOLUME_TYPE_DMVERITY, Arc::new(DmVerityHandler{})).unwrap();
manager.add_handler(KATA_VIRTUAL_VOLUME_IMAGE_GUEST_PULL, Arc::new(ImagePullHandler{})).unwrap();
manager
};
}

View File

@@ -1,3 +1,2 @@
target
Cargo.lock
.idea

2390
src/dragonball/Cargo.lock generated Normal file

File diff suppressed because it is too large

View File

@@ -58,6 +58,6 @@ virtio-vsock = ["dbs-virtio-devices/virtio-vsock", "virtio-queue"]
virtio-blk = ["dbs-virtio-devices/virtio-blk", "virtio-queue"]
virtio-net = ["dbs-virtio-devices/virtio-net", "virtio-queue"]
# virtio-fs only work on atomic-guest-memory
virtio-fs = ["dbs-virtio-devices/virtio-fs", "virtio-queue", "atomic-guest-memory"]
virtio-fs = ["dbs-virtio-devices/virtio-fs-pro", "virtio-queue", "atomic-guest-memory"]
virtio-mem = ["dbs-virtio-devices/virtio-mem", "virtio-queue", "atomic-guest-memory"]
virtio-balloon = ["dbs-virtio-devices/virtio-balloon", "virtio-queue"]

View File

@@ -5,15 +5,6 @@
use serde_derive::{Deserialize, Serialize};
/// This struct represents the strongly typed equivalent of the json body
/// from confidential container related requests.
#[derive(Copy, Clone, Debug, Deserialize, PartialEq, Serialize)]
#[serde(deny_unknown_fields)]
pub enum ConfidentialVmType {
/// Intel Trusted Domain
TDX = 2,
}
/// The microvm state.
///
/// When Dragonball starts, the instance state is Uninitialized. Once start_microvm method is
@@ -67,12 +58,10 @@ pub struct InstanceInfo {
pub tids: Vec<(u8, u32)>,
/// Last instance downtime
pub last_instance_downtime: u64,
/// confidential vm type
pub confidential_vm_type: Option<ConfidentialVmType>,
}
impl InstanceInfo {
/// create instance info object with given id, version, platform type and confidential vm type.
/// create instance info object with given id, version, and platform type
pub fn new(id: String, vmm_version: String) -> Self {
InstanceInfo {
id,
@@ -83,14 +72,8 @@ impl InstanceInfo {
async_state: AsyncState::Uninitialized,
tids: Vec::new(),
last_instance_downtime: 0,
confidential_vm_type: None,
}
}
/// return true if VM confidential type is TDX
pub fn is_tdx_enabled(&self) -> bool {
matches!(self.confidential_vm_type, Some(ConfidentialVmType::TDX))
}
}
impl Default for InstanceInfo {
@@ -104,7 +87,6 @@ impl Default for InstanceInfo {
async_state: AsyncState::Uninitialized,
tids: Vec::new(),
last_instance_downtime: 0,
confidential_vm_type: None,
}
}
}

View File

@@ -12,7 +12,7 @@ pub use self::boot_source::{BootSourceConfig, BootSourceConfigError, DEFAULT_KER
/// Wrapper over the microVM general information.
mod instance_info;
pub use self::instance_info::{ConfidentialVmType, InstanceInfo, InstanceState};
pub use self::instance_info::{InstanceInfo, InstanceState};
/// Wrapper for configuring the memory and CPU of the microVM.
mod machine_config;

View File

@@ -692,6 +692,7 @@ impl VmmService {
));
}
#[cfg(feature = "dbs-upcall")]
vm.resize_vcpu(config, None).map_err(|e| {
if let VcpuResizeError::UpcallServerNotReady = e {
return VmmActionError::UpcallServerNotReady;

View File

@@ -475,7 +475,7 @@ impl<AS: GuestAddressSpace> VirtioFs<AS> {
let (mut rafs, rafs_cfg) = match config.as_ref() {
Some(cfg) => {
let rafs_conf: Arc<ConfigV2> = Arc::new(
serde_json::from_str(cfg).map_err(|e| FsError::BackendFs(e.to_string()))?,
ConfigV2::from_str(cfg).map_err(|e| FsError::BackendFs(e.to_string()))?,
);
(

View File

@@ -0,0 +1,94 @@
// Copyright 2023 Ant Group. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0
use std::any::Any;
use std::io::{Error, Read, Write};
use std::os::unix::io::{AsRawFd, RawFd};
use std::time::Duration;
use log::error;
use nix::errno::Errno;
use super::{VsockBackendType, VsockStream};
pub struct HybridStream {
pub hybrid_stream: std::fs::File,
pub slave_stream: Option<Box<dyn VsockStream>>,
}
impl AsRawFd for HybridStream {
fn as_raw_fd(&self) -> RawFd {
self.hybrid_stream.as_raw_fd()
}
}
impl Read for HybridStream {
fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
self.hybrid_stream.read(buf)
}
}
impl Write for HybridStream {
fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
// The slave stream was only used to reply the connect result "ok <port>",
// thus it was only used once here, and the data would be replied by the
// main stream.
if let Some(mut stream) = self.slave_stream.take() {
stream.write(buf)
} else {
self.hybrid_stream.write(buf)
}
}
fn flush(&mut self) -> std::io::Result<()> {
self.hybrid_stream.flush()
}
}
impl VsockStream for HybridStream {
fn backend_type(&self) -> VsockBackendType {
VsockBackendType::HybridStream
}
fn set_nonblocking(&mut self, nonblocking: bool) -> std::io::Result<()> {
let fd = self.hybrid_stream.as_raw_fd();
let mut flag = unsafe { libc::fcntl(fd, libc::F_GETFL) };
if nonblocking {
flag = flag | libc::O_NONBLOCK;
} else {
flag = flag & !libc::O_NONBLOCK;
}
let ret = unsafe { libc::fcntl(fd, libc::F_SETFL, flag) };
if ret < 0 {
error!("failed to set fcntl for fd {} with ret {}", fd, ret);
return Err(Error::last_os_error());
}
Ok(())
}
fn set_read_timeout(&mut self, _dur: Option<Duration>) -> std::io::Result<()> {
error!("unsupported!");
Err(Errno::ENOPROTOOPT.into())
}
fn set_write_timeout(&mut self, _dur: Option<Duration>) -> std::io::Result<()> {
error!("unsupported!");
Err(Errno::ENOPROTOOPT.into())
}
fn as_any(&self) -> &dyn Any {
self
}
fn recv_data_fd(
&self,
_bytes: &mut [u8],
_fds: &mut [RawFd],
) -> std::io::Result<(usize, usize)> {
Err(Errno::ENOPROTOOPT.into())
}
}
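
Note on the `Write` impl above: the first write goes to `slave_stream` and every later write goes to the main stream, because `Option::take` leaves `None` behind. A minimal std-only sketch of the same pattern follows; `OneShotReply` and its generic parameters are hypothetical stand-ins for `HybridStream` and its `File` / `Box<dyn VsockStream>` fields.

```rust
use std::io::{self, Write};

/// Writer that sends its first write to `side` and every later write to `main`.
struct OneShotReply<M: Write, S: Write> {
    main: M,
    side: Option<S>,
}

impl<M: Write, S: Write> Write for OneShotReply<M, S> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // `take()` moves the side writer out and leaves `None`, so it is
        // used at most once and dropped when this call returns.
        if let Some(mut side) = self.side.take() {
            side.write(buf)
        } else {
            self.main.write(buf)
        }
    }

    fn flush(&mut self) -> io::Result<()> {
        // Only the main writer is flushed, mirroring the impl above.
        self.main.flush()
    }
}
```

One consequence of this design is that the side writer is dropped after its single use even if that write fails or is short, so a failed first reply is not retried on the side channel.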

View File

@@ -9,13 +9,14 @@ use std::io::{Read, Write};
use std::os::unix::io::{AsRawFd, RawFd};
use std::time::Duration;
mod hybrid_stream;
mod inner;
mod tcp;
mod unix_stream;
pub use self::hybrid_stream::HybridStream;
pub use self::inner::{VsockInnerBackend, VsockInnerConnector, VsockInnerStream};
pub use self::tcp::VsockTcpBackend;
pub use self::unix_stream::HybridUnixStreamBackend;
pub use self::unix_stream::VsockUnixStreamBackend;
/// The type of vsock backend.
@@ -27,6 +28,8 @@ pub enum VsockBackendType {
Tcp,
/// Inner backend
Inner,
/// Fd passed hybrid stream backend
HybridStream,
/// For test purpose
#[cfg(test)]
Test,

View File

@@ -2,7 +2,6 @@
// SPDX-License-Identifier: Apache-2.0
use std::any::Any;
use std::io::{Read, Write};
use std::os::unix::io::{AsRawFd, RawFd};
use std::os::unix::net::{UnixListener, UnixStream};
use std::time::Duration;
@@ -13,66 +12,6 @@ use sendfd::RecvWithFd;
use super::super::{Result, VsockError};
use super::{VsockBackend, VsockBackendType, VsockStream};
pub struct HybridUnixStreamBackend {
pub unix_stream: Box<dyn VsockStream>,
pub slave_stream: Option<Box<dyn VsockStream>>,
}
impl VsockStream for HybridUnixStreamBackend {
fn backend_type(&self) -> VsockBackendType {
self.unix_stream.backend_type()
}
fn set_nonblocking(&mut self, nonblocking: bool) -> std::io::Result<()> {
self.unix_stream.set_nonblocking(nonblocking)
}
fn set_read_timeout(&mut self, dur: Option<Duration>) -> std::io::Result<()> {
self.unix_stream.set_read_timeout(dur)
}
fn set_write_timeout(&mut self, dur: Option<Duration>) -> std::io::Result<()> {
self.unix_stream.set_write_timeout(dur)
}
fn as_any(&self) -> &dyn Any {
self.unix_stream.as_any()
}
fn recv_data_fd(&self, bytes: &mut [u8], fds: &mut [RawFd]) -> std::io::Result<(usize, usize)> {
self.unix_stream.recv_data_fd(bytes, fds)
}
}
impl AsRawFd for HybridUnixStreamBackend {
fn as_raw_fd(&self) -> RawFd {
self.unix_stream.as_raw_fd()
}
}
impl Read for HybridUnixStreamBackend {
fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
self.unix_stream.read(buf)
}
}
impl Write for HybridUnixStreamBackend {
fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
// The slave stream was only used to reply the connect result "ok <port>",
// thus it was only used once here, and the data would be replied by the
// main stream.
if let Some(mut stream) = self.slave_stream.take() {
stream.write(buf)
} else {
self.unix_stream.write(buf)
}
}
fn flush(&mut self) -> std::io::Result<()> {
self.unix_stream.flush()
}
}
impl VsockStream for UnixStream {
fn backend_type(&self) -> VsockBackendType {
VsockBackendType::UnixStream

View File

@@ -60,6 +60,10 @@ pub enum Error {
#[error("error connecting to a backend: {0}")]
BackendConnect(#[source] std::io::Error),
/// Error set nonblock to a backend stream.
#[error("error set nonblocking to a backend: {0}")]
BackendSetNonBlock(#[source] std::io::Error),
/// Error reading from backend.
#[error("error reading from backend: {0}")]
BackendRead(#[source] std::io::Error),

View File

@@ -36,14 +36,14 @@
/// route all these events to their handlers, the muxer uses another
/// `HashMap` object, mapping `RawFd`s to `EpollListener`s.
use std::collections::{HashMap, HashSet};
use std::fs::File;
use std::io::Read;
use std::os::fd::FromRawFd;
use std::os::unix::io::{AsRawFd, RawFd};
use std::os::unix::net::UnixStream;
use log::{debug, error, info, trace, warn};
use super::super::backend::{HybridUnixStreamBackend, VsockBackend, VsockBackendType, VsockStream};
use super::super::backend::{HybridStream, VsockBackend, VsockBackendType, VsockStream};
use super::super::csm::{ConnState, VsockConnection};
use super::super::defs::uapi;
@@ -480,13 +480,17 @@ impl VsockMuxer {
.and_then(|(nfd, local_port, peer_port)| {
// Here we should make sure the nfd the sole owner to convert it
// into an UnixStream object, otherwise, it could cause memory unsafety.
let nstream = unsafe { UnixStream::from_raw_fd(nfd) };
let nstream = unsafe { File::from_raw_fd(nfd) };
let hybridstream = HybridUnixStreamBackend {
unix_stream: Box::new(nstream),
let mut hybridstream = HybridStream {
hybrid_stream: nstream,
slave_stream: Some(stream),
};
hybridstream
.set_nonblocking(true)
.map_err(Error::BackendSetNonBlock)?;
self.add_connection(
ConnMapKey {
local_port,

View File

@@ -75,10 +75,6 @@ pub enum Error {
/// Cannot open the VM file descriptor.
#[error(transparent)]
Vm(vm::VmError),
/// confidential vm type Error
#[error("confidential-vm-type can only be used in x86_64 now")]
ConfidentialVmType,
}
/// Errors associated with starting the instance.

View File

@@ -5,11 +5,14 @@
extern crate procfs;
use crate::metric::{IncMetric, METRICS};
use anyhow::{anyhow, Result};
use prometheus::{Encoder, IntCounter, IntGaugeVec, Opts, Registry, TextEncoder};
use std::sync::Mutex;
use anyhow::{anyhow, Result};
use dbs_utils::metric::IncMetric;
use prometheus::{Encoder, IntCounter, IntGaugeVec, Opts, Registry, TextEncoder};
use crate::metric::METRICS;
const NAMESPACE_KATA_HYPERVISOR: &str = "kata_hypervisor";
lazy_static! {
@@ -23,7 +26,7 @@ lazy_static! {
IntCounter::new(format!("{}_{}",NAMESPACE_KATA_HYPERVISOR,"scrape_count"), "Hypervisor metrics scrape count.").unwrap();
static ref HYPERVISOR_VCPU: IntGaugeVec =
IntGaugeVec::new(Opts::new(format!("{}_{}",NAMESPACE_KATA_HYPERVISOR,"vcpu"), "Hypervisor metrics specific to VCPUs' mode of functioning."), &["item"]).unwrap();
IntGaugeVec::new(Opts::new(format!("{}_{}",NAMESPACE_KATA_HYPERVISOR,"vcpu"), "Hypervisor metrics specific to VCPUs' mode of functioning."), &["cpu_id", "item"]).unwrap();
static ref HYPERVISOR_SECCOMP: IntGaugeVec =
IntGaugeVec::new(Opts::new(format!("{}_{}",NAMESPACE_KATA_HYPERVISOR,"seccomp"), "Hypervisor metrics for the seccomp filtering."), &["item"]).unwrap();
@@ -75,28 +78,33 @@ fn update_hypervisor_metrics() -> Result<()> {
}
fn set_intgauge_vec_vcpu(icv: &prometheus::IntGaugeVec) {
icv.with_label_values(&["exit_io_in"])
.set(METRICS.vcpu.exit_io_in.count() as i64);
icv.with_label_values(&["exit_io_out"])
.set(METRICS.vcpu.exit_io_out.count() as i64);
icv.with_label_values(&["exit_mmio_read"])
.set(METRICS.vcpu.exit_mmio_read.count() as i64);
icv.with_label_values(&["exit_mmio_write"])
.set(METRICS.vcpu.exit_mmio_write.count() as i64);
icv.with_label_values(&["failures"])
.set(METRICS.vcpu.failures.count() as i64);
icv.with_label_values(&["filter_cpuid"])
.set(METRICS.vcpu.filter_cpuid.count() as i64);
let metric_guard = METRICS.read().unwrap();
for (cpu_id, metrics) in metric_guard.vcpu.iter() {
icv.with_label_values(&[cpu_id.to_string().as_str(), "exit_io_in"])
.set(metrics.exit_io_in.count() as i64);
icv.with_label_values(&[cpu_id.to_string().as_str(), "exit_io_out"])
.set(metrics.exit_io_out.count() as i64);
icv.with_label_values(&[cpu_id.to_string().as_str(), "exit_mmio_read"])
.set(metrics.exit_mmio_read.count() as i64);
icv.with_label_values(&[cpu_id.to_string().as_str(), "exit_mmio_write"])
.set(metrics.exit_mmio_write.count() as i64);
icv.with_label_values(&[cpu_id.to_string().as_str(), "failures"])
.set(metrics.failures.count() as i64);
icv.with_label_values(&[cpu_id.to_string().as_str(), "filter_cpuid"])
.set(metrics.filter_cpuid.count() as i64);
}
}
fn set_intgauge_vec_seccomp(icv: &prometheus::IntGaugeVec) {
let metric_guard = METRICS.read().unwrap();
icv.with_label_values(&["num_faults"])
.set(METRICS.seccomp.num_faults.count() as i64);
.set(metric_guard.seccomp.num_faults.count() as i64);
}
fn set_intgauge_vec_signals(icv: &prometheus::IntGaugeVec) {
let metric_guard = METRICS.read().unwrap();
icv.with_label_values(&["sigbus"])
.set(METRICS.signals.sigbus.count() as i64);
.set(metric_guard.signals.sigbus.count() as i64);
icv.with_label_values(&["sigsegv"])
.set(METRICS.signals.sigsegv.count() as i64);
.set(metric_guard.signals.sigsegv.count() as i64);
}

View File

@@ -10,7 +10,7 @@ use kvm_bindings::KVM_API_VERSION;
use kvm_ioctls::{Cap, Kvm, VmFd};
use std::os::unix::io::{FromRawFd, RawFd};
use crate::error::{Error as VmError, Result};
use crate::error::{Error, Result};
/// Describes a KVM context that gets attached to the micro VM instance.
/// It gives access to the functionality of the KVM wrapper as long as every required
@@ -29,11 +29,11 @@ impl KvmContext {
// Safe because we expect kvm_fd to contain a valid fd number when is_some() == true.
unsafe { Kvm::from_raw_fd(fd) }
} else {
Kvm::new().map_err(VmError::Kvm)?
Kvm::new().map_err(Error::Kvm)?
};
if kvm.get_api_version() != KVM_API_VERSION as i32 {
return Err(VmError::KvmApiVersion(kvm.get_api_version()));
return Err(Error::KvmApiVersion(kvm.get_api_version()));
}
Self::check_cap(&kvm, Cap::Irqchip)?;
@@ -44,8 +44,7 @@ impl KvmContext {
Self::check_cap(&kvm, Cap::SetTssAddr)?;
#[cfg(target_arch = "x86_64")]
let supported_msrs =
dbs_arch::msr::supported_guest_msrs(&kvm).map_err(VmError::GuestMSRs)?;
let supported_msrs = dbs_arch::msr::supported_guest_msrs(&kvm).map_err(Error::GuestMSRs)?;
let max_memslots = kvm.get_nr_memslots();
Ok(KvmContext {
@@ -68,7 +67,7 @@ impl KvmContext {
/// Create a virtual machine object.
pub fn create_vm(&self) -> Result<VmFd> {
self.kvm.create_vm().map_err(VmError::Kvm)
self.kvm.create_vm().map_err(Error::Kvm)
}
/// Get the max vcpu count supported by kvm
@@ -76,9 +75,9 @@ impl KvmContext {
self.kvm.get_max_vcpus()
}
fn check_cap(kvm: &Kvm, cap: Cap) -> std::result::Result<(), VmError> {
fn check_cap(kvm: &Kvm, cap: Cap) -> std::result::Result<(), Error> {
if !kvm.check_extension(cap) {
return Err(VmError::KvmCap(cap));
return Err(Error::KvmCap(cap));
}
Ok(())
}
@@ -92,18 +91,6 @@ mod x86_64 {
use std::collections::HashSet;
impl KvmContext {
/// Create a virtual machine object with specific type.
/// vm_type: u64
/// 0: legacy vm
/// 2: tdx vm
pub fn create_vm_with_type(&self, vm_type: u64) -> Result<VmFd> {
let fd = self
.kvm
.create_vm_with_type(vm_type)
.map_err(VmError::Kvm)?;
Ok(fd)
}
/// Get information about supported CPUID of x86 processor.
pub fn supported_cpuid(
&self,
@@ -123,7 +110,7 @@ mod x86_64 {
// It's very sensible to manipulate MSRs, so please be careful to change code below.
fn build_msrs_list(kvm: &Kvm) -> Result<Msrs> {
let mut mset: HashSet<u32> = HashSet::new();
let supported_msr_list = kvm.get_msr_index_list().map_err(VmError::Kvm)?;
let supported_msr_list = kvm.get_msr_index_list().map_err(super::Error::Kvm)?;
for msr in supported_msr_list.as_slice() {
mset.insert(*msr);
}
@@ -216,7 +203,7 @@ mod x86_64 {
})
.collect();
Msrs::from_entries(&msrs).map_err(VmError::Msr)
Msrs::from_entries(&msrs).map_err(super::Error::Msr)
}
}
}
@@ -270,20 +257,4 @@ mod tests {
let _ = c.create_vm().unwrap();
}
#[test]
fn test_create_vm_with_type() {
let c = KvmContext::new(None).unwrap();
#[cfg(not(target_arch = "aarch64"))]
let _ = c.create_vm_with_type(0_u64).unwrap();
#[cfg(target_arch = "aarch64")]
{
/// aarch64 is using ipa_size to create vm
let mut ipa_size = 0; // Create using default VM type
if c.check_extension(kvm_ioctls::Cap::ArmVmIPASize) {
ipa_size = c.kvm.get_host_ipa_limit();
}
let _ = c.create_vm_with_type(ipa_size as u64).unwrap();
}
}
}

View File

@@ -2,15 +2,20 @@
// Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use dbs_utils::metric::SharedIncMetric;
use lazy_static::lazy_static;
use serde::Serialize;
pub use dbs_utils::metric::IncMetric;
lazy_static! {
/// Static instance used for handling metrics.
pub static ref METRICS: DragonballMetrics = DragonballMetrics::default();
/// # Static instance used for handling metrics.
///
/// Using a big lock over the DragonballMetrics since we have various device metric types
/// and the write operation is only used when creating or removing devices, it has a low
/// competitive overhead.
pub static ref METRICS: RwLock<DragonballMetrics> = RwLock::new(DragonballMetrics::default());
}
/// Metrics specific to VCPUs' mode of functioning.
@@ -50,9 +55,121 @@ pub struct SignalMetrics {
#[derive(Default, Serialize)]
pub struct DragonballMetrics {
/// Metrics related to a vcpu's functioning.
pub vcpu: VcpuMetrics,
pub vcpu: HashMap<u32, Arc<VcpuMetrics>>,
/// Metrics related to seccomp filtering.
pub seccomp: SeccompMetrics,
/// Metrics related to signals.
pub signals: SignalMetrics,
}
#[cfg(test)]
mod tests {
use std::sync::Arc;
use std::thread;
use dbs_utils::metric::IncMetric;
use crate::metric::{VcpuMetrics, METRICS};
#[test]
fn test_read_map() {
let metrics = Arc::new(VcpuMetrics::default());
let vcpu_id: u32 = u32::MIN;
METRICS
.write()
.unwrap()
.vcpu
.insert(vcpu_id, metrics.clone());
metrics.failures.inc();
assert_eq!(
METRICS
.read()
.unwrap()
.vcpu
.get(&vcpu_id)
.unwrap()
.failures
.count(),
1
);
}
#[test]
fn test_metrics_count() {
let metrics = Arc::new(VcpuMetrics::default());
let vcpu_id: u32 = 65535;
METRICS
.write()
.unwrap()
.vcpu
.insert(vcpu_id, metrics.clone());
let metrics1 = metrics.clone();
let thread1 = thread::spawn(move || {
for _i in 0..10 {
metrics1.exit_io_in.inc();
}
});
let metrics2 = metrics.clone();
let thread2 = thread::spawn(move || {
for _i in 0..10 {
metrics2.exit_io_in.inc();
}
});
thread1.join().unwrap();
thread2.join().unwrap();
assert_eq!(
METRICS
.read()
.unwrap()
.vcpu
.get(&vcpu_id)
.unwrap()
.exit_io_in
.count(),
20
);
}
#[test]
fn test_rw_lock() {
let metrics = Arc::new(VcpuMetrics::default());
let vcpu_id: u32 = u32::MAX;
METRICS
.write()
.unwrap()
.vcpu
.insert(vcpu_id, metrics.clone());
let write_thread = thread::spawn(move || {
for _ in 0..10 {
let metrics = Arc::new(VcpuMetrics::default());
let vcpu_id: u32 = 128;
METRICS
.write()
.unwrap()
.vcpu
.insert(vcpu_id, metrics.clone());
}
});
let read_thread = thread::spawn(move || {
for _ in 0..10 {
assert_eq!(
METRICS
.read()
.unwrap()
.vcpu
.get(&vcpu_id)
.unwrap()
.failures
.count(),
0
);
}
});
write_thread.join().unwrap();
read_thread.join().unwrap();
}
}
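
Note on the hunk above: the change replaces a single shared `VcpuMetrics` with a map of per-vCPU `Arc<VcpuMetrics>` behind one `RwLock`. The following is a minimal, std-only sketch of that pattern, not the Dragonball code itself. `DemoVcpuMetrics`, `DemoMetrics`, `registry` and `count_exits` are hypothetical names, a plain `AtomicUsize` stands in for `SharedIncMetric`, and `OnceLock` stands in for `lazy_static!`.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock, RwLock};
use std::thread;

/// Hypothetical stand-in for `VcpuMetrics`: a single lock-free counter.
#[derive(Default)]
pub struct DemoVcpuMetrics {
    pub exit_io_in: AtomicUsize,
}

/// Hypothetical stand-in for `DragonballMetrics`.
#[derive(Default)]
pub struct DemoMetrics {
    pub vcpu: HashMap<u32, Arc<DemoVcpuMetrics>>,
}

/// Lazily initialized global registry (assumption: `OnceLock` instead of `lazy_static!`).
pub fn registry() -> &'static RwLock<DemoMetrics> {
    static REGISTRY: OnceLock<RwLock<DemoMetrics>> = OnceLock::new();
    REGISTRY.get_or_init(|| RwLock::new(DemoMetrics::default()))
}

/// Registers a vCPU, increments its counter from two threads without taking
/// the global lock, then reads the total back through the registry.
pub fn count_exits(vcpu_id: u32, per_thread: usize) -> usize {
    let metrics = Arc::new(DemoVcpuMetrics::default());
    // The write lock is only needed while the map itself changes.
    registry()
        .write()
        .unwrap()
        .vcpu
        .insert(vcpu_id, Arc::clone(&metrics));

    let workers: Vec<_> = (0..2)
        .map(|_| {
            let m = Arc::clone(&metrics);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Hot path: atomic increment through the Arc, no lock involved.
                    m.exit_io_in.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for w in workers {
        w.join().unwrap();
    }

    // A read lock is enough to look up an entry and read its counter.
    let total = registry().read().unwrap().vcpu[&vcpu_id]
        .exit_io_in
        .load(Ordering::Relaxed);
    // Unregister again so the sketch leaves the registry as it found it.
    registry().write().unwrap().vcpu.remove(&vcpu_id);
    total
}
```

Calling `count_exits(7, 10)` runs two threads of ten increments each against the same entry. The point of the design is that the lock protects only the map, while the counters are updated through the shared `Arc`.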

@@ -1,11 +1,12 @@
// Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0
use dbs_utils::metric::IncMetric;
use libc::{_exit, c_int, c_void, siginfo_t, SIGBUS, SIGSEGV, SIGSYS};
use log::error;
use vmm_sys_util::signal::register_signal_handler;
use crate::metric::{IncMetric, METRICS};
use crate::metric::METRICS;
// The offset of `si_syscall` (offending syscall identifier) within the siginfo structure
// expressed as an `(u)int*`.
@@ -51,7 +52,7 @@ extern "C" fn sigsys_handler(num: c_int, info: *mut siginfo_t, _unused: *mut c_v
let syscall = unsafe { *(info as *const i32).offset(SI_OFF_SYSCALL) as usize };
// SIGSYS is triggered when bad syscalls are detected. num_faults is only added when SIGSYS is detected
// so it actually only collects the count for bad syscalls.
METRICS.seccomp.num_faults.inc();
METRICS.read().unwrap().seccomp.num_faults.inc();
error!(
"Shutting down VM after intercepting a bad syscall ({}).",
syscall
@@ -82,8 +83,8 @@ extern "C" fn sigbus_sigsegv_handler(num: c_int, info: *mut siginfo_t, _unused:
// Other signals which might do async unsafe things incompatible with the rest of this
// function are blocked due to the sa_mask used when registering the signal handler.
match si_signo {
SIGBUS => METRICS.signals.sigbus.inc(),
SIGSEGV => METRICS.signals.sigsegv.inc(),
SIGBUS => METRICS.read().unwrap().signals.sigbus.inc(),
SIGSEGV => METRICS.read().unwrap().signals.sigsegv.inc(),
_ => (),
}
@@ -182,19 +183,19 @@ mod tests {
.unwrap();
assert!(apply_filter(&TryInto::<BpfProgram>::try_into(filter).unwrap()).is_ok());
assert_eq!(METRICS.seccomp.num_faults.count(), 0);
assert_eq!(METRICS.read().unwrap().seccomp.num_faults.count(), 0);
// Call the blacklisted `SYS_mkdirat`.
unsafe { syscall(libc::SYS_mkdirat, "/foo/bar\0") };
// Call SIGBUS signal handler.
assert_eq!(METRICS.signals.sigbus.count(), 0);
assert_eq!(METRICS.read().unwrap().signals.sigbus.count(), 0);
unsafe {
syscall(libc::SYS_kill, process::id(), SIGBUS);
}
// Call SIGSEGV signal handler.
assert_eq!(METRICS.signals.sigsegv.count(), 0);
assert_eq!(METRICS.read().unwrap().signals.sigsegv.count(), 0);
unsafe {
syscall(libc::SYS_kill, process::id(), SIGSEGV);
}
@@ -211,9 +212,9 @@ mod tests {
// tests, so we use this as a heuristic to decide if we check the assertion.
if cpu_count() > 1 {
// The signal handler should let the program continue during unit tests.
assert!(METRICS.seccomp.num_faults.count() >= 1);
assert!(METRICS.read().unwrap().seccomp.num_faults.count() >= 1);
}
assert!(METRICS.signals.sigbus.count() >= 1);
assert!(METRICS.signals.sigsegv.count() >= 1);
assert!(METRICS.read().unwrap().signals.sigbus.count() >= 1);
assert!(METRICS.read().unwrap().signals.sigsegv.count() >= 1);
}
}

@@ -10,7 +10,6 @@ use std::ops::Deref;
use std::sync::mpsc::{channel, Sender};
use std::sync::Arc;
use crate::IoManagerCached;
use dbs_arch::{regs, VpmuFeatureLevel};
use dbs_boot::get_fdt_addr;
use dbs_utils::time::TimestampUs;
@@ -19,8 +18,10 @@ use vm_memory::{Address, GuestAddress, GuestAddressSpace};
use vmm_sys_util::eventfd::EventFd;
use crate::address_space_manager::GuestAddressSpaceImpl;
use crate::metric::VcpuMetrics;
use crate::vcpu::vcpu_impl::{Result, Vcpu, VcpuError, VcpuStateEvent};
use crate::vcpu::VcpuConfig;
use crate::IoManagerCached;
#[allow(unused)]
impl Vcpu {
@@ -67,6 +68,7 @@ impl Vcpu {
support_immediate_exit,
mpidr: 0,
exit_evt,
metrics: Arc::new(VcpuMetrics::default()),
})
}

@@ -15,6 +15,7 @@ use std::sync::mpsc::{Receiver, Sender, TryRecvError};
use std::sync::{Arc, Barrier};
use std::thread;
use dbs_utils::metric::IncMetric;
use dbs_utils::time::TimestampUs;
use kvm_bindings::{KVM_SYSTEM_EVENT_RESET, KVM_SYSTEM_EVENT_SHUTDOWN};
use kvm_ioctls::{VcpuExit, VcpuFd};
@@ -25,7 +26,7 @@ use vmm_sys_util::eventfd::EventFd;
use vmm_sys_util::signal::{register_signal_handler, Killable};
use super::sm::StateMachine;
use crate::metric::{IncMetric, METRICS};
use crate::metric::{VcpuMetrics, METRICS};
use crate::signal_handler::sigrtmin;
use crate::IoManagerCached;
@@ -303,6 +304,9 @@ pub struct Vcpu {
// Whether kvm used supports immediate_exit flag.
support_immediate_exit: bool,
// metrics for a vCPU.
metrics: Arc<VcpuMetrics>,
// CPUID information for the x86_64 CPU
#[cfg(target_arch = "x86_64")]
cpuid: kvm_bindings::CpuId,
@@ -446,7 +450,7 @@ impl Vcpu {
#[cfg(target_arch = "x86_64")]
VcpuExit::IoIn(addr, data) => {
let _ = self.io_mgr.pio_read(addr, data);
METRICS.vcpu.exit_io_in.inc();
self.metrics.exit_io_in.inc();
Ok(VcpuEmulation::Handled)
}
#[cfg(target_arch = "x86_64")]
@@ -454,17 +458,17 @@ impl Vcpu {
if !self.check_io_port_info(addr, data)? {
let _ = self.io_mgr.pio_write(addr, data);
}
METRICS.vcpu.exit_io_out.inc();
self.metrics.exit_io_out.inc();
Ok(VcpuEmulation::Handled)
}
VcpuExit::MmioRead(addr, data) => {
let _ = self.io_mgr.mmio_read(addr, data);
METRICS.vcpu.exit_mmio_read.inc();
self.metrics.exit_mmio_read.inc();
Ok(VcpuEmulation::Handled)
}
VcpuExit::MmioWrite(addr, data) => {
let _ = self.io_mgr.mmio_write(addr, data);
METRICS.vcpu.exit_mmio_write.inc();
self.metrics.exit_mmio_write.inc();
Ok(VcpuEmulation::Handled)
}
VcpuExit::Hlt => {
@@ -477,12 +481,12 @@ impl Vcpu {
}
// Documentation specifies that below kvm exits are considered errors.
VcpuExit::FailEntry(reason, cpu) => {
METRICS.vcpu.failures.inc();
self.metrics.failures.inc();
error!("Received KVM_EXIT_FAIL_ENTRY signal, reason {reason}, cpu number {cpu}");
Err(VcpuError::VcpuUnhandledKvmExit)
}
VcpuExit::InternalError => {
METRICS.vcpu.failures.inc();
self.metrics.failures.inc();
error!("Received KVM_EXIT_INTERNAL_ERROR signal");
Err(VcpuError::VcpuUnhandledKvmExit)
}
@@ -495,7 +499,7 @@ impl Vcpu {
Ok(VcpuEmulation::Stopped)
}
_ => {
METRICS.vcpu.failures.inc();
self.metrics.failures.inc();
error!(
"Received KVM_SYSTEM_EVENT signal type: {}, flag: {}",
event_type, event_flags
@@ -504,7 +508,7 @@ impl Vcpu {
}
},
r => {
METRICS.vcpu.failures.inc();
self.metrics.failures.inc();
// TODO: Are we sure we want to finish running a vcpu upon
// receiving a vm exit that is not necessarily an error?
error!("Unexpected exit reason on vcpu run: {:?}", r);
@@ -523,7 +527,7 @@ impl Vcpu {
Ok(VcpuEmulation::Interrupted)
}
_ => {
METRICS.vcpu.failures.inc();
self.metrics.failures.inc();
error!("Failure during vcpu run: {}", e);
#[cfg(target_arch = "x86_64")]
{
@@ -731,7 +735,7 @@ impl Vcpu {
fn waiting_exit(&mut self) -> StateMachine<Self> {
// trigger vmm to stop machine
if let Err(e) = self.exit_evt.write(1) {
METRICS.vcpu.failures.inc();
self.metrics.failures.inc();
error!("Failed signaling vcpu exit event: {}", e);
}
@@ -765,11 +769,17 @@ impl Vcpu {
pub fn vcpu_fd(&self) -> &VcpuFd {
self.fd.as_ref()
}
pub fn metrics(&self) -> Arc<VcpuMetrics> {
self.metrics.clone()
}
}
impl Drop for Vcpu {
fn drop(&mut self) {
let _ = self.reset_thread_local_data();
let id: u32 = self.id as u32;
METRICS.write().unwrap().vcpu.remove(&id);
}
}
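
Note on the `Drop` hunk above: removing the map entry in `drop` ties the lifetime of a registry entry to the lifetime of the vCPU object. Below is a small sketch of that idea under stated assumptions. All names (`DemoVcpu`, `DemoRegistry`, `record_failure`, `demo`) are hypothetical, the registry is passed in rather than global, and registration happens in `new`, which this part of the diff does not show for the real code. Unlike the diff, the sketch skips removal when the lock is poisoned, so that `drop` cannot panic.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for `VcpuMetrics`.
#[derive(Default)]
pub struct DemoVcpuMetrics {
    pub failures: AtomicUsize,
}

/// Hypothetical registry type: vCPU id -> shared metrics.
pub type DemoRegistry = Arc<RwLock<HashMap<u32, Arc<DemoVcpuMetrics>>>>;

pub struct DemoVcpu {
    id: u32,
    metrics: Arc<DemoVcpuMetrics>,
    registry: DemoRegistry,
}

impl DemoVcpu {
    /// Assumption: the vCPU registers its own metrics on creation.
    pub fn new(id: u32, registry: DemoRegistry) -> Self {
        let metrics = Arc::new(DemoVcpuMetrics::default());
        registry.write().unwrap().insert(id, Arc::clone(&metrics));
        DemoVcpu { id, metrics, registry }
    }

    /// Hot path: no registry lock, only an atomic increment.
    pub fn record_failure(&self) {
        self.metrics.failures.fetch_add(1, Ordering::Relaxed);
    }
}

impl Drop for DemoVcpu {
    fn drop(&mut self) {
        // Skip removal if the lock is poisoned instead of panicking in drop.
        if let Ok(mut map) = self.registry.write() {
            map.remove(&self.id);
        }
    }
}

/// Returns the failure count seen through the registry while the vCPU is
/// alive, and whether the entry is still present after the vCPU is dropped.
pub fn demo() -> (usize, bool) {
    let registry: DemoRegistry = Arc::default();
    let failures;
    {
        let vcpu = DemoVcpu::new(0, Arc::clone(&registry));
        vcpu.record_failure();
        failures = registry.read().unwrap()[&0].failures.load(Ordering::Relaxed);
    } // `vcpu` is dropped here, which unregisters id 0.
    let still_registered = registry.read().unwrap().contains_key(&0);
    (failures, still_registered)
}
```

In this sketch a reader that holds a clone of the `Arc` keeps the counters alive even after the entry is removed, so a metrics snapshot taken during teardown still reads valid data.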

Some files were not shown because too many files have changed in this diff.