It's been released for a while now, and we need to keep consistency
between what we used.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c5cfad7023)
Signed-off-by: Greg Kurz <groug@kaod.org>
hub is now deprecated, which has been causing issues with our release
process.
Let's move to the GH cli (https://cli.github.com/manual), and unblock
this release.
**NOTE**: This commit is purposefully not touching anywhere else hub is
used, as that would require more time and investigation to do the
switch, and right now we just want to unblock the release.
Fixes: #8286
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 710eb8ab9d)
Signed-off-by: Greg Kurz <groug@kaod.org>
- 3.2 backport bunch
- stable-3.2: Backport everything related to the tests
- 3.2 backport | kata-deploy: Use host's systemctl
- 3.2 backport | dragonball: use version 0.10.4 of `fuse-backend-rs`
2cda69b284 release: Adapt kata-deploy for 3.2.0
93c7d165dc ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat
12b8cbb4f6 tests: Adjust timeout for agent stability test
37c99a46b1 tests: Enable agent stability test
92f283f062 runtime: Validate hypervisor section name in config file
8cf5506700 metrics: fixes common.sh function to always return true
544f261433 metrics: skips docker restart when it is not installed or is masked.
26c6ca93d3 metrics: removing trailing comma characters from json file.
0e0aabfd87 metrics: removal of reference in the documentation to the dax test.
5d911db5e2 tests: Remove unused function from scability test
a380437380 tests: Fix path for versions yaml for soak parallel test
4495a79721 tests: Enable scability test for stability CI
961daee983 scripts: Use install_yq from the `kata-containers` repo
9b48525af1 release: tag_repos: Stop tagging / updating the `tests` repo
668c8979f0 runtime: fix reading cgroup stats of sandboxes
11e2f2a458 versions: Bump virtiofsd to v1.8.0
9eb8723a5b clh: arm: Use static_sandbox_resource_mgmt=true
e7579d20f7 runtime/qemu: Rework QMP/HMP support
f0278f41d7 runtime/virtiofsd: Drop all references to "--cache=none"
4679aa7712 runtime/qemu: Pass "--xattr" to virtiofsd instead of "-o xattr"
03d712ab25 runtime: Allow virtio_fs_extra_args annotation
e0513094a0 runtime/vc: runPrestartHooks should ignore GetHypervisorPid failure
c17cbd30f0 runtime: fail early when starting docker container with FC
7e6f8010bd runtime: run prestart hooks before starting VM for FC
fa824af234 qemu: tdx: Workaround SMP issue with TDX 1.5
07471cd7a6 qemu: tdx: Adapt to the TDX 1.5 stack
2f28866f26 versions: tdx: Update Kernel to 6.2 + TDX
a36064c729 versions: tdx: Update TDVF to the "edk2-stable202302"
65e0b99eb4 versions: tdx: Update QEMU to v7.2 + TDX v1.10
9ce8ee6c0c runtime/fc: fix image/initrd annotation handling
f86bfe0da3 runtime/clh: fix image/initrd annotation handling
59fae423b5 runtime/qemu: fix image/initrd annotation handling
ef65c5767f kata-agent: use default filemode for block device when it is set to 0
93609aa0cd deps: Bump dependent crate versions
7ff98daecf gha: Add install dependencies for stability tests
ef49db59f7 gha: Add general dependencies to stability tests
a818f628d7 tests: Add soak parallel stability test
602c56c0d7 tests: Enable soak parallel test
a195539307 ci: k8s: set KUBERNETES default value
c4456c21d9 tests: run k8s-volume on a given node
58ad833300 tests: run k8s-file-volume on a given node
a54bdd00d5 tests: exec_host() now gets the node name
0eaf81c1a2 tests: add get_one_kata_node() to tests_common.sh
5f2c7c78ff ci: k8s: set KATA_HYPERVISOR default value
7fceb21362 ci: k8s: configurable deploy kata timeout
c4b0f1f31b ci: k8s: shellcheck fixes to gha-run.sh
6fb40ad47d kata-deploy: re-format kata-[deploy|cleanup].yaml
5cd2e947dc ci: k8s: run_tests() for kcli
56cebfb485 ci: k8s: add deploy-kata-kcli() to gh-run.sh
6b76d21568 ci: k8s: add cleanup-kcli() to gha-run.sh
308ce26438 ci: k8s: set default image for deploy_kata()
c3b91ed394 ci: k8s: create k8s clusters with kcli
33791f0944 metrics: stops kata components and k8s deployment when test finishes
621e6e6d8c gha: combine coco jobs into a single yaml
fe52c0900c gha: combine basic amd64 jobs into a single yaml
301a7d94e3 gha: ci: Revert tracing test PR to unbreak CI
c1da29b9b1 ci: Port runk tests to this repo
63be808730 ci: Add placeholder for runk tests
6541969a83 ci: Move tracing tests here
5d232c8143 ci: Add placeholder for tracing tests
619ef169fb ci: Create a function to install docker
16e31dd409 metrics: Use jq tool to pretty-print json metrics output
1f9a4e908f metrics: Enables FIO test for kata containers
fe4f72e0a1 gha: Add containerd stability tests to ci yaml
7963298ba2 gha: Add stability gha run script
a4e0929054 gha: Add stability tests workflow for gha
be3a3c221b gha: arm64: Ensure the builder is arm64-builder
f20164dc75 packaging: tools: Remove `set -x` leftover
1941d87b84 packaging: release: Mention newly added images
95da1c71ec packaging: tools: Fix container image env var name
508016fca1 packaging: Allow passing the TOOLS_CONTAINER_BUILDER
bb1efe0d46 packaging: stable-3.2: Remove everything related to agent policy
892c9f2f03 gha: Build the kata-agent as part of our workflows
a586b8c581 packaging: Build the kata-agent
766a5fa118 agent: Allow specifying DESTDIR and AGENT_POLICY via env vars
050a4260b9 packaging: Add get_agent_image_name()
3770b200a8 gha: Fix k0s deployment
cf254bc4ee tests: Add general stability fixes
1edf2d9bc1 tests: Add agent stability test
a8eec39559 tests: Add cassandra stress in stability tests
240c584ae2 tests: Add stressng dockerfile for stability tests
e95d3b1be5 tests: Add stressor CPU test for stability tests
4393f553e9 metrics: Add stability test for kata CI
362adea8cd metrics: Fix general check static warnings
16c349e76c docs: Update url in kata vra document
5800be5029 ci: Build src/tools components as part of our tests / releases
41b509e0a6 kata-deploy: Build components from src/tools
a5d7ba6662 static-build: Add scripts to build content from src/tools
d503daf75e packaging: Add get_tools_image_name()
b2e432c024 packaging: Use git abbreviated hash
c22fdb46e3 metrics: Increase qemu jitter value
8a1af8689b metrics: Increase jitter value for clh
f3fcf6cbf9 metrics: Add checkmetrics for latency test
ce03e9f97a metrics: Add qemu latency value limit
cd82a351bd metrics: Add latency value limits for kata CI
1709f99975 ci: kata-monitor: Move tests over
a50c7f1972 ci: Add placeholder for kata-monitor tests
c42d19619d ci: Make install_kata aware of container engines
5017435734 ci: Create a generic install_crio function
98e9434be4 ci: Add install_cni_plugins helper
c61b488b66 ci: Modify containerd default config
7c4617cfac metrics: Add init_env function to latency test
e106ecd1e4 metrics: Fix latency yamls path
665805c81c metrics: Fix spelling warnings
b0c9b4254b metrics: Fix metrics README
c28a0a03f0 metrics: Fix C-Ray documentation
48a9b4ab13 ci: crio: Trail '\r' from exec_host() output
2de1c8bac2 ci: crio: Enable default capabilities
d1d3c7cbda kata-deploy: Fix CRI-O detection
0de3216b08 kata-deploy: Add k0s support
468a3218f5 ci: crio: Pass `-y` to apt
3f2780fca6 metrics: Add latency benchmark for gha
73a084a7d4 metrics: Enable latency test in gha run script
cf3abd308f local-build: Fix .docker ownership before build-payload
8b607ff79a gha: Add pandoc as a dependency for static checks
6a9384ed40 gha: Install hunspell for static checks
a11e8867af ci: Trigger payload-after-push on workflow_dispatch
390bde3182 ci: Actually enable the CRI-O tests
f2953e6448 ci: k8s: rke2: Use sudo to call systemd
08bdb6b5da ci: k8s: Add a CRI-O test
b41fa6d946 ci: k8s: Add a method to install CRI-O
67fef9d5c6 ci: k8s: k0s: Allow passing parameters to the k0s installer
2c3f130c85 ci: kata-deploy: Fix runner name
7a8d848a92 ci: Enable kata-deploy tests for all the supported k8s flavours
7fc2f7d003 ci: kata-deploy: Add the ability to deploy rke2
59a4b00d29 ci: kata-deploy: Add the ability to deploy k0s
1a605c33ad ci: kata-deploy: Add deploy-k8s argument to gha-run.sh
19ee6c9fd7 ci: kata-deploy: Expland tests to run on k0s / rke2
03a8bed32b ci: kata-deploy: Add placeholder for tests on GARM
f09c255766 ci: kata-deploy: Export KUBERNETES env var
abe9dc9904 ci: Move deploy_k8s() to gha-run-k8s-common.sh
ea6489653e ci: Properly set K8S_TEST_UNION
7892e04dd1 ci: Add first letter of the K8S_TEST_HOST_TYPE to resource group name
882d7d7d89 ci: Create clusters in individual resource groups
b09a3f8f8e metrics: Add parallel bandwidth limit for qemu
63e8c38a7a metrics: Enable parallel bandwidth iperf limit
f3c42ff5fe nydus: Temporarily skip tests on dragonball
49c1a37330 nydus: Use `kata-${KATA_HYPERVISOR}` instead of `kata`
ae55c0b510 static-build: Fix arch error on nydus build
65e5bfe9eb tests: nydus: Update nydus tests
079ab1e0ac versions: Bump nydus and nydus-snapshotter to its latest release
d9e910702b gha: nydus: Populate run()
33a4427845 gha: nydus: Populate install_dependencies()
70c1c7d868 gha: nydus: Actually install kata when `install-kata` is called
30efa3e563 gha: nydus: Get rid of nydus{,-snapshotter} install from nydus_test.sh
9ad6000676 tests: nydus: Add timeout to the crictl calls
6d9b8e2437 tests: nydus: Add uid / namespace to the nydus container / sandbox
fd5935da9d tests: nydus: Decorate some calls with `sudo`
4b58777eec tests: nydus: Adapt "source ..." to GHA
82c531978f tests: nydus: Adapt check to "clh" instead "cloud-hypervisor"
4915605b20 tests: common: Add install_nydus_snapshotter()
8e4180f697 tests: common: Add install_nydus()
625a05aa2a ci: static-checks: Clean up static-checks job
9784ded336 ci: static-checks: Run tests depending on KVM
668b7effb4 ci: static-checks: Move "sudo make test" to the new test matrix
4b660a4991 ci: static-checks: Move "make test" to the new test matrix
9e614ce466 runtime-rs: Ensure static-checks-build is a dep of `make test`
d5d21f4cb4 kata-ctl: Use `loop` instead of `kvm` module in tests
93577381a5 kata-ctl: Ensure GENERATED_CODE is a dep of `make test`
93440dc141 agent: Ensure GENERATED_CODE is a dep of `make test`
d269f09a66 ci: install_libseccomp: Do not depend on the tests repo
bb920178ad ci: static-checks: Move "make check" to the new test matrix
d6996d01c0 kata-ctl: Add `kata-types` to the Cargo.lock file
a62e18b27f kata-ctl: Ensure GENERATED_CODE is a dep of `make check`
cd6ab3cf07 tests: install_rust: Also install clippy
d288e1ab87 ci: static-checks: Move vendor check to its own job
755057c9ed tests: Move install_rust.sh from the tests repo
d3a04b7b8f tests: install_go: Remove tests repo dependency
c18c412db7 tests: Move functions from kata_arch script here
bb8d1be300 ci: static-checks: Move kernel config check to its own job
7c4a0f7fac ci: Use variable size of VMs depending on the tests running
7019a25f25 ci: cache: Fix ovmf-sev cache
dc9f2c24f1 ci: cache: Check the sha256sum of the component
a55c082fa1 ci: cache: Remove the script used to cache artefacts on Jenkins
e464bbfc93 ci: cache: Also store the ${component} sha256sum
b5da4ce0d8 ci: cache: Use the cached artefacts from ORAS
2f280659b1 ci: k8s: Temporarily disable tests that require a bigger VM instance
f160effaee ci: cache: Push cached artefacts to ghcr.io
6f8ded36b6 kata-deploy: Generate latest_{artefact,image_builder} files
0210db6e34 ci: cache: Install ORAS in the kata-deploy binaries builder container
27dd77469d ci: k8s: devmapper: Use a smaller / cheaper VM instance
3b64c8d687 ci: nydus: Use a smaller / cheaper VM instance
03857041e4 ci: nerdctl: Use a smaller / cheaper VM instance
301edcb92e ci: docker: Use a smaller / cheaper VM instance
594fcdce56 ci: cri-containerd: Use a smaller / cheaper VM instance
fa9dd46041 ci: k8s: Don't set cpu limit request for k8s-inotofy test
767ccb117f ci: Reduce the size of the AKS VMs
054895fcdd ci: cache: For consistency, read all used env vars
5e22a3085b ci: cache: Pass the exposed env vars to the kata-deploy binaries in docker
bda0354491 ci: cache: Export env vars needed to use ORAS
c78f740854 metrics: Add iperf cpu utilization limit for qemu
73e989c4b1 metrics: Add iperf value for cpu utilization
1c32b31589 tests: Apply timeout to 'ctr t kill'
1d78871713 tests/vfio: Bump VM image to Fedora 38
b40a42699d tests/vfio: Accept single device in vfio group for CLH
82a0225159 tests/vfio: Get rid of sync's
a1aed0c78e gha: vfio: Set test timeout to 15m
32be55aa8a packaging: kernel: Enable VIRTIO_IOMMU on x86_64
3b5c5bcfa4 runtime: clh: Support enabling iommu
a0f59829b2 tests/vfio: Give commands 30s to execute
65943d5b77 tests/vfio: Configure a value for 'hot_plug_vfio' for both vmms
18a8b8df03 runtime: Remove redundant check in checkPCIeConfig
d86af5923f runtime: Add test cases for checkPCIeConfig
0a918d0d20 runtime: Check config for supported CLH (cold|hot)_plug_vfio values
86201ace5a runtime: clh: Add hot_plug_vfio entry to config
01265fb217 tests/vfio: Gather debug info and disable tdp_mmu
44f37f689a tests/vfio: Capture journal from vm
a69d0d1772 tests/vfio: Change to get the test working in GHA
e90027f38c tests/vfio: Move dependency installation to gha-run.sh
62804d637c gha: vfio: Import jobs scripts from tests repo
97283b18b4 metrics: Increase jitter value for qemu
3c5bd8c44d metrics: Increase value limit for jitter in clh
6abf513f06 ci: docker: nerdtl: Use io.containerd.kata-${KATA_HYPERVISOR}.io
9a664ea8bb ci: nerdctl: Create the containerd config
5734c4cbca ci: nerdctl: Switch to tcp port 80 ping
55c8a47a40 ci: docker: Switch to tcp port 80 ping
31c3d9bd80 metrics: Add iperf bandwidth value for qemu
40ae855f0e metrics: Add iperf bandwidth value for kata metrics
deadacd58f metrics: Ensure docker is running in init_env
31c33f9c1c metrics: Add Cassandra Metrics documentation
0968bf1eb9 metrics: this PR skips the FIO test temprarily to fix issues
e5e3951398 ci: docker: Also run the smoke test with runc
c7147dabce ci: docker: Run the tests after the kata-static is created
33430ad60c ci: Add a very basic nerdctl sanity test
69dd11f459 ci: Add a very basic docker sanity test
fcfa6c6e1a ci: use github.ref_name instead of $GITHUB_REF_NAME
19d9fd9eb1 ci: Add more target-branch related fixes
fe4247a90c ci: Fix target-branch usage
9f510d059b metrics: Remove warning from metrics documentation
400418bce0 kata-deploy: Remove curl after it's used
1df997c38c kata-deploy: Fix aarch64 image build
61b1a99fca gha: Manually rebase PR atop of the target branch before testing
db563709e3 kata-deploy: Switch to an alpine image
bb5dbfbbce k8s: ci: Skip "Pod quota" test with firecracker
263ed4afd1 ci: k8s: Remove useless skip statement from tests
7e135294a7 ci: k8s: Also check for "fc" (for firecracker)
8892d9a7b2 ci: k8s: Add clean-up-garm argument for gha-run.sh
c723a7d9c8 ci: k8s: devmapper tests should be using ubuntu 20.04
aee6f36c86 ci: k8s: Add a kata-deploy-garm target
5bb77b628d ci: k8s: Export KUBERNETES env var
7ce5c8b3fa ci: k8s: Install bats on GARM runners
9fb291d88a ci: k8s: Wait some time after restarting k3s
053308eefc metrics: fix FIO test initialization
89345b6731 ci: k8s: Append, instead of overwrite, the devmapper config
bb675f8101 ci: k8s: Decrease k3s sleep from 4 to 2 minutes
695c7162ef ci: k8s: Use vanilla kubectl with k3s
7f865be398 ci: k8s: Ensure k3s is deploy with --write-kubeconfig-mode=644
7a96d0a589 ci: k8s: Use the proper command for sleep
92fdaf9719 metrics: Use TensorFlow optimized image
1b7ffeac53 ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml
79de72592f ci: k8s: Add k8s devmapper tests (part 0)
a41a56e326 ci: k8s: Add a function to configure devmapper for containerd
315288a000 ci: k8s: Add a function to deploy k3s
899c823c0b packaging: do not install docker-compose-plugin for s390x|ppc64le
374e77d330 metrics: Add write 95 percentile for FIO for qemu
22ce1671a6 metrics: Add write 95 percentile FIO value
5e90c8e176 metrics: Add checkmetrics to gha run script
651b89ba41 metrics: Add checkmetrics value for qemu for iperf
907baa3464 metrics: Add jitter value for clh
d9408a7283 metrics: Add test selector to iperf metrics
3583f373f5 metrics: Enable iperf benchmark on gha for kata metrics
7fd7186780 CI: switch static-checks-dragonball CI machines to Azure
9b6c5eaff1 kata-deploy: Create kata-static.tar with correct ownership
4403af74ec metrics: re-enable memory-usage initialization step
d2d7c041f3 metrics: fix parsing issue on memory-usage test
8c7a4fd121 gha: Rebase atop of the target branch
75dcca5a53 metrics: Add grabdata script for metrics report
59e7c3a347 gha: Update to checkout@v3 action
8f1cc278ca metrics: Add report generator link to general documentation
05180b61a0 metrics: Add README for kata metrics report
17c88a1a7f metrics: Add limit for 90 percentile for qemu value
dbb4761c4b metrics: Add limit for write 90 percentile value for clh
aebf392e45 metrics: Enable FIO limits for kata metrics
41d05b8857 metrics: Fix memory footprint qemu limit
3491407581 metrics: Fix memory inside limits for kata metrics
08027f2282 metrics: Add test setup details to metrics report
99103db1fb metrics: Add boot lifecycle times to metrics report
75c92ba474 metrics: Add memory inside container to metrics report
1c1eb98107 metrics: Add scaling system footprint in metrics report
01f6e6a1a3 metrics: Add metrics reportgen
428eb6908d metrics: Add report file titles
a8fa3d99da metrics: Generate PNGs alongside the PDF report
80625ed573 metrics: Add metrics report R files
9f8e194e6f metrics: Add report dockerfile
03c206f87f metrics: Add metrics report script
2684b267f7 tests: Expand confidential test to support TDX
4976629aee tests: Expand confidential test to support SNP
019849071e tests: Add confidential test for SEV
1b7c7901d9 local-build: Remove $HOME/.docker/buildx/activity/default
6a34bae03d gha: Avoid "fail-fast" in tests that are known to be flaky
17d22cae34 tests: use unique test name
e8c24fa0b9 tests: delete k8s deployment at the test's end
3e07c89d39 metrics: Remove unused variable in tensorflow nhwc script
5b9a69433d kata-deploy: Don't try to remove /opt/kata
e99a13d26c gha: vfio: Run on Ubuntu 23.04 runner
394d146b89 local-build: Remove GID before creating group
7421737229 metrics: Add TensorFlow ResNet50 fp32 Dockerfile
9acbf2faf7 metrics: Add TensorFlow ResNet50 FP32 benchmark
4f2c9372c3 kata-deploy: Avoid failing on content removal
6ea1d3bffd metrics: Add disk link to README
ad2036927f metrics: Fix FIO path
abcb225ce3 metrics: Use function from metrics common in pytorch script
508f1bba15 gha: capture additional kata-deploy output
d46c300608 metrics: Enable kata runtime in K8s for FIO test.
3d3882a06a metrics: Update tensorflow name in gha run script
7d0a3dbf24 metrics: Fix check results for tensorflow benchmark
3e2a383b7d gha: kata-deploy: Do the runtime class cleanup as part of the cleanup
2c5db14a1a gha: kata-deploy: Add the first kata-deploy test
0b4fb826de metrics: Remove unused variable in tensorflow mobilenet script
b38624e2b3 tests: common: Ensure test_type is used as part of the cluster's name
cdfcd9aba8 tests: commob: Don't fail if yq is not part of the cache
74edbaac96 gha: kata-deploy: Add run-kata-deploy-tests.sh
d7130f48b0 gha: k8s: Stop running kata-deploy tests as part of the k8s suite
810507e8a3 tests: k8s: Call ensure_yq() in setup.sh
915bace795 kata-deploy: Properly create default runtime class
870d8004a0 metrics: Fix MobileNet help me description
145450544d gha: ci: Start running kata-deploy tests
bd29413721 docs: Fix TensorFlow word across the document
a845e94139 docs: Add Tensorflow Resnet50 documentation
6e5a5b8249 metrics: Add Dockerfile for ResNet50 int8
5d85cac1d6 metrics: Add Tensorflow ResNet50 int8 benchmark
7474e50ae2 gha: cri-containerd: Enable tests
20be3d93d5 gha: cri-containerd: Add timeout to the crictl calls on testContainerStop
10058f718a gha: cri-containerd: Show pod before deleting it
585d5fba03 gha: cri-containerd: Print kata logs in case of error
2fea5a5f8b gha: cri-containerd: Group containerd logs
3c7597f4ba gha: cri-containerd: Ensure RUNTIME takes KATA_HYPERVISOR into account
738d808cac metrics: Rename tensorflow scripts
4bb8fcc0c0 tests: kata-deploy: Add placeholder for kata-deploy-tests-on-tdx
f5e14ef283 tests: kata-deploy: Add placeholder for kata-deploy-tests-on-aks
e812c437fe tests: kata-deploy: Add functional/kata-deploy/gha-run.sh placeholder
c19cebfa80 tests: Add gha-run-k8s-common.sh
4e8c512346 metrics: fix the loop used to stop kata components #762947f32c4983 metrics: Add cassandra statefulset yaml
d5a14449fc metrics: Add cassandra service yaml
1292b51092 metrics: Add block loop pvc yaml for cassandra
105a556a30 metrics: Add block loop pv yaml for cassandra test
1b126eb4ce metrics: Add block loop pvc for cassandra test
671ad98451 metrics: Add Cassandra Kubernetes benchmark for kata metrics
058b304455 gha: static-checks: Move to the Azure instances
b600659df2 metrics: Add check containers are running in tensorflow mobilenet
1b30aa818e metrics: Add check containers are up in tensorflow script
3502bb4b20 metrics: Remove unused variable in tensorflow script
b07c19eb5f metrics: Add check containers are running function
fc89392745 metrics: Add check containers are up in tensorflow mobilenet script
73843b786d metrics: Use check containers are up in tensorflow script
7fffa7f9ce metrics: Add check containers are up in common script
1b68145b6a metrics: Use collect_results function in tensorflow mobilenet test
f29f811470 metrics: Remove collect results function definition
6b6a6ee724 metrics: Add common functions to the common script
a341c2f324 metrics: compute tensorflow statistics
b8b4ca10e9 ci: unencrypted-image: Fix build context
dcc35781f7 ci: unencrypted-image: Don't fail to build on s390x
babbd4186c ci: create-confidential-image: Add dependent actions
cecb30dbb2 metrics: Add nginx documentation to network README
1971fe4986 metrics: Add nginx kubernetes yaml
6c921ce3db metrics: Add network nginx benchmark
a5a3e4124f ci: k8s: tees: Ensure PR_NUMBER is exported
3a21c485bf ci: {{ pr-number }} should be {{ inputs.pr-number }}
218d83bd3f tests: k8s: Ensure the runtime classes are properly created
0625d8dfc1 ci: Add build-and-publish-tee-confidential-unencrypted-image
6ae591c618 ci: k8s: Add the image used for unencrypted confidential tests
8d4f9ef256 tests: upgrade bats version
a484666890 metrics: install kata once and run multiple checks
759b0fa385 metrics: General improvements to mobilenet tensorflow test
d6398ccf9e metrics: Add iperf to gha run script
a75db20167 gha: Add iperf network metrics
b33d4de013 metrics: Add latency test to network README
db23b95b53 metrics: Add latency server yaml
2b60fe0fe0 metrics: Add latency client yaml
aa71d6f931 metrics: Add network latency test
b2c627aac9 metrics: Improve naming testing containers in launch times test
ea1fdd2cb9 metrics: Clean kata components before start a metric test.
7d5f65be7c kata-deploy: Use host's systemctl
2881bad407 dragonball: use version 0.10.4 of `fuse-backend-rs`
Signed-off-by: Greg Kurz <groug@kaod.org>
kata-deploy files must be adapted to a new release. The cases where it
happens are when the release goes from -> to:
* main -> stable:
* kata-deploy-stable / kata-cleanup-stable: are removed
* stable -> stable:
* kata-deploy / kata-cleanup: bump the release to the new one.
There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.
Signed-off-by: Greg Kurz <groug@kaod.org>
Previously, if you accidentally modified the name of the hypervisor
section in the config file, the default golang runtime gives a cryptic
error message ("`VM memory cannot be zero`"). This can be demonstrated
using the `kata-runtime` utility program which uses the same golang
config package as the actual runtime (`containerd-shim-kata-v2`):
```bash
$ kata-runtime env >/dev/null; echo $?
0
$ sudo sed -i 's!^\[hypervisor\.qemu\]!\[hypervisor\.foo\]!g' /etc/kata-containers/configuration.toml
$ kata-runtime env >/dev/null; echo $?
VM memory cannot be zero
1
```
The hypervisor name is now validated so that the behaviour becomes:
```bash
$ kata-runtime env >/dev/null; echo $?
0
$ sudo sed -i 's!^\[hypervisor\.qemu\]!\[hypervisor\.foo\]!g' /etc/kata-containers/configuration.toml
$ ./kata-runtime env >/dev/null; echo $?
/etc/kata-containers/configuration.toml: configuration file contains invalid hypervisor section: "foo"
1
```
Fixes: #8212.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
(cherry picked from commit 3e8cf6959c)
Signed-off-by: Greg Kurz <groug@kaod.org>
This PR corrects the init env() helper function, to make that
systemctl always returns true when enumerating masked services,
and preventing the test from failing
Fixes: #8242
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit 4f9681b411)
Signed-off-by: Greg Kurz <groug@kaod.org>
To avoid errors when initializing the test environment, the
kill_processes_before_start() helper function needs to verify that
docker is installed before attempting to stop it.
Fixes: #8218
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit 908519db9d)
Signed-off-by: Greg Kurz <groug@kaod.org>
This PR removes trailing commas so that the json results
file is valid.
This PR also changes the way data results are collected by
terating through the array of memory values to calculate
their average.
Fixes: #8204
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit c2763120aa)
Signed-off-by: Greg Kurz <groug@kaod.org>
This PR removes the reference in the documentation to the DAX
subtest of the FIO benchmark, because this metric is currently
WIP.
Fixes: #8159
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit 89c9454fca)
Signed-off-by: Greg Kurz <groug@kaod.org>
As the file is already part of the kata-containers repo, and the tests
repo is about to become read-only, we're good to drop the tests
references from here and use everything coming from the
`kata-containers` repo instead.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit fbc8f8f466)
Signed-off-by: Greg Kurz <groug@kaod.org>
As we've moved all the tests to the `kata-containers` repo, the `tests`
repo will become a read-only repo.
Fixes: #8200
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 65b1a2d277)
Signed-off-by: Greg Kurz <groug@kaod.org>
The cgroup stats come from resourcecontrol package in the form of pointers
to structs. The sandbox Stat() method incorrectly was expecting structs.
This caused the cpu and memory stats to always be 0, which in turn caused
incorrect pod overhead metrics.
Fixes#8035
Signed-off-by: Peteris Rudzusiks <rye@stripe.com>
(cherry picked from commit 94e2ccc2d5)
Signed-off-by: Greg Kurz <groug@kaod.org>
Users have noticed that this is needed, as CLH does not yet implement a
way to hotplug resources on aarh64.
With this patch, when building for x86_64, I can see the this is the
resulting config:
```
$ ARCH=amd64 make
...
$ cat config/configuration-clh.toml | grep static_sandbox_resource_mgmt
static_sandbox_resource_mgmt=false
```
And when building for aarch64:
```
$ ARCH=arm64 make
...
$ cat config/configuration-clh.toml | grep static_sandbox_resource_mgmt
static_sandbox_resource_mgmt=true
```
Fixes: #7941
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 72599f1911)
Signed-off-by: Greg Kurz <groug@kaod.org>
PR #6146 added the possibility to control QEMU with an extra HMP socket
as an aid for debugging. This is great for development or bug chasing
but this raises some concerns in production.
The HMP monitor allows to temper with the VM state in a variety of ways.
This could be intentionally or mistakenly used to inject subtle bugs in
the VM that would be extremely hard if not even impossible to debug. We
definitely don't want that to be enabled by default.
The feature is currently wired to the `enable_debug` setting in the
`[hypervisor.qemu]` section of the configuration file. This setting has
historically been used to control "debug output" and it is used as such
by some downstream users (e.g. Openshift). Forcing people to have the
extra HMP backdoor at the same time is abusive and dangerous.
A new `extra_monitor_socket` is added to `[hypervisor.qemu]` to give
fine control on whether the HMP socket is wanted or not. This setting
is still gated by `enable_debug = true` to make it clear it is for
debug only. The default is to not have the HMP socket though. This
isn't backward compatible with #6416 but it is for the sake of "better
safe than sorry".
An extra monitor socket makes the QEMU instance untrusted. A warning is
thus logged to the journal when one is requested.
While here, also allow the user to choose between HMP and QMP for the
extra monitor socket. Motivation is that QMP offers way more options to
control or introspect the VM than HMP does. Users can also ask for
pretty json formatting well suited for human reading. This will improve
the debugging experience.
This feature is only made visible in the base and GPU configurations
of QEMU for now.
Fixes#7952
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 1f16b6627b)
Signed-off-by: Greg Kurz <groug@kaod.org>
This syntax belongs to the legacy C virtiofsd implementation that
we don't support anymore since kata-containers 3.1.3 because
of other API breaking changes.
People have been warned to switch from "none" to "never" since
kata-containers 2.5.2. Let's officially do that.
The compat code that would convert "none" to "never" isn't
needed anymore. Just drop it.
Fixes#7864
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 72c510d057)
Signed-off-by: Greg Kurz <groug@kaod.org>
The "-o" syntax belongs to the legacy C virtiofsd. It is deprecated
with the rust implementation.
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 81536f21af)
Signed-off-by: Greg Kurz <groug@kaod.org>
Some use cases may just require passing extra arguments to virtiofsd,
and having this disabled by default makes it impossible to set when
using kata-deploy, as changes in the configuration file would be
overwritten by the daemon-set.
With this in mind, let's allow users to pass whatever thet need (and
here I'm specifically looking at `--xattr`) as a virtio_fs_extra_arg.
Fixes: #7853
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit b1dd09a4d3)
Signed-off-by: Greg Kurz <groug@kaod.org>
If we are running FC hypervisor, it is not started when prestart hooks
are executed. So we should just ignore such error and just go ahead and
run the hooks.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
(cherry picked from commit 2e4c874726)
Signed-off-by: Greg Kurz <groug@kaod.org>
FC does not support network device hotplug. Let's add a check to fail
early when starting containers created by docker.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
(cherry picked from commit 21204caf20)
Signed-off-by: Greg Kurz <groug@kaod.org>
Add a new hypervisor capability to tell if it supports device hotplug.
If not, we should run prestart hooks before starting new VMs as nerdctl
is using the prestart hooks to set up netns. To make nerdctl + FC
to work, we need to run the prestart hooks before starting new VMs.
Fixes: #6384
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
(cherry picked from commit 32fd013716)
Signed-off-by: Greg Kurz <groug@kaod.org>
Right now if we configure an image annotation and have a config file
setting initrd, the initrd config would override the image annotation.
Make sure annotations are preferred over config options in image and initrd
path handling.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
(cherry picked from commit 18d42da21e)
Signed-off-by: Greg Kurz <groug@kaod.org>
We should make sure annotations are preferred over
config options in image and initrd path handling.
Fixes: #7705
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
(cherry picked from commit 9fda7059a5)
Signed-off-by: Greg Kurz <groug@kaod.org>
Right now if we configure an image annotation and have a config file
setting initrd, the initrd config would override the image annotation.
Add a helper function ImageOrInitrdAssetPath to make sure annotations
are preferred over config options in image and initrd path handling.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
(cherry picked from commit 1a0092d631)
Signed-off-by: Greg Kurz <groug@kaod.org>
When the FileMode field for the device is unset (0), use a default value instead
to allow the use of the device from the container.
This behaviour is seen from cri-o typically.
Note: this is what runc is doing, which is why regular containers don't have an
issue. This change makes sure kata behaves the same as runc.
Fixes: #7717
Signed-off-by: Julien Ropé <jrope@redhat.com>
(cherry picked from commit 40914b25d4)
Signed-off-by: Greg Kurz <groug@kaod.org>
This pull request is mainly for updating vm-memory and vmm-sys-util.
The affacted crates include:
- vm-memory: from 0.9.0 to 0.10.0
- vmm-sys-util: from 0.10.0 to 0.11.0
- virtio-queue: from 0.6.0 to 0.7.0
- fuse-backend-rs: from 0.10.4 to 0.10.5
- linux-loader: from 0.6.0 to 0.8.0
- nydus-api: from 0.3.0 to 0.3.1
- nydus-rafs: from 0.3.1 to 0.3.2
- nydus-storage: from 0.6.3 to 0.6.4
Fixes: #0000
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
(cherry picked from commit b23c5ed155)
Signed-off-by: Greg Kurz <groug@kaod.org>
The KUBERNETES variable is mostly used by kata-deploy whether to apply
k3s specific deployments or not. It is used to select the type of
kubernetes to be installed (k3s, k0s, rancher...etc) and it is always
set on CI. Running the script locally we want to set a value by default
to avoid `KUBERNETES: unbound variable` errors.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit e669282c25)
This test can give false-positive on a multi-node cluster. Changed it to
use the new get_one_kata_node() and the modified exec_host() to run the
setup commands on a given node (that has kata installed) and ensure the
test pod is scheduled at that same node.
Fixes#7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit c30c3ff185)
This test can give false-positive on a multi-node cluster. Changed it to
use the new get_one_kata_node() and the modified exec_host() to run the
setup commands on a given node (that has kata installed) and ensure the
test pod is scheduled at that same node.
Fixes#7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 666993da8d)
The exec_host() simply fails on cluster with multi-nodes because
`kubectl get node -o name" will return a list o names. Moreover, it will
return control nodes names which usually don't have kata installed.
Fixes#7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 3a00fc9101)
The introduced get_one_kata_node() returns the first node that
has the kata-runtime=true label, i.e., supposedly a node with
kata installed.
This is useful for tests that should run on a determined worker
node on a multi-nodes cluster.
Fixes#7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 61c9c17bff)
Let KATA_HYPERVISOR be qemu by default in gh-run.sh as this variable
is required to tweak some configurations of kata-deploy.
Fixes#7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 68f083c4d0)
The deploy-kata() of gha-run.sh will wait for 10 minutes for the kata
deploy installation finish. This allow users of the script to overwrite
that value by exporting the KATA_DEPLOY_WAIT_TIMEOUT environment
variable.
Fixes#7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 6677a61fe4)
Fixed a couple of warns shellcheck emitted and disabled others:
* SC2154 (var is referenced but not assigned)
* SC2086 (Double quote to prevent globbing and word splitting)
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 200e542921)
The .tests/integration/kubernetes/gh-run.sh script run `yq write` a
couple of times to edit the kata-[deploy|cleanup].yaml, resulting
on the file being formatted again. This is annoying because leaves
the git tree dirty.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 4af78be13a)
The only difference to the other platforms is that it needs to
export KUBECONFIG.
Fixes#7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit d54e6d9cda)
The cleanup-kcli() behaves like other deploy kata for
bare-metal (e.g. sev, tdx...etc) except that KUBECONFIG
should be exported.
Fixes#7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit c2ef1f0fb0)
The cleanup-kcli() behaves like other clean up for bare-metal (e.g. sev,
tdx...etc) except that KUBECONFIG should be exported.
Fixes#7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit d2be8eef1a)
On CI workflows the variables DOCKER_REGISTRY, DOCKER_REPO and
DOCKER_TAG are exported to match the built image. However, when running
the script outside of CI context, a developer might just use the latest
image which in this case will be
`quay.io/kata-containers/kata-deploy-ci:kata-containers-latest`.
Fixes#7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit cbb9aa15b6)
Adapted the gha-run.sh script to create a Kubernetes cluster locally
using the kcli tool.
Use `./gha-run.sh create-cluster-kcli` to create it, and
`./gha-run.sh delete-cluster-kcli` to delete.
Fixes#7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 89bef7d036)
This PR adds a trap whenever the scrip exits, it deletes the iperf
k8s deployment and k8s services, and deletes the kata components.
This way, when the script finishes, it verifies that there are
indeed no kata components still running.
Fixes: #8126
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit bba34910df)
So that we don't risk exceeding the GHA 20 rerefenced yaml files limit
that easy.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
(cherry picked from commit 954d40cce5)
GHA has an undocumented limitation that there can be at most 20
referenced yamls in a single yaml file. We workaround it by combining
multiple jobs into a single yaml file.
Fixes: #8161
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
(cherry picked from commit b60e0a9b57)
I'm basically moving the runk tests from the tests repo to this one, and
I'm adding the "Signed-off-by:" of every single contributor the tests.
Fixes: #8116
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Chen Yiyang <cyyzero@qq.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit da91c9df88)
The runk test has been executed as part of the former "ubuntu" jenkins
CI.
We're porting it to GHA and running it against LTS containerd.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 7f23772763)
The tracing tests are currently running as part of the Jenkins CI with
the following setups:
* Container Engines: containerd
* VMMs: QEMU | Cloud Hypervisor
* Snapshotters: overlayfs | devmapper
We'll be restricting those tests to be running on LTS version of
containerd, without devmapper.
As it's known due to our GHA limitation, this is just a placeholder and
the tests will actually be added in the next interations.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 3bb2923e5d)
This PR enables the use of jq pretty-print feature to
improve the formatting of metric results json files.
Fixes: #8081
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit 8c498ef5ee)
FIO benchmark is enabled to measure IO in Kata
at different latencies using containerd client,
in order to complement the CI metrics testing set.
This PR asl deprecated the previous Fio bench
based on k8s.
Fixes: #8080
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit a2159a6361)
Otherwise we'll use any arm64 machine that's added as a runner, and
whenever new machines are added those may end up being only used for
running some specific set of the tests.
Fixes: #8109
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 119f03de26)
This was used for debugging, and ended up being merged with that.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 560bbffb57)
We've added two new containerd builder images recently, one for the
components under `src/tools` and another one for the Kata Containers
agent.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 18fa483d90)
This should be TOOLS_CONTAINER_BUILDER instead of
VIRTIOFSD_CONTAINER_BUILDER.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ca3b888371)
This follows what we've been doing for all the components we're
building, but was missed as part of #8077.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 5ca66795c7)
The kata-agent binary won't be released, just built so it can be used,
later on, as part of our tests and as part of the rootfs build.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 02acef9575)
Let's add the needed functions to start building the kata-agent, with or
without the OPA support.
For now this build is not used as part of the rootfs build, but later on
this will (not as part of this series, though).
Fixes: #8099
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 5208386ab1)
This will help to build the agent binary as part of the kata-deploy
localbuild, as we need to pass the DESTDIR to where the agent will be
installed, and also whether we're building the agent with policy support
enabled or not.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 1727487eef)
Conflicts:
src/agent/Makefile
The conflict happened as AGENT_POLICY is not part of the stable-3.2,
thus I've manually removed it.
The tests are failing when setting up k0s, and that happens because we
download a kubectl binary matching the kubernetes version k0s is using,
and we do that by:
```
sudo k0s kubectl version --short 2>/dev/null | ...
```
With kubectl 1.28, which is now the default on k0s, `kubectl version
--short` has been removed, leading us to an empty stringm causing then
the error in the CI.
Fixes: #8105
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 70e7ec3e23)
Let's add targets and actually enable users and oursevles to build those
components in the same way we build the rest of the project.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 501a168a81)
As we'd like to ship the content from src/tools, we need to build them
in the very same way we build the other components, and the first step
is providing scripts that can build those inside a container.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 6ef42db5ec)
This will be used for building all the (rust) components from src/tools.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 4d08ec29bc)
This will make it easier to build images that rely on several
directories hashes.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 98097c96de)
The kata-monitor tests is currently running as part of the Jenkins CI
with the following setups:
* Container Engines: CRI-O | containerd
* VMMs: QEMU
When using containerd, we're testing it with:
* Snapshotter: overlayfs | devmapper
We will stop running those tests on devmapper / overlayfs as that hardly
would get us a functionality issue.
Also, we're restricting this to run with the LTS version of containerd,
when containerd is used.
As it's known due to our GHA limitation, this is just a placeholder and
the tests will actually be added in the next iterations.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit a3fb067f1b)
This will serve us quite will in the upcoming tests addition, which will
also have to be executed using CRi-O.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit de1eeee334)
This will become handy when doing tests with CRI-O, as CRI-O doesn't
install the CNI plugins for us.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 64a2000859)
Let's ensure we have runc running with `SystemdCgroups = false`,
otherwise we'll face failures when running tests depending on runc on
Ubuntu 22.04, woth LTS containerd.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 8132fe15c9)
This PR fixes general spelling warnings detected by the spelling check.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 97e73b2234)
This PR fixes the network metrics section at the README by leaving
the current tests that we have in our kata metrics.
Fixes#8017
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 36c8cd6f1f)
We've faced this as part of the CI, only happening with the CRI-O tests:
```
not ok 1 Test readonly volume for pods
# (from function `exec_host' in file tests_common.sh, line 51,
# in test file k8s-file-volume.bats, line 25)
# `exec_host "echo "$file_body" > $tmp_file"' failed with status 127
# [bats-exec-test:38] INFO: k8s configured to use runtimeclass
# bash: line 1: $'\r': command not found
#
# Error from server (NotFound): pods "test-file-volume" not found
```
I must say I didn't dig into figuring out why this is happening, but we
may be safe enough to just trail the '\r', as long as all the tests keep
passing on containerd.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ef63d67c41)
We need the default capabilities to be enabled, especially `SYS_CHROOT`,
in order to have tests accessing the host to pass.
A huge thanks to Greg Kurz for spotting this and suggesting the fix.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 74c12b2927)
Some of the "k8s distros" allow using CRI-O in a non-official way, and
if that's done we cannot simply assume they're on containerd, otherwise
kata-deploy will simply not work.
In order to avoid such issue, let's check for `cri-o` as the container
engine as the first place and only proceed with the checks for the "k8s
distros" after we rule out that CRI-O is not being used.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 358dc2f569)
Add k0s support to kata-deploy, in the very same way kata-containers
already supports k3s, and rke2.
k0s support requires v1.27.1, which is noted as part of the kata-deploy
documentation, as it's the way to use dynamic configuration on
containerd CRI runtimes.
This support will only be part of the `main` branch, as it's not a bug
fix that can be backported to the `stable-3.2` branch, and this is also
noted as part of the documentation.
Fixes: #7548
Signed-off-by: Steve Fan <29133953+stevefan1999-personal@users.noreply.github.com>
(cherry picked from commit 72cbcf040b)
The permissions on .docker/buildx/activity/default are regularly broken by us
passing docker.sock + $HOME/.docker to a container running as root and then
using buildx inside. Fixup ownership before executing docker commands.
Fixes: #8027
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 15425a2b80)
To avoid the failure of not finding pandoc command this PR adds that
package as a dependency for static checks.
Fixes#8041
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 13ca7d9f97)
Seems like the static checks are failing due the missing of the hunspell
package this PR fixes that.
Fixes#8019
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 87a8616488)
This will allow us to easily test failures and fixes on that workflows.
Fixes: #8031
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 0c95697cc4)
The test has been added to the repo, but we have to also add it to the
list of jobs to be executed.
Fixes: #8005
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 8c3c50ca8a)
Otherwise we'll face the following error:
```
Failed to enable unit: Interactive authentication required.
```
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 07a6e63a6b)
Let's make sure we'll also be testing k8s using CRI-O.
For now, we'll only be running the CRI-O test with QEMU. Once it
becomes stable we can expand this to other Hypervisors as well.
Fixes: #8005
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 03b82e8484)
We'll need this in order to setup k0s with a different container engine.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 54c0a471b1)
It should be garm-ubuntu-2004-smaller instead of garm-ubuntu-2004-small.
Fixes: #7890
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 3a2c83d69b)
Let's ensure we test kata-deploy on RKE2 and k0s as well.
Fixes: #7890
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit f7fa7f602a)
This will be very useful in the near future, when we start testing
kata-deploy with rke2 as well.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 2c908b598c)
This will be very useful in the near future, when we start testing
kata-deploy with k0s as well.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit eaf6164916)
We'll be using exactly the same code used for the k8s tests, which are
already deploying k3s on GARM.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 0015257636)
We just need to make sure the correct overlay is applied, following what
we already have been doing for k3s.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit bf2cb02283)
We'll be testing kata-deploy with different kubernetes flavours as part
of our GARM tests, and this is a place-holder for this.
Once enabled, we'll do nothing, just `return 0`, so we can then properly
add the tests after this commit gets merged.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit b12b9e1886)
So we have a better control on which flavour of kubernetes kata-deploy
is expected to be targetting.
This was also done as part of fa62a4c01b,
for the k8s tests.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 9e1fb8a966)
This will allow us to re-use the function in the kata-deploy tests,
which will come soon.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 09cc0ed438)
Ideally we'd add the instance_type or the full K8S_TEST_HOST_TYPE but
that exceeds the maximum amount of characteres allowed for the cluster
name. With this in mind, let's use the first letter of
K8S_TEST_HOST_TYPE instead.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
(cherry picked from commit d9ef1352af)
This makes it so that each AKS cluster is created in its own individual
resource group, rather than using the "kataCI" resource group for all
test clusters.
This is to accommodate a tool that we recently introduced in our Azure
subscription which automatically deletes resource groups after a set
amount of time, in order to keep spending under control.
The tool will automatically delete any resource group, unless it has a
tag SkipAutoDeleteTill = YYYY-MM-DD. When this tag is present, the
resource group will be retained until the specified date.
Note that I tagged all current resource groups in our subscription with
SkipAutoDeleteTill = 2043-01-01 so that we don't lose any existing
resources.
Fixes: #7982
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
(cherry picked from commit 68267a3996)
We're hitting a specific issue after updating, which will require some
work on dragonball before it can be re-added here.
The issue:
```
...
3: failed to do rafs mount\\n
4: fail to attach rafs \\\"/var/lib/containerd-nydus/snapshots/2/fs/image/image.boot\\\"\\n
5: add share fs mount\\n
6: Mount rafs at
/rafs/197ef3db03c86b91bf3045ff59183ce8b5750941ad1d3484f4a8301a70f5109f/rootfs_lower
error: Failed to Mount backend
...
Caused by:
vmm action error: FsDevice(AttachBackendFailed(\\\"attach/detach a
backend filesystem failed:: missing field `version` at line 1 column
489\\\"))\"): unknown"
```
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit aba36ab188)
This will ensure we're testing with the correct runtime, instead of
using the `default` one.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit b8a8dfcd15)
Fix the arch error when downloading the nydus tarball.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Steven Horsman <steven@uk.ibm.com>
(cherry picked from commit f6df3d6efb)
To support the v0.12.0 nydus-snapshotter, we need to update the config
files and the commandline to start nydus-snapshotter.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
(cherry picked from commit 2f9c9e2e63)
And with this we finally enable the nydus tests to run as part of our
GHA CI.
Fixes: #6543
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit b73bde320d)
Let's have all the dependencies needed for running the nydus tests
installed.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit b3904a1a30)
We've been simply doing nothing whenever `install-kata` was called, and
that was the intent when we added the placeholder calls.
Now, let's install kata, as expected. :-)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit d2b3b67f5d)
As we've added install_nydus() and install_nydus_snapshotter(), which do
conform with the pattern we're following on GHA, let's rely on them
rather than relying on the bits coming from nydus_test.sh.
Later on we'll have install_nydus() and install_nydus_snapshotter() as
part of the dependencies install in our `gha-run.sh`.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 0ec00ad42e)
Similarly to what's been done for the cri-containerd tests, as part of
84dd02e0f9, we need to add the timeout
here for the crictl calls.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 568439c77b)
Otherwise we may face errors like:
```
getting sandbox status of pod "d3af2db414ce8": metadata.Name,
metadata.Namespace or metadata.Uid is not in metadata
"&PodSandboxMetadata{Name:nydus-sandbox,Uid:,Namespace:default,Attempt:1,}"
getting sandbox status of pod "-A": rpc error: code = NotFound desc = an
error occurred when try to find sandbox: not found
```
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 5ac3b76eb1)
Otherwise we canoot properly start the nydus snapshotter, nor properly
kill it after it's been started.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 376574a16c)
The "source ..." we've been doing was not changed since those tests were
part of the Jenkins tests, and we need to adapt them, either setting the
correct path or entirely removing the ones that are not relevant to us
anymore.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 4290fd4b67)
As that's what we've been using as part of the GHA.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit a84efa3e87)
This function will be used to download and install the
nydus-snapshotter, and it follows the same pattern we already have
introduced for downloading and installing another dependencies from
GitHub.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 56a14b3950)
This function will be used to download and install nydus, and it follows
the same pattern we already have introduced for downloading and
installing another dependencies from GitHub.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit b6563783e2)
Now that the static-checks job only takes care of running the
static-checks, let's clean it up, remove all the unneeded steps, make
sure that we're using the actions in their latest version, and have it
running in a cost free runner.
At some point I'd like to see those tests done in parallel, in the same
way that I've organised the build-checks, but that's something for
someone else, at some other time.
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 8b1e9b0c75)
With this we're removing the dragonball static-checks CI, as the test is
running here now. :-)
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 2c5ca2eaf8)
We're moving it out of the previous "static-checks" confusing matrix,
and adding it to the matrix that was currently being used for the `make
vendor` and `make check` checks.
This will allow us to have one job per component, and with that we can
easily run those in parallel and on the zero cost runners.
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 509c309ab2)
We're moving it out of the previous "static-checks" confusing matrix,
and adding it to the matrix that was currently being used for the `make
vendor` and `make check` checks.
This will allow us to have one job per component, and with that we can
easily run those in parallel and on the zero cost runners.
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 4e963cedf4)
Otherwise `make test` will simply fail with:
```
error[E0583]: file not found for module `config`
```
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 08f2e5ae0b)
This makes it pssible to run the tests in the cost free runners, which
are not KVM capable.
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 2bc3a616ae)
Otherwise `make test` will simply fail with:
```
error[E0583]: file not found for module `version`
```
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 46daddc500)
Otherwise `make test` will fail with:
```
error[E0583]: file not found for module `version`
```
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ec826f328f)
It makes things way simpler, waaaaay simpler.
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 1d32410a83)
We're moving it out of the previous "static-checks" confusing matrix,
and adding it to the matrix that was currently being used for the `make
vendor` checks.
This will allow us to have one job per component, and with that we can
easily run those in parallel and on the zero cost runners.
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit bf888b9a5e)
clippy is used as part our tests, so it's useful to have it installed
while we're already installing rust.
In case of developers, they also better be using it. :-)
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit e125775863)
Similarly to the static-check jobs, those jobs can be run on the zero
cost runners.
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit e2c61a152c)
We'll use it as part of the refactoring we're doing in the static check
tests.
I can see a lot of other uses of this, but changing all of them to this
one is out of the scope for this PR.
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 6794d4c843)
We can rely on the functions that are now part of the common.bash.
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit e64508c308)
We can use this a lot as part of our CI, but right now I'm just moving
those here with the intent to use later on in this series.
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 11dff731b7)
It doesn't make sense to run this for all the bits of the matrix,
neither it's demanding enough to require running this in one of our
Azure sponsored runners.
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 75c974c802)
Let me start with a fair warning that this commit is hard to split into
different parts that could be easily tested (or not tested, just
ignored) without breaking pieces.
Now, about the commit itself, as we're on the run to reduce costs
related to our sponsorship on Azure, we can split the k8s tests we run
in 2 simple groups:
* Tests that can be run in the smaller Azure instance (D2s_v5)
* Tests that required the normal Azure instance (D4s_v5)
With this in mind, we're now passing to the tests which type of host
we're using, which allows us to select to run either one of the two
types of tests, or even both in case of running the tests on a baremetal
system.
Fixes: #7972
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c69a1e33bd)
We've removed this in the part 2 of this effort, as we were not caching
the sha256sum of the component. Now that this part has been merged,
let's get back to checking it.
Fixes: #7834 -- part 3
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 86c41074b4)
That's not needed anymore, as we've switched to using ORAS and an OCI
registry to cache the artefacts.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 460988c5f7)
This is something that was done by our Jenkins jobs, but that I ended up
missing when writing d0c257b3a7.
Now, let's also add the sha256sum to the cached artefact, and in a
coming up PR (after this one is merged) we will also start checking for
that.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 4533a7a416)
In the previous series related to the artefacts we build, we've
switching from storing the artefacts on Jenkins, to storing those in the
ghcr.io/kata-containers/cached-artefacts/${artefact_name}.
Now, let's take advantage of that and actually use the artefacts coming
from that "package" (as GitHub calls it).
NOTE: One thing that I've noticed that we're missing, is storing and
checking the sha256sum of the artefact. The storing part will be done
in a different commit, and the checking the sha256sum will be done in a
different PR, as we need to ensure those were pushed to the registry
before actually taking the bullet to check for them.
Fixes: #7834 -- part 2
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit eccc76df63)
The list of tests which require a bigger VM instance is:
* k8s-number-cpus.bats -- failing on all CIs
* k8s-parallel.bats -- only failing on the cbl-mariner CI
* k8s-scale-nginx.bats -- only failing on the cbl-mariner CI
We'll keep those disabled while we re-work the logic to **only run
those** in a bigger (and more expensive) VM instance.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 094b6b2cf8)
Let's push the artefacts to ghcr.io and stop relying on jenkins for
that.
Fixes: #7834 -- part 1
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit d0c257b3a7)
Right now this is not used, but it'll be used when we start caching the
artefacts using ORAS.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 108f1b60dd)
ORAS is the tool which will help us to deal with our artefacts being
pushed to and pulled from a container registry.
As both the push to and the pull from will be done inside the
kata-deploy binaries builder container, we need it installed there.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit be2eb7b378)
We don't need to run on a D4s_v5. as those tests are not CPU / memory
intense. With this is mind, let's use a smaller version of the
instance, the D2s_v5 one.
Fixes: #7958
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit fb24fb0dc1)
We don't need to run on a D4s_v5. as those tests are not CPU / memory
intense. With this is mind, let's use a smaller version of the
instance, the D2s_v5 one.
Fixes: #7958
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 1daf02f5d4)
We don't need to run on a D4s_v5. as those tests are not CPU / memory
intense. With this is mind, let's use a smaller version of the
instance, the D2s_v5 one.
Fixes: #7958
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit e60d81f554)
We don't need to run on a D4s_v5. as those tests are not CPU / memory
intense. With this is mind, let's use a smaller version of the
instance, the D2s_v5 one.
Fixes: #7958
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 4db416997c)
We don't need to run on a D4s_v5. as those tests are not CPU / memory
intense. With this is mind, let's use a smaller version of the
instance, the D2s_v5 one.
Fixes: #7958
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 32841827b8)
Without setting the cpu limit / request to 1, we can make this test run
in a smaller VM instance without any issue.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 92fff129fd)
Instead of having some of them only being considered if explicitly
passed to the script.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit adc18ecdb1)
As the environment variables are now being passed down from the GitHub
Actions, let's make sure they're exposed to the container used to build
the kata-deploy binaries, and during the build process we'll be able to
use those to log in and push the artefacts to the OCI registry, using
ORAS.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c7a851efd7)
We do the build of our artefacts inside a container image, and we need
to expose some env vars to the container so ORAS can be used there to
push the artefacts we want to cache to ghcr.io.
The env vars we're exposing are:
* ARTEFACT_REGISTRY: The registry where we're going to save the
artefacts.
* ARTEFACT_REGISTRY_USERNAME: The username to log in to the registry, as
ORAS does not use the same json file used by docker.
* ARTEFACT_REGISTRY_PASSWORD: The pasword to log in to the the registry,
as the ORAS does not use the same json file used by docker.
* TARGET_BRANCH: The target branch, which will be part of the tag of the
artefact, as we may end up caching the artefacts for both main and
stable branches.
Fixes: #7834 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 6bd15a85d5)
This PR adds the iperf cpu utilization limit for qemu for kata metrics.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit cd4fd1292a)
We need a very recent L2 guest kernel to fix all the bugs that occur in nested
virtualization.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 9d93036783)
cloud hypervisor does not emulate pcie switches or pci bridges, so we need to
accept a lonely device.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit faee59b520)
It is fine to start a VM with the disk image without syncing it as we now run
the test in an ephemeral Azure instance.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit df3dc1105c)
Sometimes the test gets stuck running commands in the container - need to
investigate why later.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 7211c3dccc)
Cloud Hypervisor exposes a VIRTIO_IOMMU device to the VM when IOMMU support is
enabled. We need to add it to the whitelist because dragonball uses kernel
v5.10 which restricted VIRTIO_IOMMU to ARM64 only.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 1b02f89e4f)
Conflicts:
tools/packaging/kernel/kata_config_version
by enabling IOMMU on the default PCI segment. For hotplug to work we need a
virtualized iommu and clh exposes one if there is some device or PCI segment
that requests it. I would have preferred to add a separate PCI segment for
hotplugging vfio devices but unfortunately kata assumes there is only one
segment all over the place. See create_pci_root_bus_path(),
split_vfio_pci_option() and grep for '0000'.
Enabling the IOMMU on the default PCI segment requires passing enabling IOMMU on
every device that is attached to it, which is why it is sprinkled all over the
place.
CLH does not support IOMMU for VirtioFs, so I've added a non IOMMU segment for
that device.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 3a1db7a86b)
This is a to catch the case of the guest getting stuck.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 9f1a42c6cc)
This shouldn't be hiding behind only a qemu check, we need this for clh as
well.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit b46b0ecf8b)
There is no way for this branch to be hit, as port is only set when it is
different than config.NoPort.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit bfc93927fb)
These test cases shows which options are valid for CLH/Qemu, and test that we
correctly catch unsupported combinations.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 7c4e73b609)
The only supported options are hot_plug_vfio=root-port or no-port.
cold_plug_vfio not supported yet.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit fc51e4b9eb)
hot_plug_vfio needs to be set to root-port, otherwise attaching vfio devices to
CLH VMs fails. Either cold_plug_vfio or hot_plug_vfio is required, and we have
not implemented support for cold_plug_vfio in CLH yet.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 509771e6f5)
tdp_mmu had some issues up until around Linux v6.3 that make it work
particularly bad when running nested on Hyper-V. Reload the module at the start
of the test and disable the tdp_mmu param.
Gather debug info at the end of the test to make it easier to figure out what
went wrong. This uses github actions group syntax so that each section can be
collapsed.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 5f6475a28a)
For debugging (though this doesn't get exposed yet).
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 8fffdc81c5)
- reduce memory and cpu usage to fit in a D4s_v5
- source correct lib
- mount workspace from 9p
- disable cpu mitigations for speed
- drop unused commands and variables
- install containerd
- install kata from built artifacts
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit df815087e7)
To match the flow of other github actions workflows.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit a92ddeea15)
This imports the vfio test scripts github.com/kata-containers/tests. The test
case doesn't work yet but doing the changes in a separate commit will make it
easier to track the changes. The only change in this commit is renaming
vfio_jenkins_job_build.sh -> vfio_fedora_vm_wrapper.sh
Fixes: #6555
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 5a551a85b1)
This will ensure that we're calling the correct binary for the
hypervisor.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 813bfdec01)
Otherwise we'll fail to configure kata-containers in the `install-kata`
step.
This is mostly needed because the nerdctl-full tarball doesn't provide a
contaienrd configuration, just the binary, as contaienrd does not
actually require a configuration file to run with the default config.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 46bc0b1c01)
TIL that the Azure VMs we use are created without an explicit outbund
connectivity defined.
This leads us to issues using `ping ...` as part of our tests, and when
consulting Jeremi Piotrowski about the issue he pointed me out to two
interesting links:
* https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access
* https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity
For your own sanity, do not read the comments, after all this is
internet. :-)
Anyways, the suggestion is to use nping instead, which is provided by
the nmap package, so we can explicitly switch to using the tcp port 80
for the ping. With this in mind, I'm switching the image we use for the
test and using one that provided nping as a possible entry point, and
from now on (this part of) the tests should work.
Fixes: #7910
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 13968aa7f6)
TIL that the Azure VMs we use are created without an explicit outbund
connectivity defined.
This leads us to issues using `ping ...` as part of our tests, and when
consulting Jeremi Piotrowski about the issue he pointed me out to two
interesting links:
* https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access
* https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity
For your own sanity, do not read the comments, after all this is
internet. :-)
Anyways, the suggestion is to use nping instead, which is provided by
the nmap package, so we can explicitly switch to using the tcp port 80
for the ping. With this in mind, I'm switching the image we use for the
test and using one that provided nping as a possible entry point, and
from now on (this part of) the tests should work.
Fixes: #7910
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit e0c811678b)
This PR ensures that docker is running as part of the init_env function
in kata metrics to avoid failures like docker is not running and making
the kata metrics CI to fail.
Fixes#7898
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit d53eb73eec)
FIO test is showing ongoing issues when running in k8s.
Working on running FIO on the ctr client which has been
shown to be stable.
Fixes: #7920
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit a58ea66592)
This will help us to make sure that the failure is actually related to
Kata Containers.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit f536ef5ce1)
There's no reason to wait till the payload is created to run the tests,
as we rely on the tarball, not on the kata-deploy payload.
That was a mistake on my side, and that's already fixed for the nerdctl
tests.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c83f167c59)
Let's add a very basic sanity test to check that we can spawn a
containers using nerdctl + Kata Containers.
This will ensure that, at least, we don't regress to the point where
this feature doesn't work at all.
In the future, we should also test all the VMMs with devmapper, but
that's for a follow-up PR after this test is working as expected.
Fixes: #7911
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 12d833d07d)
Let's add a very basic sanity test to check that we can spawn a
containers using docker + Kata Containers.
This will ensure that, at least, we don't regress to the point where
this feature doesn't work at all.
For now we're running this test against Cloud Hypervisor and QEMU only,
due to an already reported issue with dragonball:
https://github.com/kata-containers/kata-containers/issues/7912
In the future, we should also test all the VMMs with devmapper, but
that's for a follow-up PR after this test is working as expected.
Fixes: #7910
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 348b8644d6)
As, regardless of what's mentioned in the documentation, it seems that
$GITHUB_REF_NAME is passed down as a literal string.
Fixes: #7414
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit f811b064ca)
The ones for the payload-after-push.yamland ci-nightly.yaml are not that
much important right now, but they're needed for when we start running
those on stable branches as well.
The other ones were missed during
bd24afcf73.
Fixes: #7414
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 6d795c089e)
Now that the metrics migration from the tests to kata containers has been completed, this PR removes the warning from the main metrics documentation.
Fixes#7894
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 060499dcae)
There's no need to keep curl there after the kubectl binary has already
been downloaded.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 8b4a0b368f)
Similarly to what's been done for x86_64 -> amd64, we need to do a
aarch64 -> arm64 change in order to be able to download the kubectl
binary.
Fixes: #7861
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 139c7f03ab)
We're changing what's been done as part of ac939c458c, as we've
notcied issues using `github.event.pull_request.merge_commit_sha`.
Basically, whenever a force-push would happen, the reference of
merge_commit_sha wouldn't be updated, leading us to test PRs with the
old code. :-/
In order to get the rebase properly working, we need to ensure we pull
the hash of the commit as part of checkout action, and ensure
fetch-depth is set to 0.
Fixes: #7414
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit bd24afcf73)
This will make our image smaller, and still ensure it's multi-arch
support.
Fixes: #7861
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 670a8e9c73)
There's absolutely no need to have the skip check as part of the test
itself when it's already done as part of the setup function.
We're only touching the files here that were touched in the previous
commit.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit f6cd3930c5)
Let's keep both checks for now, but in the future we'll be able to
remove the check for "firecracker", as the hypervisor name used as part
of the GitHub Actions has to match what's used as part of the
kata-deploy stuff, which is `fc` (as in `kata-fc for the runtime class)
instead of `firecracker`.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 3cc20b47a6)
The tests are failing to finish as the argument is invalid.
Fixes: #6542
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit b5bad3cb0f)
That's what we've been using as part of Jenkins, so let's ensure things
will work as they did before, and only after that consider upgrading the
base OS used for the tests.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit aaec5a09f3)
We've been using the `kata-deploy-tdx` target as that also uses k3s as
base, but it's better to just have a specific garm target.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 27fa7d828d)
So we have a better control on which flavour of kubernetes kata-deploy
is expected to be targetting.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit fa62a4c01b)
GARM runners do not come with the whole set of tools we need, or are
used to when it comes to the GHA runners, so we need to manually install
bats on those.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 8c9380a798)
Let's put a 1 minute sleep, just to make sure everything is back up
again.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 3de23034f8)
This PR changes the order in which the FIO test first
cleans the environment and then checks if the environment
is indeed clean.
Fixes: #7869
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit adfea55b8f)
As we were using `tee` without the `-a` (or `--apend`) aptton, the
containerd config would be overwritten, leading to a NotReady state of
the Node.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 2df183fd99)
It should be plenty, and worked well in local tests.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 369a8af8f7)
Let's download the vanilla kubectl binary into `/usr/bin/`, as we need
to avoid hitting issues like:
```sh
error: open /etc/rancher/k3s/k3s.yaml.lock: permission denied
```
The issue basically happens because k3s links `/usr/local/bin/kubectl`
to `/usr/local/bin/k3s`, and that does extra stuff that vanilla
`kubectl` doesn't do.
Also, in order to properly use the k3s.yaml config with the vanilla
kubectl, we're copying it to ~/.kube/config.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ada65b988a)
Otherwise the /etc/rancher/k3s/k3s.yaml is not readable by other users
than root.
As --write-config-mode is being passed, and that's an option that has to
be passed to the `server`, -s is also added to the command line.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ad45ab5d33)
`wait` waits for a job to complete, not a number of seconds. Not sure
how I got that wrong in the first place, but it's what it's.
Fixes: #6542
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 028a97e0d5)
This PR replaces the ubuntu image for one which has TensorFlow optimized
for kata metrics.
Fixes#7866
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 3a427795ea)
Let's enable the devmapper kubernetes tests to match exactly what's been
tested as part of the Jenkins CI.
Fixes: #6542
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 0e8bd50cbb)
This function right now is completely based on what's part of the tests
repo[0], and that's the reason I'm keeping the `Signed-off-by` of all
the contributors to that file.
This is not perfect, though, as it changes the default snapshotter to
devmapper, instead of only doing so for the Kata Containers specific
runtime handlers. OTOH, this is exactly what we've always been doing as
part of the tests.
We'll improve it, soon enough, when we get to also add a way for
kata-deploy to set up different snapshotters for different handlers.
But, for now, this is as good (or as bad) as it's always been.
It's important to note that the devmapper setup doesn't take into
consideration a BM machine, and this is not suitable for that. We're
really only targetting GHA runners which will be thrown away after the
run is over.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit b28b54df04)
One can use different kubernetes flavours for getting a kubernetes
cluster up and running.
As part of our CI, though, I really would like to avoid contributors
spending time maintaining and updating kubernetes dependencies, as done
with the tests repo, and which has been proven to be really good on
getting things rotten.
With this in mind, I'm taking the bullet and using "k3s" as the way to
deploy kubernetes for the devmapper related tests, and that's the reason
I'm adding a function to do so, and this will be used later on as part
of this series.
It's important to note that the k3s setup doesn't take into
consideration a BM machine, and this is not suitable for that. We're
really only targetting GHA runners which will be thrown away after the
run is over.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 54f7117212)
This PR is to skip installing docker-compose-plugin while buiding a `build-kata-deploy` image for s390x|ppc64le.
It is a temporary solution to fix current CI failures for s390x regarding `hash sum mismatch`.
Fixes: #7848
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
(cherry picked from commit 2efda20c77)
This PR adds the write 95 percentile for FIO for qemu for
checkmetrics for kata metrics.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 438fbf9669)
This PR adds the write 95 percentile FIO value for checkmetrics
for kata metrics.
Fixes#7842
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 024b4d2ffe)
Previously, static-checks-dragonball is using machines from Alibaba
Cloud to run all the CI jobs.
Currently, we are going through an internal process to apply for the new
machines for Dragonball CI. Before the internal process is over, we will
temporarily use Azure VM to run static-checks-dragonball jobs.
fixes: #7838
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
(cherry picked from commit 60f733d301)
Pass --owner and --group to the tar invokation to prevent gihtub runner user
from leaking into release artifacts.
Fixes: #7832
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 18c94ebbe3)
This PR fixes an issues in the parsing results stage,
by collecting just the n-results from the n-running
containers, discarding irrelevant data.
Fixes: #7774
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit 538c965c2b)
We have two scenarios we care about this, `pull_request` and
`pull_request_target` events triggered a job.
`pull_request` event:
When using the checkout action, it'll already provide a "rebased atop of
main" repo for us, nothing else is needed, and that's basically what we
already have as part of the jobs in our CI.
`pull_request_target` event:
This one is a little bit tricky, as the checkout action, unless passing
a spsecific repo, give us the PR checked out rebased atop of the HEAD of
the PR branch. Jeremi Piotrowski nicely pointed out that we could use
github.event.pull_request.merge_commit_sha instead, which is the result
of the PR's branch with the official repo target branch.
Now, the only cases where the contributor's rebase would still be needed
is when the action itself has been changed.
Fixes: #7414
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ac939c458c)
This PR adds the grabdata script so it can be used for the metrics report
for kata metrics.
Fixes#7812
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 6668825752)
At this point we should always be using the latest checkout action.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit d7a996c686)
This PR adds the limit for 90 percentile for qemu value for
FIO kata metrics.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit a7b59a5bf9)
This PR adds the limit for write 90 percentile value for clh for
FIO metrics.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 99db6568e9)
This PR fixes the memory inside limit for clh for kata metrics due
to the recent changes that we had in the script which impacted
in the performance measurement.
Fixes#7786
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 8877ec62fb)
Let's expand the confidential test to also support TDX.
The main difference on the test, though, is that we're not grepping for
a string in the `dmesg` output, but rather relying on `cpuid` to detect
a TDX guest.
Fixes: #7184
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit e286e842c1)
Let's expand the confidential test to also support SNP.
Fixes: #7184
Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
(cherry picked from commit e31f099be1)
Add a test case for the launch of unencrypted confidential
container, verifying that we are running inside a TEE.
Right now the test only works with SEV, but it'll be expanded in the
coming commits, as part of this very same series.
Fixes: #7184
Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c3b9d4945e)
The file can be removed between builds without causing any issue, and
leaving it around has been causing us some headache due to:
```
ERROR: open /home/runner/.docker/buildx/activity/default: permission denied
```
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 3818bf3311)
Otherwise we'll have to re-run all the tests due to a flaky behaviour in
one of the parts.
Fixes: #7757
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit fb49d5d7ce)
k8s-pid-ns.bats was already using the test name from
k8s-kill-all-process-in-container.bats - probably a copy/paste bug.
Fixes: #7753
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
(cherry picked from commit 183f51d6f6)
At the end of k8s-kill-all-process-in-container.bats, delete the
deployment it created.
Fixes: #7752
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
(cherry picked from commit 6a974679f2)
The directory is a host path mount and cannot be removed from within the
container. What we actually want to remove is whatever is inside that
directory.
This may raise errors like:
```
rm: cannot remove '/opt/kata/': Device or resource busy
```
Fixes: #7746
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit d8f3ce6497)
The vfio test requires nested-nested virtualization:
L0 Azure host
-> L1 Ubuntu VM
-> L2 Fedora VM
-> L3 Kata
This hits a kernel bug on v5.15 but works quite nicely on the v6.2 kernel
included in Ubuntu 23.04. We can switch back to Ubuntu 22.04 when they roll out
v6.2.
Fixes: #6555
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 936e8091a7)
docker install now creates a group with gid 999 which happens to match what we
need to get docker-in-docker to work. Remove the group first as we don't need
it.
Fixes: #7726
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 3b881fbc0e)
This PR adds the TensorFlow ResNet50 fp32 Dockerfile for kata metrics.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 959ca49447)
We can simply use `rm -f` all over the place and avoid the container
returning any error.
Fixes: #7733
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 5cba38c175)
This PR configures the corresponding kata runtime in K8s
based on the tested hypervisor.
This PR also enables FIO metrics test in the kata metrics-ci.
Fixes: #7665
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit fb571f8be9)
This PR fixes the check results for tensorflow benchmark now
that we change the name of the test.
Fixes#7684
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit e8a5119343)
Instead of doing this as part of the test itself, let's ensure it's done
before running the tests and during the tests cleanup.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 2d896ad12f)
This test, at least for now, only checks whether the runtimeclasses
have been properly created.
This is just a migration from a test we had as part of the k8s suite.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 4ffc2c86f3)
By doing this we can make sure there won't be any clash on the cluster
name created for either the k8s or the kata-deploy tests.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 285e616b5e)
This will have the same function as run-k8s-tests.sh has, but for
kata-deploy.
Right now it doesn't have any tests, and the command to actually run the
tests is commented out, but right now this is just a placeholder that
will be populated sooner than later.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ce6adecd0a)
In a follow-up series, we'll add a whole suite for the kata-deploy
tests. With this in mind, let's already get rid of this one and avoid
more kata-deploy tests to land here.
Fixes: #7642
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit cfc29c11a3)
The default `kata` runtime class would get created with the `kata`
handler instead of `kata-$KATA_HYPERVISOR`. This made Kata use the wrong
hypervisor and broke CI.
Fixes: #7663
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
(cherry picked from commit 339569b69c)
Let's add the tests as part of the ci.yaml, so they an be triggered as
part of each PR.
For this PR those tests won't be triggered, courtesy to the
`pull_request_target` event we rely on.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit d19a75e80c)
This PR fixes the TensorFlow word across the document to have uniformity
across all the document.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit bade6a5c3b)
As the cri-containerd tests have been fully migrated to GHA, let's make
sure we get them running.
Fixes: #6543
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit b3592ab25c)
As part of the runners, we're hitting a timeout that I cannot reproduce,
at all, when allocating the same instance and running the tests
manually.
The default timeout to connect to the server is 2s when using `crictl`.
Let's increase this to 20s.
It's fairly important to mention that in the first tests I used a
timeout of 10s, and that helped but we still hit issues every now and
then.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 84dd02e0f9)
It'll help us to debug failures with the pod stop / pod delete.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit b29782984a)
We need this to fully understand what are the issues we're facing.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ae0930824a)
This improves readability in case of failures by a lot.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 6c8b2ffa60)
This PR renames the tensorflow scripts to include the data format
that is being used as we will have multiple tests with different
data and model formats for tensorflow so this will help us to
distinguish them.
Fixes#7645
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 18a7fd8e4e)
This will not be tested as part of the PR, thanks to the
`pull_request_target` event, but we want it to be added so we can build
atop of that in a coming up series.
Fixes: #7642
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit e55fa93db9)
This will not be tested as part of the PR, thanks to the
`pull_request_target` event, but we want it to be added so we can build
atop of that in a coming up series.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit d9ee17aaec)
Right now this file does nothing, as it's not even called by any GHA.
However, it'll be populated later on as part of a different series,
where we'll have kata-deploy specific tests running here.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 831e73ff91)
Let's split a good portion of `tests/integration/kuberentes/gha-run.sh`
out, and put them in a place where they can be used to the soon-to-come
kata-deploy specific tests.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit af1b46bbf2)
This PR fixed the loop that stops the kata-shim and the
hypervisors used in metrics checks.
Fixes: #7628
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit 767434d50a)
The GHA runners are not exactly powerful, which makes the static-checks
take way too long (almost an hour).
Let's give a try and move those to the same size of Azure instances used
as part of our CI, and probably have this time reduced.
Fixes: #7446
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c52d090522)
This PR adds check containers are running in tensorflow mobilenet
that is being defined in common script.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit fdcd52ff78)
This PR adds the check containers are up function from common
in tensorflow script.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 36337ee146)
This PR adds the check containers are running function the common metrics
script.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 833cf7a684)
This PR adds the check containers are up in the common script
in the tensorflow mobilenet script.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 918c783084)
This PR uses the check containers are up from the common script
in the tensorflow script.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 9d57a1fab4)
This PR adds check containers are up in common script for kata metrics.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 1c84680d8c)
This PR uses the collect results function defined in common for
the tensorflow mobilenet test.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit d3e57cf454)
This PR removes the collect results function from tensorflow script
as it is going to be referenced in the common metrics script.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 286de046af)
This PR computes average results for TF bench.
Additionally, it improves the data parsing from
all running containers.
Fixes: #7603
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit 473b0d3a31)
The build context should be the folder where the Dockerfile is present,
otherwise the files copied into the image won't be found.
Fixes: #7595
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 03d1fa67b1)
Let's make sure that we don't fail in case we're building non x86_64.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit eb463b38ec)
Following the example on https://github.com/docker/build-push-action,
it's clear that the actions to "Set up QEMU" and "Set up Docker Buildx"
are missing.
Let's add them, and also take the advantage to bump the
build-push-action to its v4, which, by the way, had a typo on its name
(build-and-push-action does **NOT** exist, build-push-action does).
Fixes: #7595
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit a2d731ad26)
Right now this is not being used, but it'll as the image generated for
the confidential tests have that as part of their tag.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 43fe5d1b90)
One of the joys to rely on the `pull_request_target` is to only be able
to catch those after those are merged.
Fixes: #7595
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 54f6a78500)
With these 2 simple checks we can ensure that we do not regress on the
behaviour of allowing the runtime classes / default runtime class to be
created by the kata-deploy payload.
Fixes: #7491
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 034d7aab87)
This will be done before running TEE tests, and it's a hard dependency
fr them.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit fac8ccf5cd)
Let's add here the image we'll be using for unencrypted confidential
tests. Later on, we'll make sure to build and use this image as part of
our CI.
The image can easily be built as a multi-arch image, and has `cpuid`
installed in case of `x86_64` build, so it can be used to detect whether
we're running on a TEE guest without having to rely on `dmesg | grep
...`.
Fixes: #7595
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ab5f603ffa)
Instead of using package manager to install bats, building
this from source. This gives us the updated version of bats
which supports functions such as setup_file and
teardown_file.
We can use these functions into our current tests.
Fixes: #7597
Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
(cherry picked from commit aeaec9dae9)
This PR changes the metrics workflow in order to just install
kata once, and run the checks for multiple hypervisor variations.
In this way we save time avoiding installing kata for each
hypervisor to be tested.
Fixes: #7578
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit e664969862)
This PR renames the mobilenet tensorflow test to have a more specific
tensorflow name mainly because tensorflow has different configurations
and we will add more tensorflow tests so we want to distinguish each
tensorflow test.
Fixes#7571
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 863283716d)
This PR adds the iperf network metrics to the github actions
for kata metrics.
Fixes#7535
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 5b5caf8908)
This commit provides a new way to name the containers used
in the launch-times-test in this form:
'kata_launch_times_RANDOM_NUMBER', where RANDOM_NUMBER is
in the 0-1000 range.
Fixes: #7529
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit 1e15369e59)
when interacting with systemd. We have occasionally faced issues with
compatibility between the systemctl version used inside the kata-deploy
container and the systemd version on the host. Instead of using a containerized
systemctl with bind mounted sockets, nsenter the host and run systemctl from
there. This provides less coupling between the kata-deploy container and the
host.
Fixes: #7511
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Version 0.10.5, which was just released, breaks `nydus-storage`.
This is a workaround to fix the CI which is blocking other PRs.
Fixes: #7541
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-08-07 10:53:41 +02:00
282 changed files with 9902 additions and 3222 deletions
@@ -219,9 +218,6 @@ func (v *virtiofsd) valid() error {
ifv.cache==""{
v.cache=typeVirtioFSCacheModeAuto
}elseifv.cache==typeVirtioFSCacheModeNone{
v.Logger().Warn("virtio-fs cache mode `none` is deprecated since Kata Containers 2.5.0 and will be removed in the future release, please use `never` instead. For more details please refer to https://github.com/kata-containers/kata-containers/issues/4234.")
sudo tar xfz "${tarball_name}" -C /usr/local/bin --strip-components=1
rm -f "${tarball_name}"
}
function _get_os_for_crio(){
source /etc/os-release
if["${NAME}" !="Ubuntu"];then
echo"Only Ubuntu is supported for now"
exit2
fi
echo"x${NAME}_${VERSION_ID}"
}
# version: the CRI-O version to be installe
function install_crio(){
localversion=${1}
os=$(_get_os_for_crio)
echo"deb https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/${os}/ /"|sudo tee /etc/apt/sources.list.d/devel:kubic:libcontainers:stable.list
echo"deb http://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable:/cri-o:/${version}/${os}/ /"|sudo tee /etc/apt/sources.list.d/devel:kubic:libcontainers:stable:cri-o:${version}.list
# Set the latest image, the one generated as part of the PR, to be used as part of the tests
sed -i -e "s|quay.io/kata-containers/kata-deploy:3.2.0|${DOCKER_REGISTRY}/${DOCKER_REPO}:${DOCKER_TAG}|g" "${repo_root_dir}/tools/packaging/kata-deploy/kata-deploy/base/kata-deploy.yaml"
grep "${DOCKER_REGISTRY}/${DOCKER_REPO}:${DOCKER_TAG}" "${repo_root_dir}/tools/packaging/kata-deploy/kata-deploy/base/kata-deploy.yaml" || die "Failed to setup the tests image"
sed -i -e "s|quay.io/kata-containers/kata-deploy:3.2.0|${DOCKER_REGISTRY}/${DOCKER_REPO}:${DOCKER_TAG}|g" "${repo_root_dir}/tools/packaging/kata-deploy/kata-cleanup/base/kata-cleanup.yaml"
grep "${DOCKER_REGISTRY}/${DOCKER_REPO}:${DOCKER_TAG}" "${repo_root_dir}/tools/packaging/kata-deploy/kata-cleanup/base/kata-cleanup.yaml" || die "Failed to setup the tests image"
for KATA_DEPLOY_TEST_ENTRY in ${KATA_DEPLOY_TEST_UNION[@]}
do
bats "${KATA_DEPLOY_TEST_ENTRY}"
done
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.